Open Conference Systems, ITC 2016 Conference

PAPER: Value in Subscores: Case Study of 2014 Belize Primary School Examinations (PSE)
Betty Jean Usher-Tate

Building: Pinnacle
Room: 3F-Port of Hong Kong
Date: 2016-07-02 03:30 PM – 05:00 PM
Last modified: 2016-06-27

Abstract


Subscores are values calculated for specific segments or subtests of an assessment or scale. The term is often used interchangeably with terms such as diagnostic-, skill-, domain-, dimension-, and subject-scores. Even in the absence of a test manual, reasonable educational and psychological assessments have blueprints that outline subtests, scoring, and reporting. Subscores function logically as indicators for particular subject areas, reasons, or domains within a comprehensive assessment. For example, the Internet Based Test of English as a Foreign Language (TOEFL/iBT) has values assigned for component skills (listening, reading, writing, speaking).

Decisions for classification, diagnosis, or selection depend on information that differentiates similar candidates. Proponents of education often rely on distinctions derived from Math and English subscores. However, other researchers have shown that subscores do not always offer additional value beyond what can be inferred from the total score.

The PSE is Belize’s most widely used standardized assessment. The assessment has seven subtests across four subject areas and is used as an eighth-grade measure of academic achievement and as an exit criterion. This paper aims to provide evidence about the underlying structure of the PSE and to evaluate the psychometric properties and value of its subscores. The dataset included subtest scores and overall scores of 7,122 students. Test structure, value in subscores, and two models for subscore reporting were compared using descriptive statistics, distribution analyses, reliability estimates, inter-subscore correlations, dimensionality, and tests for differences.

Like most educational tests, the PSE-2014 is unidimensional. The overall reliability estimates for both the 4- and 6-subscore models were excellent (alpha = .927 and .916, respectively). Relationships among subscores and tests of differences between them, as well as with other variables, indicate additional value in subscores. Fundamentally, scores are validated for specific inferences, for a given group, at a particular point in time. The PSE-2014 subscores demonstrate meaningful value to report; however, test development protocols limit generalizations across administrations.
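
The alpha values reported above are presumably Cronbach's alpha computed over the subscores of each model. The abstract does not describe the software or the exact procedure used. As a minimal sketch under that assumption, the following Python function computes alpha from a set of subscore columns. The function name `cronbach_alpha` and the toy scores are hypothetical and are not taken from the PSE data.

```python
from statistics import pvariance


def cronbach_alpha(subscores):
    """Cronbach's alpha for a list of subscore columns.

    `subscores` holds one list per subscore, each with one value per
    student, in the same student order. This layout is an assumption
    made for illustration only.
    """
    k = len(subscores)
    if k < 2:
        raise ValueError("alpha needs at least two subscores")
    n = len(subscores[0])
    if any(len(col) != n for col in subscores):
        raise ValueError("all subscores must cover the same students")

    # Total score per student is the sum across subscores.
    totals = [sum(col[i] for col in subscores) for i in range(n)]

    sum_of_subscore_variances = sum(pvariance(col) for col in subscores)
    total_variance = pvariance(totals)
    if total_variance == 0:
        raise ValueError("total scores have zero variance")

    # alpha = k / (k - 1) * (1 - sum of subscore variances / total variance)
    return (k / (k - 1)) * (1 - sum_of_subscore_variances / total_variance)


# Hypothetical toy data: two subscores for four students.
toy = [[1, 2, 3, 4], [2, 1, 4, 3]]
print(cronbach_alpha(toy))
```

Because alpha depends only on the ratio of variances, using population or sample variance gives the same result as long as the choice is consistent for both the subscores and the total.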

