Building: Pinnacle
Room: 3F-Port of Hong Kong
Date: 2016-07-02 11:00 AM – 12:30 PM
Last modified: 2016-05-22
Abstract
Introduction & Objectives
This study considers methods for automatically selecting the degree of postsmoothing for equipercentile equating functions. The selection of postsmoothing degrees has received little attention in equating applications, both because postsmoothing lacks an inferential statistical framework and because practice relies on descriptive, judgmental selections that are presumably difficult to automate. This simulation study was therefore designed to examine how postsmoothing affects the accuracy of scaled and equated functions when the postsmoothing degree is chosen by fixed and by automated methods.
Design/Methodology
For the simulations, populations were defined to represent four equating situations involving test forms with long, medium, and short ranges of raw scores and scale scores. In each simulation, random samples of 1,000 and of 4,000 were drawn from the population distributions, equipercentile equating functions were estimated in the samples, and the estimated functions were postsmoothed with degrees chosen by several criteria recommended in other studies: matching of moments, smoothness, reducing gaps in the rounded scale score ranges, and reducing differences between the observed and smoothed functions relative to +/- standard error bands.
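One replication of this kind of design can be sketched as below. This is a minimal illustration under stated assumptions, not the study's procedure. The abstract does not name the smoother, so a centered moving average is swapped in for it to keep the sketch dependency-free, with the window half-width playing the role of the postsmoothing degree. The binomial populations, the bootstrap standard errors, the restriction of the band check to percentile ranks between 0.5 and 99.5, and all function names are hypothetical choices made for illustration.

```python
import random
import statistics
from bisect import bisect_left
from itertools import accumulate

def cum_props(scores, max_score):
    """Cumulative proportion F(x) at each integer score 0..max_score."""
    counts = [0] * (max_score + 1)
    for s in scores:
        counts[s] += 1
    return [c / len(scores) for c in accumulate(counts)]

def equipercentile(x_scores, y_scores, max_score):
    """Form Y equivalent of each integer Form X score (uniform continuization)."""
    fx = cum_props(x_scores, max_score)
    fy = cum_props(y_scores, max_score)
    equated = []
    for x in range(max_score + 1):
        below = fx[x - 1] if x > 0 else 0.0
        p = (below + fx[x]) / 2                  # percentile rank of x as a proportion
        y = min(bisect_left(fy, p), max_score)   # smallest y with F_Y(y) >= p
        y_below = fy[y - 1] if y > 0 else 0.0
        mass = fy[y] - y_below
        frac = (p - y_below) / mass if mass > 0 else 0.5
        equated.append(y - 0.5 + frac)
    return equated

def moving_average(values, half_width):
    """Stand-in smoother (assumption): centered moving average."""
    n = len(values)
    out = []
    for i in range(n):
        w = min(half_width, i, n - 1 - i)        # shrink the window symmetrically at the ends
        window = values[i - w: i + w + 1]
        out.append(sum(window) / len(window))
    return out

def bootstrap_se(x_scores, y_scores, max_score, reps, rng):
    """Bootstrap standard error of the unsmoothed equating at each score point."""
    draws = []
    for _ in range(reps):
        xb = rng.choices(x_scores, k=len(x_scores))
        yb = rng.choices(y_scores, k=len(y_scores))
        draws.append(equipercentile(xb, yb, max_score))
    return [statistics.stdev(col) for col in zip(*draws)]

def select_degree(raw, se, candidates, keep, band=1.0):
    """Largest candidate degree whose smoothed curve stays within
    +/- band * SE of the raw curve at every score index in `keep`."""
    chosen = 0                                   # 0 means no smoothing
    for d in sorted(candidates):
        smooth = moving_average(raw, d)
        if all(abs(smooth[i] - raw[i]) <= band * se[i] for i in keep):
            chosen = d
    return chosen

def draw_sample(n, max_score, p_correct, rng):
    """Hypothetical population: number-correct scores on max_score equal items."""
    return [sum(rng.random() < p_correct for _ in range(max_score)) for _ in range(n)]

rng = random.Random(2016)
MAX_SCORE = 20
x = draw_sample(1000, MAX_SCORE, 0.60, rng)
y = draw_sample(1000, MAX_SCORE, 0.65, rng)

raw = equipercentile(x, y, MAX_SCORE)
se = bootstrap_se(x, y, MAX_SCORE, reps=50, rng=rng)

# Check the band only where Form X percentile ranks fall in [0.5, 99.5] (assumption).
fx = cum_props(x, MAX_SCORE)
pr = [100 * ((fx[i - 1] if i > 0 else 0.0) + fx[i]) / 2 for i in range(MAX_SCORE + 1)]
keep = [i for i, r in enumerate(pr) if 0.5 <= r <= 99.5]

degree = select_degree(raw, se, candidates=[0, 1, 2, 3], keep=keep)
smoothed = moving_average(raw, degree)
print(degree, [round(v, 2) for v in smoothed])
```

Repeating the block from `draw_sample` onward over many replications, and comparing each smoothed function with the equating computed from the populations themselves, would give the bias and random variability attributable to a given selection criterion.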
Results & Conclusions
Results show that the selection criteria differ in how much postsmoothing they select. These differences were directly associated with the bias and random variability of the equating results: criteria that tended to select more postsmoothing produced more biased but less variable equating functions than criteria that tended to select less postsmoothing. Results were further differentiated by sample size and test length (i.e., more smoothness and greater reduction of random error would be preferred with smaller samples and/or longer tests, which have less data per score point). The results indicate which postsmoothing selection criteria are most useful for equating accuracy in practice.