
A General Method for Estimating the Classification Reliability of Complex Decisions Based on Configural Combinations of Multiple Assessment Scores

dc.contributor.advisor: Mislevy, Robert
dc.contributor.author: Douglas, Karen
dc.description.abstract: This study presents a general method for estimating the classification reliability of complex decisions based on multiple scores from a single test administration. The proposed method consists of four steps that can be applied to a variety of measurement models and configural rules for combining test scores:

Step 1: Fit a measurement model to the observed data.
Step 2: Simulate replicate distributions of plausible observed scores based on the measurement model.
Step 3: Construct a contingency table showing the congruence between true and replicate scores (for decision accuracy) and between two replicate scores (for decision consistency).
Step 4: Calculate measures to characterize agreement in the contingency tables.

Using a classical test theory model, a simulation study explores the effects of the number of tests, the strength of the relationships among tests, and the number of opportunities to pass on classification accuracy and consistency. The method is then applied to actual data from the GED Testing Service to illustrate its utility for informing practical decisions. Simulation results support the validity of the method for estimating classification reliability, and the method provides credible estimates of classification reliability for the GED Tests. Applying configural rules yields complex findings that sometimes differ between classification accuracy and consistency. These unexpected findings support the value of using the method to explore classification reliability as a means of improving decision rules.

Highlighted findings: 1) The compensatory rule (in which test scores are added) performs consistently well across almost all conditions; 2) conjunctive and complementary rules frequently show opposite results; 3) including more tests in the decision rule influences classification reliability differently depending on the rule; 4) combining scores from highly related tests increases classification reliability; 5) providing multiple opportunities to pass yields mixed results. Future studies are suggested to explore the use of other measurement models, varying levels of test reliability, models of multiple attempts in which learning occurs between test administrations, and in-depth study of incorrectly classified examinees.
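The four-step method described in the abstract can be sketched as a small Monte Carlo simulation. This is an illustrative reconstruction, not the author's code: the sample size, per-test reliability, cut score, and the `conjunctive`/`compensatory` rule names are all hypothetical parameters chosen for the example, and Step 1 is assumed already done (standard-normal true scores with error variance implied by the reliability under a classical test theory model).

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_classification_reliability(
    n_examinees=10_000, n_tests=3, reliability=0.8,
    cut=0.0, rule="conjunctive",
):
    """Hedged sketch of the four-step method under classical test theory:
    observed score = true score + independent normal error."""
    # Step 1 (assumed fitted model): standard-normal true scores; the
    # error SD follows from reliability = var(T) / (var(T) + var(E)).
    true = rng.standard_normal((n_examinees, n_tests))
    err_sd = np.sqrt((1 - reliability) / reliability)

    def decide(scores):
        if rule == "conjunctive":      # must pass every test
            return (scores >= cut).all(axis=1)
        if rule == "compensatory":     # pass on the summed score
            return scores.sum(axis=1) >= cut * n_tests
        raise ValueError(f"unknown rule: {rule}")

    # Step 2: two replicate sets of plausible observed scores.
    obs1 = true + rng.normal(0.0, err_sd, true.shape)
    obs2 = true + rng.normal(0.0, err_sd, true.shape)

    # Steps 3-4: summarize the 2x2 agreement tables as raw agreement
    # (kappa or other agreement indices could be substituted here).
    accuracy = float(np.mean(decide(true) == decide(obs1)))
    consistency = float(np.mean(decide(obs1) == decide(obs2)))
    return accuracy, consistency

acc, con = simulate_classification_reliability()
print(f"accuracy={acc:.3f}, consistency={con:.3f}")
```

Swapping `rule="compensatory"` lets the same skeleton compare decision rules, mirroring the abstract's comparison of configural rules across conditions.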
dc.format.extent: 1639502 bytes
dc.title: A General Method for Estimating the Classification Reliability of Complex Decisions Based on Configural Combinations of Multiple Assessment Scores
dc.contributor.publisher: Digital Repository at the University of Maryland
dc.contributor.publisher: University of Maryland (College Park, Md.)
dc.contributor.department: Measurement, Statistics and Evaluation
dc.subject.pqcontrolled: Education, Tests and Measurements
dc.subject.pqcontrolled: Education, General
dc.subject.pquncontrolled: classification reliability
dc.subject.pquncontrolled: high-stakes testing
dc.subject.pquncontrolled: measurement error
dc.subject.pquncontrolled: GED tests
