SUBSCORE REPORTING FOR DOUBLE-CODED INNOVATIVE ITEMS EMBEDDED IN MULTIPLE CONTEXTS

dc.contributor.advisor: Jiao, Hong [en_US]
dc.contributor.author: Li, Chen [en_US]
dc.contributor.department: Measurement, Statistics and Evaluation [en_US]
dc.contributor.publisher: Digital Repository at the University of Maryland [en_US]
dc.contributor.publisher: University of Maryland (College Park, Md.) [en_US]
dc.date.accessioned: 2018-07-11T05:33:34Z
dc.date.available: 2018-07-11T05:33:34Z
dc.date.issued: 2018 [en_US]
dc.description.abstract: Reporting subscores is a prevalent practice in standardized tests to provide diagnostic information for learning and instruction. Previous research has developed various methods for reporting subscores (e.g., de la Torre & Patz, 2005; Wainer et al., 2001; Wang, Chen, & Cheng, 2004; Yao & Boughton, 2007; Yen, 1987). However, the existing methods are not suitable for reporting subscores for a test with innovative item types, such as double-coded items and paired stimuli. This study proposes a two-parameter doubly testlet model with internal restrictions on the item difficulties (2PL-DT-MIRID) to report subscores for a test with double-coded items embedded in paired-testlets. The proposed model is based on the doubly testlet model proposed by Jiao and Lissitz (2014) and the MIRID (Butter, De Boeck, & Verhelst, 1998). The proposed model has four major advantages in reporting subscores: (a) it reports subscores for a test with double-coded items in complex scenario structures, (b) it reports subscores designed for content clustering, which is more common than subscores based on construct dimensionality in standardized tests, (c) it is computationally less challenging than Multidimensional Item Response Theory (MIRT) models when estimating subscores, and (d) it can be used to conduct Item Response Theory (IRT) based number-correct scoring (NCS; Yen, 1984a). A simulation study is conducted to evaluate model parameter recovery, subscore estimation, and subscore reliability. The simulation study manipulates three factors: (a) the magnitude of testlet effect variation, (b) the correlation between testlet effects for the dual testlets, and (c) the percentage of double-coded items in the test. Further, the study compares the proposed model with other underspecified models in terms of model parameter estimation and model fit.
The results of the simulation study show that the proposed 2PL-DT-MIRID generally yields more accurate model parameter and subscore estimates when the testlet effect variation is small, the dual testlets are weakly correlated, and there are more double-coded items in a test. Across the study conditions, the proposed model outperforms the competing models in model parameter estimation. The reliability estimates from models ignoring dual testlets are spuriously inflated; the 2PL-DT-MIRID produces higher overall score reliability and subscore reliability than models ignoring double-coded items in most study conditions. In terms of model fit, none of the model fit indices investigated in this study (i.e., AIC, BIC, and DIC) achieves a satisfactory rate of identifying the proposed true model as the best-fitting model. [en_US]
dc.identifier: https://doi.org/10.13016/M25H7BX46
dc.identifier.uri: http://hdl.handle.net/1903/20732
dc.language.iso: en [en_US]
dc.subject.pqcontrolled: Educational tests & measurements [en_US]
dc.subject.pquncontrolled: Bayesian estimation [en_US]
dc.subject.pquncontrolled: double-coded items [en_US]
dc.subject.pquncontrolled: paired-testlets [en_US]
dc.subject.pquncontrolled: reliability [en_US]
dc.subject.pquncontrolled: subscore reporting [en_US]
dc.title: SUBSCORE REPORTING FOR DOUBLE-CODED INNOVATIVE ITEMS EMBEDDED IN MULTIPLE CONTEXTS [en_US]
dc.type: Dissertation [en_US]

Files

Original bundle
Name: Li_umd_0117E_19021.pdf
Size: 6.05 MB
Format: Adobe Portable Document Format