Human Development & Quantitative Methodology

Permanent URI for this community: http://hdl.handle.net/1903/2248

The departments within the College of Education were reorganized and renamed as of July 1, 2011. This department incorporates the former Department of Measurement, Statistics & Evaluation, the former Department of Human Development, and the Institute for Child Study.

Search Results

Now showing 1 - 4 of 4
  • Item
    CROSS-CLASSIFIED MODELING OF DUAL LOCAL ITEM DEPENDENCE
    (2014) Xie, Chao; Jiao, Hong; Measurement, Statistics and Evaluation; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    Previous studies have mainly investigated a single source of local item dependence (LID). In some cases, however, such as scenario-based science assessments, LID may be caused by two sources simultaneously. In this study, LID caused by two factors simultaneously is termed dual local item dependence (DLID). This study proposed a cross-classified model to account for DLID. Two simulation studies were conducted with the primary purpose of evaluating the performance of the proposed cross-classified model; data sets with DLID were simulated with both testlet effects and content clustering effects. The second purpose of this study was to investigate the potential factors affecting the need to use the more complex cross-classified modeling of DLID rather than simplified multilevel modeling of LID that ignores the cross-classification structure. For both simulation studies, five factors were manipulated: sample size, number of testlets, testlet length, magnitude of the testlet effects represented by their standard deviations (SDs), and magnitude of the content clustering effects represented by their SDs. The two studies differed in that simulation study 1 constrained the SDs of the testlet effects and of the content clustering effects to be equal across testlets and content areas, respectively, whereas simulation study 2 relaxed this constraint by allowing mixed SDs for both kinds of effects. Results of both simulation studies indicated that the proposed cross-classified model yielded more accurate parameter recovery, with smaller estimation errors for item difficulty, person ability, and random-effect SD parameters, than the two multilevel models and the Rasch model, which ignored one or both item clustering effects. Two manipulated variables, the magnitude of the testlet effects and the magnitude of the content clustering effects, determined the necessity of using the more complex cross-classified model over the simplified multilevel models and the Rasch model: the larger these magnitudes, the more necessary the proposed cross-classified model becomes. Limitations are discussed and suggestions for future research are presented at the end.
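    A plausible form for the cross-classified model described in this abstract, with notation assumed here rather than taken from the dissertation: person p's response to item i reflects ability, item difficulty, a person-specific random effect for the item's testlet t(i), and a person-specific random effect for its content area c(i).

      \operatorname{logit} P(y_{pi} = 1) = \theta_p - b_i + \gamma_{p,t(i)} + \eta_{p,c(i)},
      \qquad \gamma_{p,t} \sim N(0, \sigma_\gamma^2), \quad \eta_{p,c} \sim N(0, \sigma_\eta^2)

    Under this reading, the manipulated testlet-effect and content-clustering magnitudes correspond to \sigma_\gamma and \sigma_\eta; fixing either at zero collapses the model to a single-source multilevel LID model, and fixing both at zero recovers the Rasch model.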
  • Item
    Detecting Local Item Dependence in Polytomous Adaptive Data
    (2011) Mislevy, Jessica Lynn; Harring, Jeffrey R.; Rupp, Andre A.; Measurement, Statistics and Evaluation; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    A rapidly expanding arena for item response theory (IRT) is attitudinal and health-outcomes survey applications, often with polytomous items. In particular, there is interest in computer adaptive testing (CAT). However, meeting model assumptions is necessary to realize the benefits of IRT in this setting. Although local item dependence (LID) has been investigated both for polytomous items in fixed-form settings and for dichotomous items in CAT settings, no published work has applied LID detection methodology to polytomous items in CAT, despite its central importance to these applications. The research documented herein investigates the extension of two widely used methods of LID detection, Yen's Q3 statistic and Pearson's X2 statistic, in this context via a simulation study. The simulation design and results are contextualized throughout with a real item bank and data set of this type from the Patient-Reported Outcomes Measurement Information System (PROMIS).
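    A minimal sketch of how Yen's Q3 might be computed for polytomous CAT data of the kind studied here, assuming model-expected item scores have already been obtained; the function and argument names are illustrative, not taken from the dissertation or PROMIS tooling.

      # Yen's Q3: correlate, across examinees, the residuals (observed minus
      # model-expected score) for each pair of items. In CAT each examinee
      # sees only a subset of items, so residuals are correlated pairwise
      # over the examinees who were administered both items.
      import numpy as np

      def q3_matrix(responses, expected):
          """responses, expected: (n_persons, n_items) float arrays with
          NaN where an item was not administered. Returns the symmetric
          Q3 matrix of residual correlations."""
          resid = responses - expected
          n_items = resid.shape[1]
          q3 = np.full((n_items, n_items), np.nan)
          for i in range(n_items):
              for j in range(i + 1, n_items):
                  seen_both = ~np.isnan(resid[:, i]) & ~np.isnan(resid[:, j])
                  if seen_both.sum() > 2:
                      r = np.corrcoef(resid[seen_both, i], resid[seen_both, j])[0, 1]
                      q3[i, j] = q3[j, i] = r
          return q3

    Large positive Q3 values for an item pair flag possible LID; the sparse, overlapping administration pattern of CAT is what makes the pairwise handling above necessary.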
  • Item
    IMPACTS OF LOCAL ITEM DEPENDENCE OF TESTLET ITEMS WITH THE MULTISTAGE TESTS FOR PASS-FAIL DECISIONS
    (2010) Lu, Ru; Jiao, Hong; Measurement, Statistics and Evaluation; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    The primary purpose of this study is to investigate the impact of the local item dependence (LID) of testlet items on the performance of multistage tests (MSTs) that make pass/fail decisions. In this study, LID is simulated in testlet items, that is, items that physically share the same stimulus. In the MST design, the proportion of testlet items is a manipulated factor; other studied factors include testlet item position, LID magnitude, and test length. The second purpose of this study is to use a testlet response model to account for LID in the context of MSTs and to evaluate the possible gains of a testlet model over a standard IRT model. The results indicate that under the simulated conditions, testlet item position has a minimal effect on the precision of ability estimation and on decision accuracy, while the item pool structure (the proportion of testlet items), the LID magnitude, and the test length have fairly substantial effects. Ignoring the LID effects and fitting a unidimensional 3PL model results in a loss of ability estimation precision and decision accuracy. Ability estimation is adversely affected by larger proportions of testlet items, moderate and high LID levels, and short test lengths. As the LID condition worsens (larger LID magnitude or a larger proportion of testlet items), the decision accuracy rates decrease. Fitting a 3PL testlet response model does not reach the same level of ability estimation precision under all simulation conditions; in fact, ignoring LID and fitting the standard 3PL model yields inflated estimates of ability estimation precision and decision accuracy.
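    A hedged sketch of how testlet LID of this kind is commonly simulated, using a person-by-testlet random effect added to a 3PL model; the parameter names are illustrative, and the dissertation's generating model may differ in detail.

      # Simulate 3PL testlet responses: gamma[p, t] ~ N(0, sd_lid^2) induces
      # dependence among items sharing a testlet; sd_lid = 0 recovers the
      # locally independent 3PL.
      import numpy as np

      rng = np.random.default_rng(0)

      def simulate_testlet_3pl(theta, a, b, c, testlet_of_item, sd_lid):
          """theta: (P,) abilities; a, b, c: (I,) item parameters;
          testlet_of_item: (I,) integer testlet index per item."""
          P, I = theta.size, a.size
          gamma = rng.normal(0.0, sd_lid, size=(P, testlet_of_item.max() + 1))
          eta = a * (theta[:, None] - b - gamma[:, testlet_of_item])
          prob = c + (1 - c) / (1 + np.exp(-eta))
          return (rng.random((P, I)) < prob).astype(int)

    Fitting the standard 3PL to data generated this way treats the gamma variance as if it were measurement information, which is consistent with the inflated precision estimates reported above.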
  • Item
    REWEIGHTING DATA IN THE SPIRIT OF TUKEY: USING BAYESIAN POSTERIOR PROBABILITIES AS RASCH RESIDUALS FOR STUDYING MISFIT
    (2010) Dardick, William Ross; Mislevy, Robert J; Measurement, Statistics and Evaluation; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    A new variant of the iterative "data = fit + residual" data-analytic approach described by Mosteller and Tukey is proposed and implemented in the context of item response theory psychometric models. Posterior probabilities from a Bayesian mixture of a Rasch item response theory model and an unscalable latent class are expressed as weights for the original data. The data, weighted by each unit's posterior probability of belonging to the unscalable class, are used for further exploration of structure. Data were generated in accordance with departures from the Rasch model that have been studied in the literature. Factor analysis models are compared between the original data and the data reweighted by the posterior probabilities for the unscalable class. Eigenvalues are compared with Horn's parallel analysis for each class of factor models to determine the number of factors in a dataset. In comparing the two weighted data sets, the Rasch-weighted data and the unscalable-weighted data, clear differences are manifest. Pattern types detected for the Rasch baselines differ from those of random or systematic contamination. The Rasch baseline patterns are strongest for item difficulties closest to the mean generating value of θ. Patterns in the baseline conditions weaken as item difficulty departs from zero toward extreme values of ±6. The random-contamination factor patterns are typically flat and near zero regardless of the item difficulty with which they are associated. Systematic contamination using reversed Rasch-generated data produces patterns that differ from the Rasch baseline condition and, in some conditions, shows an effect opposite to the Rasch patterns. Differences can also be detected within the residually weighted data between the Rasch-generated subtest and the contaminated subtest. In conditions with identified factors, the Rasch subtest often shows Rasch patterns, while the contaminated subtest shows some form of random/flat or systematic/reversed pattern.
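    A minimal sketch of the reweighting idea under one reading of this abstract: posterior probabilities for the unscalable class, obtained from a previously fitted Rasch-plus-unscalable mixture (not reproduced here), serve as case weights, and factor structure is compared between the original and reweighted data. The names below are illustrative assumptions.

      # Reweight responses by each person's posterior probability of the
      # unscalable class, then compare eigenvalue patterns of the item
      # correlation matrices (e.g., against Horn's parallel analysis).
      import numpy as np

      def reweighted_eigenvalues(X, posterior_unscalable):
          """X: (n_persons, n_items) responses; posterior_unscalable:
          (n_persons,) class probabilities in [0, 1]."""
          Xw = X * posterior_unscalable[:, None]

          def eigs(M):
              return np.linalg.eigvalsh(np.corrcoef(M, rowvar=False))[::-1]

          return eigs(X), eigs(Xw)

    Weighting instead by one minus the posterior probability would emphasize the Rasch class, which matches the comparison of the two weighted data sets described above.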