UMD Theses and Dissertations

Permanent URI for this collection: http://hdl.handle.net/1903/3

New submissions to the thesis/dissertation collections are added automatically as they are received from the Graduate School. Currently, the Graduate School deposits all theses and dissertations from a given semester after the official graduation date. This means that there may be up to a 4 month delay in the appearance of a given thesis/dissertation in DRUM.

More information is available at Theses and Dissertations at University of Maryland Libraries.

Search Results

Now showing 1 - 7 of 7

Detecting Local Item Dependence in Polytomous Adaptive Data
(2011) Mislevy, Jessica Lynn; Harring, Jeffrey R.; Rupp, Andre A.; Measurement, Statistics and Evaluation; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
A rapidly expanding arena for item response theory (IRT) is in attitudinal and health-outcomes survey applications, often with polytomous items. In particular, there is interest in computer adaptive testing (CAT). Meeting model assumptions is necessary to realize the benefits of IRT in this setting, however. Although local item dependence (LID) has been investigated both for polytomous items in fixed-form settings and for dichotomous items in CAT settings, there have been no publications applying LID detection methodology to polytomous items in CAT despite its central importance to these applications. The research documented herein investigates, via a simulation study, the extension of two widely used methods of LID detection, Yen's Q₃ statistic and Pearson's X² statistic, to this context. The simulation design and results are contextualized throughout with a real item bank and data set of this type from the Patient-Reported Outcomes Measurement Information System (PROMIS).
AN INVESTIGATION OF ASSESSMENT AND IEP DEVELOPMENT IN THE FUNCTIONING AREAS OF SOCIAL, BEHAVIORAL, AND COMMUNICATION OF HIGH SCHOOL STUDENTS WITH AUTISM SPECTRUM DISORDERS
(2011) Sigerseth, Susan Carol; Kohl, Frances L; Special Education; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Autism Spectrum Disorders (ASD) are life-long disabilities that manifest as impairments in social skills and communication skills and as restricted, repetitive behaviors (DSM-IV, 1994). The purpose of this study was to investigate assessment and Individualized Education Program (IEP) development among high school students with an ASD, focusing on the assessment of social, behavioral, and communication skills. The design of this study was descriptive, utilizing structured record reviews. Assessment selections and outcomes leading to IEP development were documented for 16 high school students with an ASD during the 2009-2010 school year. The assessment records of each participant were examined to determine which assessment domains had been requested and assessed, extracting information on social, behavioral, and communication skills, and which assessment instruments were used. Additionally, the IEP was examined to determine what instructional goals and objectives were written in the areas of social, behavioral, and communication skills. Variability among student records made retrieving assessment data difficult. Assessments that had been requested were not always given, and assessments were given that had not been requested. Assessment domains did not yield the basic information they were intended to provide. Although on average half of the students' IEPs contained social, behavioral, and/or communication goals, these goals and objectives neither were rigorous enough for the academic level of the student nor led to the independence needed to become successful, productive adults.
Exploring the Full-information Bifactor Model in Vertical Scaling with Construct Shift
(2011) Li, Ying; Lissitz, Robert W.; Measurement, Statistics and Evaluation; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
To address the lack of attention to construct shift in IRT vertical scaling, a bifactor model is proposed to estimate the common dimension for all grades and the grade-specific dimensions. The bifactor model estimation accuracy is evaluated through a simulation study with manipulated factors of percent of common items, sample size, and degree of construct shift. In addition, the unidimensional IRT (UIRT) estimation model that ignores construct shift is examined to represent the current practice for IRT vertical scaling; comparisons of the parameter estimation accuracy of the bifactor and UIRT models are discussed. The major findings of the simulation study are (1) bifactor models are well recovered overall, even though item discrimination parameters are underestimated to a small degree; (2) item discrimination parameters are overestimated in UIRT models due to the effect of construct shift; (3) person parameters of UIRT models are less accurately estimated than those of bifactor models, and the accuracy decreases as the degree of construct shift increases; (4) group mean parameter estimates of UIRT models are less accurate than those of bifactor models, and a large effect due to construct shift is found for the group mean parameter estimates of UIRT models. The real data analysis provides an illustration of how bifactor models can be applied to a problem involving vertical scaling with construct shift. General procedures for testing practice are also discussed.
The Relationship Between Temperament and Emotion Understanding in Preschoolers: An Examination of the Influence of Emotionality, Self-Regulation, and Attention
(2010) Genova-Latham, Maria de los Angeles; Teglasi, Hedwig; Counseling and Personnel Services; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
This study examined the links between temperament and emotion understanding in preschoolers. Temperamental facets of emotionality, attention, and self-regulation were utilized. Emotion understanding is the ability to identify feelings based on facial expressions, behaviors, or situations. Historically, temperamental variables and emotion understanding have been poorly defined, impacting the clarity of research findings. The Structured Temperament Interview (STI) measured facets of temperament and the Emotion Comprehension Test (ECT) examined emotion understanding. Both measures offer clear definitions of their associated constructs. Additionally, principal components analyses were run on STI dimensions. Correlational analyses were run on the STI and Child Behavior Questionnaire (CBQ), an established measure of temperament, to further determine the STI's utility as a measure of temperament. Results, though mixed, suggest that components of Attention and Emotionality from the STI explain a great deal of the variance in ECT scale scores.
IMPACTS OF LOCAL ITEM DEPENDENCE OF TESTLET ITEMS WITH THE MULTISTAGE TESTS FOR PASS-FAIL DECISIONS
(2010) Lu, Ru; Jiao, Hong; Measurement, Statistics and Evaluation; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
The primary purpose of this study is to investigate the impact of the local item dependence (LID) of testlet items on the performance of multistage tests (MSTs) that make pass/fail decisions. In this study, LID is simulated in testlet items. Testlet items are those that physically share the same stimulus. In the MST design, the proportion of testlet items is a manipulated factor. Other studied factors include testlet item position, LID magnitude, and test length. The second purpose of this study is to use a testlet response model to account for LID in the context of MSTs. The possible gains of using a testlet model over a standard IRT model are evaluated. The results indicate that under the simulated conditions, the testlet item position has a minimal effect on the precision of ability estimation and decision accuracy, while the item pool structure (the proportion of testlet items), the LID magnitude, and test length have fairly substantial effects. Ignoring the LID effects and fitting a unidimensional 3PL model results in a loss of ability estimation precision and decision accuracy. Ability estimation is adversely impacted by a larger proportion of testlet items, moderate and high LID levels, and short test lengths. As the LID condition gets worse (large LID magnitude or a large proportion of testlet items), the decision accuracy rates decrease. Fitting a 3PL testlet response model does not reach the same level of ability estimation precision under all simulation conditions. In fact, the results show that ignoring LID and fitting the 3PL model inflates the apparent ability estimation precision and decision accuracy.
Measuring Teaching Practices: Does A Self-Report Measure Of Instruction Predict Student Achievement?
(2010) Berger, Jill M.; Gottfredson, Gary D.; Counseling and Personnel Services; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Teachers affect student achievement. Measuring what makes teachers "effective" is a challenge without a clear definition of the construct or constructs involved. Self-reports cost little and allow for data collection from large samples, but the reliability and validity of self-report measures for studying teacher effectiveness have not been adequately examined. This study explored the utility of a self-report measure of instruction (Instructional Practices Scale). Hierarchical linear modeling was used to examine the effects of the scale on students' reading and math standardized test scores and report card grades. Although the scale showed small to moderate relationships with teacher characteristics, results suggested little predictive validity and little discriminant validity. Further, the effects of teacher-reported instruction on achievement were not dependent on students' entering level of achievement. When measuring loosely defined constructs such as "effective instruction," the cost of using a self-report measure may outweigh the benefits.
REWEIGHTING DATA IN THE SPIRIT OF TUKEY: USING BAYESIAN POSTERIOR PROBABILITIES AS RASCH RESIDUALS FOR STUDYING MISFIT
(2010) Dardick, William Ross; Mislevy, Robert J; Measurement, Statistics and Evaluation; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
A new variant of the iterative "data = fit + residual" data-analytical approach described by Mosteller and Tukey is proposed and implemented in the context of item response theory psychometric models. Posterior probabilities from a Bayesian mixture model of a Rasch item response theory model and an unscalable latent class are expressed as weights for the original data. The data weighted by the units' posterior probabilities for the unscalable class are used for further exploration of structures. Data were generated in accordance with departures from the Rasch model that have been studied in the literature. Factor analysis models are compared with the original data and the data as reweighted by the posterior probabilities for the unscalable class. Eigenvalues are compared with Horn's parallel analysis corresponding to each class of factor models to determine the number of factors in a dataset. In comparing the two weighted data sets, the Rasch-weighted data and the data considered unscalable, clear differences are manifest. Pattern types are detected for the Rasch baselines that differ from those of random or systematic contamination. The Rasch baseline patterns are strongest around item difficulties that are closest to the mean generating value of the θ's. Patterns in baseline conditions are weaker as they depart from an item difficulty of zero and move toward extreme values of ±6. The random contamination factor patterns are typically flat and near zero regardless of the item difficulty with which they are associated. Systematic contamination using reversed Rasch generated data produces alternate patterns to the Rasch baseline condition and in some conditions shows an opposite effect when compared to the Rasch patterns. Differences can also be detected within the residually weighted data between the Rasch generated subtest and contaminated subtest.
In conditions that have identified factors, the Rasch subtest often has Rasch patterns and the contaminated subtest has some form of random/flat or systematic/reversed pattern.
