A Proposed Index to Detect Relative Item Performance when the Focal Group Sample Size is Small

Hansen, Kari

dc.contributor.advisor	Stapleton, Laura M	en_US
dc.contributor.advisor	Jiao, Hong	en_US
dc.contributor.author	Hansen, Kari	en_US
dc.contributor.department	Measurement, Statistics and Evaluation	en_US
dc.contributor.publisher	Digital Repository at the University of Maryland	en_US
dc.contributor.publisher	University of Maryland (College Park, Md.)	en_US
dc.date.accessioned	2018-01-23T06:32:49Z
dc.date.available	2018-01-23T06:32:49Z
dc.date.issued	2017	en_US
dc.description.abstract	When developing educational assessments, ensuring that the test is fair to all groups of examinees is an essential part of the process. The primary statistical method for identifying potential bias in assessments is differential item functioning (DIF) analysis, where DIF refers to differences in performance on a specific test item between two groups, assuming that the ability distributions of the two groups overlap. However, this requirement is less likely to be met when the sample size for the focal group is small. A new index, relative item performance, is proposed to address the issue of small focal group sample sizes without requiring an overlap in ability distributions. This index is calculated by obtaining the effect size of the difference in item difficulty estimates between the two groups. A simulation study was conducted to compare the proposed method with the Mantel-Haenszel test with score group widths and the Differential Item Pair Functioning method in terms of Type I error rates and power. The following factors were manipulated: the sample size of the focal group, the mean of the ability distribution, the amount of DIF, the number of items on the assessment, and the number of items that have different item difficulties. For all three methods, the main factors that affect the Type I error rates are the amount of item contamination, the size of the DIF, the ability mean for the focal group, and the item parameters. The sample size and the number of items were found not to affect the Type I error rates for any of the methods. Because the overall Type I error rate for the RI method is much lower than that of the MH1 and MH2 methods and is not controlled across the simulation factors, power was evaluated only for the MH1 and MH2 methods. The median power of these methods was .203 and .181, respectively.
It is recommended that the MH1 and MH2 methods be used only when the sample size is larger than 100 and in conjunction with expert and cognitive review of the items on the assessment.	en_US
dc.identifier	https://doi.org/10.13016/M2MS3K334
dc.identifier.uri	http://hdl.handle.net/1903/20282
dc.language.iso	en	en_US
dc.subject.pqcontrolled	Educational tests & measurements	en_US
dc.subject.pqcontrolled	Statistics	en_US
dc.subject.pquncontrolled	Differential item functioning	en_US
dc.subject.pquncontrolled	item bias	en_US
dc.subject.pquncontrolled	item response	en_US
dc.subject.pquncontrolled	Mantel-Haenszel	en_US
dc.subject.pquncontrolled	relative item performance	en_US
dc.subject.pquncontrolled	small sample sizes	en_US
dc.title	A Proposed Index to Detect Relative Item Performance when the Focal Group Sample Size is Small	en_US
dc.type	Dissertation	en_US
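
The abstract above compares the proposed index against the Mantel-Haenszel test. As a rough illustration of that baseline only, the sketch below computes the standard Mantel-Haenszel common odds ratio for one item across matched score groups, along with its transformation to the ETS delta scale. It is not taken from the dissertation: the function names, the input layout, and the example counts are assumptions made for illustration, and the score-group-width variants studied in the dissertation are not reproduced here.

```python
import math


def mantel_haenszel_odds_ratio(strata):
    """Mantel-Haenszel common odds ratio for a single item.

    `strata` is a hypothetical input layout: one (a, b, c, d) tuple per
    matched score group, where
      a = reference group, item correct
      b = reference group, item incorrect
      c = focal group, item correct
      d = focal group, item incorrect
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n == 0:
            continue  # an empty score group contributes nothing
        numerator += a * d / n
        denominator += b * c / n
    if denominator == 0:
        raise ValueError("odds ratio is undefined: denominator is zero")
    return numerator / denominator


def mh_d_dif(alpha_mh):
    """Convert the common odds ratio to the ETS delta scale (MH D-DIF)."""
    return -2.35 * math.log(alpha_mh)


if __name__ == "__main__":
    # Made-up counts for two score groups, for illustration only.
    example_strata = [(30, 10, 10, 10), (40, 20, 12, 8)]
    alpha = mantel_haenszel_odds_ratio(example_strata)
    print(alpha, mh_d_dif(alpha))
```

An odds ratio of 1 (MH D-DIF of 0) indicates no difference between groups after matching on score. With a small focal group, many score groups hold few or no focal examinees, which is the situation the abstract describes as motivating the proposed index.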

Files

Original bundle

Name: Hansen_umd_0117E_18469.pdf
Size: 3.8 MB
Format: Adobe Portable Document Format

Collections

UMD Theses and Dissertations
Human Development & Quantitative Methodology Theses and Dissertations