Doctor of Philosophy (PhD)
Terry L. Dickinson
Glynn D. Coates
Louis H. Janda
Edward P. Markowski
Differential item functioning (DIF) occurs when an item performs statistically differently for a reference group than for a focal group. DIF threatens the validity of a test and can expose its use to legal challenge in settings such as employee selection. Thus, DIF has important implications for test construction and practice. This research is a Monte Carlo study that compares Item Response Theory (IRT) and three Confirmatory Factor Analysis (CFA) methods for detecting DIF. The three CFA methods were Model Comparison (MC), Modification Indexes (MI), and Modification Indexes-Divided sample (MI-Divided).
The research compared the methods' DIF detection rates for reference and focal groups. Each group consisted of 1000 examinees who responded to 50 items. Nine of the 50 items were designed to show DIF for the two groups. Responses were simulated using a two-parameter logistic model. Three types of DIF were manipulated for the nine items using the logistic model's a and b parameters: (1) DIF on the a (discrimination) parameter only, (2) DIF on the b (difficulty) parameter only, and (3) DIF on both the a and b parameters. In addition, the nine items were designed to cross the magnitudes of the a and b parameters (i.e., low, medium, and high levels of each parameter). The amount of DIF was held constant at .5 through use of Raju's (1988) formula for the area separation between the item characteristic curves of the reference and focal groups.
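The simulation design described above can be sketched as follows. This is an illustrative reconstruction, not the author's code: function and variable names are assumptions, and Raju's (1988) closed-form area measure is replaced here by a simple numerical quadrature over a finite ability range.

```python
import numpy as np

D = 1.7  # scaling constant commonly used with the 2PL model


def p_2pl(theta, a, b):
    """Probability of a correct response under the 2PL model."""
    return 1.0 / (1.0 + np.exp(-D * a * (theta - b)))


def simulate_item(theta, a, b, rng):
    """Simulate dichotomous responses to one item for examinees at theta."""
    return (rng.random(theta.shape) < p_2pl(theta, a, b)).astype(int)


def unsigned_area(a_ref, b_ref, a_foc, b_foc, lo=-4.0, hi=4.0, n=20_001):
    """Trapezoid-rule approximation of the unsigned area between the
    reference- and focal-group ICCs (Raju, 1988, gives a closed form)."""
    theta = np.linspace(lo, hi, n)
    gap = np.abs(p_2pl(theta, a_ref, b_ref) - p_2pl(theta, a_foc, b_foc))
    dt = theta[1] - theta[0]
    return float(np.sum((gap[:-1] + gap[1:]) / 2.0) * dt)


rng = np.random.default_rng(0)
theta = rng.standard_normal(1000)               # 1000 examinees per group
responses = simulate_item(theta, a=1.0, b=0.0, rng=rng)

# DIF on b only: shifting b by .5 with equal a parameters
# yields an area separation of (approximately) .5
area = unsigned_area(a_ref=1.0, b_ref=0.0, a_foc=1.0, b_foc=0.5)
```

With equal a parameters the area reduces to the absolute difference in the b parameters, so the quadrature recovers the .5 separation used in the study; DIF on the a parameter alone produces ICCs that cross, which is one reason such items are harder to detect.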
Results indicated that all of the methods were very good at detecting DIF due to the b parameter (i.e., item difficulty). For DIF on the a parameter and on both the a and b parameters, the IRT and MI-Divided methods yielded significantly higher detection rates than the MC and MI methods. Further, the IRT and MI-Divided methods did not differ in their detection rates for these two types of DIF, and similarly, the MC and MI methods also did not differ in their detection rates. DIF due to both the a and b parameters was the hardest for all methods to detect. Although the MI-Divided method had a high detection rate, it also had false positive rates two to three times greater than expected. Future research on the methods was suggested for variables such as the amount of DIF, sample size, and ability differences between the focal and reference groups.
"Methods of Detecting Differential Item Functioning: A Comparison of Item Response Theory and Confirmatory Factor Analysis"
(2001). Doctor of Philosophy (PhD), Dissertation, Psychology, Old Dominion University, DOI: 10.25777/z1ce-4m45