IRT Model Fit Software
Item response theory (IRT) is a collection of statistical models and methods used for questionnaire development, evaluation, and scoring. IRT models describe, in probabilistic terms, the relationship between a person's response to a survey question and his or her standing on the construct being measured by the survey. These measured constructs include any latent (i.e., unobservable) variable, such as depression, fatigue, or pain, that requires multiple survey items to estimate a person's level on the construct. One of the fundamental assumptions for application of IRT methods is that the IRT models fit the data.
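To make the "probabilistic terms" concrete, the following is a minimal sketch of how one common polytomous IRT model, Samejima's graded response model, turns a person's latent level into response-category probabilities for a single item. The choice of model, the function name grm_category_probs, and the parameter values are illustrative assumptions and are not taken from the software described on this page.

```python
import math


def grm_category_probs(theta, a, thresholds):
    """Category probabilities for one item under the graded response model.

    theta      -- person's level on the latent construct (hypothetical value)
    a          -- item discrimination (hypothetical value)
    thresholds -- ordered category thresholds b_1 < b_2 < ... (hypothetical)
    """
    # Cumulative probability of responding in category k or higher.
    # The lowest category is always reached (1.0); nothing exceeds the top (0.0).
    cum = [1.0]
    cum += [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cum += [0.0]
    # Probability of each category is the difference of adjacent cumulatives.
    return [cum[k] - cum[k + 1] for k in range(len(thresholds) + 1)]


# A four-category item evaluated for a person at the middle of the scale.
probs = grm_category_probs(theta=0.0, a=1.5, thresholds=[-1.0, 0.0, 1.0])
```

Here probs holds one probability per response category, and the values sum to 1. Raising theta shifts probability toward the higher categories, which is the relationship between response and latent standing that an IRT model is meant to capture.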
A range of indices has been created for examining the fit of various IRT models to item response data, mostly dichotomous response data. However, no single fit index has been universally accepted or routinely applied in educational, psychological, and health outcomes measurement. The performance of these indices varies with sample size, model type, number of items, and the properties of the items in the data. The lack of a standardized set of fit indices that can be applied to a range of IRT models estimated by various IRT software programs has limited the acceptance of these powerful measurement tools in health outcomes research.
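Many item-fit indices share one basic idea: group respondents, then compare the observed response counts in each group with the counts the fitted model predicts. The sketch below computes a generic Pearson chi-square and a likelihood-ratio statistic from such a comparison. It is only an illustration under stated assumptions: the function name pearson_x2_and_g2, the grouping, and the example counts are hypothetical, every expected proportion is assumed to be greater than zero, and the code does not reproduce any particular published index.

```python
import math


def pearson_x2_and_g2(observed, expected_props):
    """Compare observed counts with model-implied proportions.

    observed[g][k]       -- number of respondents in group g choosing category k
    expected_props[g][k] -- model-implied proportion for that cell (assumed > 0)
    Returns (X2, G2) summed over all groups and categories.
    """
    x2 = 0.0
    g2 = 0.0
    for obs_row, exp_row in zip(observed, expected_props):
        n_g = sum(obs_row)  # respondents in this group
        for o, p in zip(obs_row, exp_row):
            e = n_g * p  # expected count under the model
            x2 += (o - e) ** 2 / e
            if o > 0:  # empty cells contribute nothing to G2
                g2 += 2.0 * o * math.log(o / e)
    return x2, g2


# Hypothetical data: one group of 100 respondents, three response categories.
x2, g2 = pearson_x2_and_g2([[20, 30, 50]], [[0.1, 0.3, 0.6]])
```

Both statistics equal zero when the observed counts match the model exactly and grow as the discrepancy increases. Turning them into a formal test requires appropriate degrees of freedom, and how those are determined differs from one index to another.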
The Outcomes Research Branch contracted with QualityMetric, Inc. to create a SAS program (a compiled macro) that produces a range of indices for testing the fit of IRT models to polytomous response data. The program reads in both the IRT model parameter estimates produced by various IRT software programs (e.g., MULTILOG, PARSCALE, WINSTEPS) and the individual response patterns. It returns a range of fit statistics, including extensions of the S-X2 and S-G2 tests to polytomous items and the X2* statistic. To help visualize misfit, the program also plots observed against expected responses. The program and its accompanying documentation may be downloaded here:
Last Modified: 03 Sep 2013