Is Rasch model analysis applicable in small sample size pilot studies for assessing item characteristics? An example using PROMIS pain behavior item bank data

Chen, Wen-Hung; Lenderking, William; Jin, Ying; Wyrwich, Kathleen W.; Gelhorn, Heather; Revicki, Dennis A.

doi:10.1007/s11136-013-0487-5

Published: 03 August 2013

Quality of Life Research, Volume 23, pages 485–493 (2014)

Wen-Hung Chen¹,
William Lenderking¹,
Ying Jin²,
Kathleen W. Wyrwich¹,
Heather Gelhorn¹ &
Dennis A. Revicki¹

Abstract

Purpose

Large samples are generally considered necessary for the Rasch model to yield robust item parameter estimates. Recently, Rasch analysis of small samples has been suggested as a preliminary assessment of items’ psychometric properties. This study evaluates the results of Rasch analysis conducted with small sample sizes.

Methods

Ten PROMIS pain behavior items were used. Random samples of 30, 50, 100, and 250 subjects, and a targeted sample of 30 subjects, were each drawn 10 times from a total of 800 subjects. Rasch analysis was conducted on each of these samples and on the full sample.

Results

In the full sample, there were 104 cases of extreme scores, no null categories, two incorrectly ordered items, and four misfit items. For samples of 250, 100, 50, 30, and targeted 30, the average numbers of extreme scores were 42.2, 17.1, 9.6, 6.1, and 1.2; the average numbers of null categories were 1.0, 3.2, 8.7, 13.4, and 8.3; the average numbers of items with incorrectly ordered item parameters were 0.1, 0.8, 2.9, 4.7, and 3.7; and the average numbers of items with fit residuals exceeding ±2.5 were 0.8, 0.3, 0.1, 0.2, and 0.3, respectively.
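
The resampling design and two of the sample-level counts reported above can be illustrated with a minimal sketch. It is not the authors' procedure: it does not fit a Rasch model, the response data are synthetic, and the function names, the 0-based response coding, the choice of five response categories per item, and sampling without replacement within each draw are all assumptions made for illustration. Here an "extreme score" is taken to mean a respondent with the minimum or maximum possible total score, and a "null category" an item response category that nobody in the sample used.

```python
import random
from statistics import mean


def count_extreme_scores(sample, n_categories):
    """Count respondents whose total score is the minimum (0) or maximum possible."""
    n_items = len(sample[0])
    max_total = n_items * (n_categories - 1)  # assumes responses coded 0..n_categories-1
    return sum(1 for row in sample if sum(row) in (0, max_total))


def count_null_categories(sample, n_categories):
    """Count item-by-category cells that no respondent in the sample used."""
    n_items = len(sample[0])
    nulls = 0
    for j in range(n_items):
        observed = {row[j] for row in sample}
        nulls += n_categories - len(observed)
    return nulls


def resample_summary(responses, sizes, n_draws, n_categories, seed=0):
    """Draw n_draws random subsamples per size and average the two counts."""
    rng = random.Random(seed)
    summary = {}
    for n in sizes:
        extremes, nulls = [], []
        for _ in range(n_draws):
            # Assumption: each draw samples subjects without replacement.
            sample = rng.sample(responses, n)
            extremes.append(count_extreme_scores(sample, n_categories))
            nulls.append(count_null_categories(sample, n_categories))
        summary[n] = {
            "mean_extreme_scores": mean(extremes),
            "mean_null_categories": mean(nulls),
        }
    return summary


# Synthetic stand-in for the item responses: 800 subjects, 10 items,
# 5 hypothetical response categories coded 0-4 (not the PROMIS data).
N_CATEGORIES = 5
data_rng = random.Random(42)
full_sample = [
    [data_rng.randrange(N_CATEGORIES) for _ in range(10)] for _ in range(800)
]

summary = resample_summary(
    full_sample, sizes=(30, 50, 100, 250), n_draws=10, n_categories=N_CATEGORIES
)
for size, stats in summary.items():
    print(size, stats)
```

With real item responses in place of `full_sample`, each subsample would then be passed to Rasch software to obtain the item parameter ordering and fit residuals; that estimation step is deliberately left out of this sketch.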

Conclusions

Rasch analysis based on small samples (≤50) identified a greater number of items with incorrectly ordered parameters than analysis based on larger samples (≥100). However, fewer items were identified as misfitting. Results from small samples led to conclusions opposite to those based on larger samples. Rasch analysis based on small samples should be used for exploratory purposes with extreme caution.

Author information

Authors and Affiliations

Center for Health Outcomes Research, United BioSource Corporation, 7101 Wisconsin Ave., Suite 600, Bethesda, MD, 20814, USA
Wen-Hung Chen, William Lenderking, Kathleen W. Wyrwich, Heather Gelhorn & Dennis A. Revicki
Association of American Medical Colleges, 2450 N St NW, Washington, DC, 20037, USA
Ying Jin

Corresponding author

Correspondence to Wen-Hung Chen.

About this article

Cite this article

Chen, WH., Lenderking, W., Jin, Y. et al. Is Rasch model analysis applicable in small sample size pilot studies for assessing item characteristics? An example using PROMIS pain behavior item bank data. Qual Life Res 23, 485–493 (2014). https://doi.org/10.1007/s11136-013-0487-5

Accepted: 18 July 2013
Published: 03 August 2013
Issue Date: March 2014
DOI: https://doi.org/10.1007/s11136-013-0487-5
