
Item response theory as a feature selection and interpretation tool in the context of machine learning


Optimizing the number and utility of features used in a classification analysis has been the subject of many research studies. Most current approaches use the end-classifications as part of the feature reduction process, which introduces circularity into the methodology. The approach demonstrated in the present research uses item response theory (IRT) to select features independently of the end-classification results, avoiding the biased accuracies that this circularity engenders. Dichotomous and polytomous IRT models were used to analyze 30 histological breast cancer features from 569 patients in the Wisconsin Diagnostic Breast Cancer data set. Based on their characteristics, three features were selected for use in a machine learning classifier. For comparison, two machine learning–based feature selection protocols were run, recursive feature elimination (RFE) and ridge regression, and the three features selected from these analyses were also used in the subsequent machine learning classifier. Classification results demonstrated that all three selection processes performed comparably. The non-biased nature of the IRT protocol, together with the information it provides about why specific features are useful for classification, sheds light on which attributes make features suitable for use in a machine learning context.
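To illustrate how item characteristics can rank features without reference to class labels, the following is a minimal sketch using the standard two-parameter logistic (2PL) model for dichotomous items. The choice of the 2PL model, the parameter values, and the feature names are illustrative assumptions and are not taken from the study; in practice the discrimination and difficulty parameters would be estimated from the data by IRT software.

```python
import math


def irf_2pl(theta, a, b):
    """Probability of a positive (1) response under the 2PL model.

    theta: latent trait level, a: discrimination, b: difficulty (location).
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Fisher information contributed by a 2PL item at trait level theta."""
    p = irf_2pl(theta, a, b)
    return a * a * p * (1.0 - p)


# Hypothetical (discrimination, difficulty) pairs for three features.
# These numbers are made up for illustration only.
items = {
    "feature_A": (2.0, 0.0),
    "feature_B": (0.5, 0.0),
    "feature_C": (1.5, 1.0),
}

# Evaluate each item's information on a coarse grid of trait levels
# and rank the items by their peak information.
thetas = [-2.0, -1.0, 0.0, 1.0, 2.0]
peak_info = {
    name: max(item_information(t, a, b) for t in thetas)
    for name, (a, b) in items.items()
}
ranked = sorted(peak_info, key=peak_info.get, reverse=True)
print(ranked)
```

In this sketch, items with higher discrimination yield more information near their difficulty parameter, so they rank higher. Note that the ranking uses only the item parameters and never the end-classification labels, which is the property the abstract emphasizes.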






Author information

Corresponding author

Correspondence to Adrienne S. Kline.

Ethics declarations

Conflict of Interest

The authors declare that they have no conflict of interest.


Cite this article

Kline, A.S., Kline, T.J.B. & Lee, J. Item response theory as a feature selection and interpretation tool in the context of machine learning. Med Biol Eng Comput 59, 471–482 (2021).



  • Item response theory
  • Machine learning
  • Feature selection
  • Breast cancer