Knowledge and Information Systems, Volume 51, Issue 1, pp 235–257

Supervised item response models for informative prediction

  • Tsuyoshi Idé
  • Amit Dhurandhar
Regular Paper


Supporting human decision-making is a major goal of data mining. The more critical the decision, the more interpretability is required of the predictive model. This paper proposes a new framework for building a fully interpretable predictive model for questionnaire data while maintaining reasonable prediction accuracy with regard to the final outcome. Such a model has applications in project risk assessment, healthcare, social studies, and, presumably, any real-world application that relies on questionnaire data for informative and accurate prediction. Our framework is inspired by models in item response theory (IRT), which were originally developed in psychometrics with applications to standardized academic tests. We extend these models, which are essentially unsupervised, to the supervised setting. For model estimation, we introduce a new iterative algorithm that combines Gauss–Hermite quadrature with an expectation–maximization algorithm. The learned probabilistic model is linked to the metric learning framework for informative and accurate prediction. The model is validated on three real-world data sets: two are from information technology project failure prediction, and the third is an international social survey about people’s happiness. To the best of our knowledge, this is the first work that leverages the IRT framework to provide informative and accurate prediction on ordinal questionnaire data.
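
As a rough illustration of the quadrature step mentioned above, the following sketch approximates an expectation over a standard normal latent trait with Gauss–Hermite quadrature, which is the kind of integral an expectation–maximization procedure for IRT models typically needs. It is not the paper's algorithm: the binary two-parameter logistic item, the parameter values `a` and `b`, and the function names are all assumptions made for illustration, whereas the paper treats ordinal responses in a supervised setting.

```python
import numpy as np

def gauss_hermite_expectation(f, n_nodes=20):
    """Approximate E[f(theta)] for theta ~ N(0, 1) by Gauss-Hermite quadrature."""
    # hermgauss returns nodes and weights for the weight function exp(-x^2).
    x, w = np.polynomial.hermite.hermgauss(n_nodes)
    # The substitution theta = sqrt(2) * x turns exp(-x^2) into the N(0, 1)
    # density up to the factor 1 / sqrt(pi).
    return np.sum(w * f(np.sqrt(2.0) * x)) / np.sqrt(np.pi)

def logistic(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical item parameters (assumed for illustration only):
# a = discrimination, b = difficulty of a binary two-parameter logistic item.
a, b = 1.5, 0.3

# Marginal probability of a positive response, integrating out the latent trait.
marginal = gauss_hermite_expectation(lambda t: logistic(a * (t - b)))

# Sanity check: the second moment of N(0, 1) is 1.
second_moment = gauss_hermite_expectation(lambda t: t ** 2)
```

In an EM-style procedure, the same nodes and weights would usually be reused in every iteration to approximate the posterior expectations over the latent trait, so the integral reduces to a fixed weighted sum.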


Questionnaire data, Item response theory, Metric learning



Copyright information

© Springer-Verlag London 2016

Authors and Affiliations

  1. IBM Research, T. J. Watson Research Center, Yorktown Heights, USA
