Volume 84, Issue 1, pp 105–123

Metric Transformations and the Filtered Monotonic Polynomial Item Response Model

  • Leah M. Feuerstahler


The \(\theta \) metric in item response theory is often not the most useful metric for score reporting or interpretation. In this paper, I demonstrate that the filtered monotonic polynomial (FMP) item response model, a recently proposed nonparametric item response model (Liang & Browne in J Educ Behav Stat 40:5–34, 2015), can be used to specify item response models on metrics other than the \(\theta \) metric. Specifically, I demonstrate that any item response function (IRF) defined within the FMP framework can be re-expressed as another FMP IRF by taking monotonic transformations of the latent trait. I derive the item parameter transformations that correspond to both linear and nonlinear transformations of the latent trait metric. These item parameter transformations can be used to define an item response model based on any monotonic transformation of the \(\theta \) metric, so long as the metric transformation is approximated by a monotonic polynomial. I demonstrate this result by defining an item response model directly on the approximate true score metric and discuss the implications of metric transformations for applied testing situations.


parametric item response theory; nonparametric item response theory; model identification; metric transformations; linking

Supplementary material

11336_2018_9642_MOESM1_ESM.pdf (172 kb)
Supplementary material 1 (pdf 171 KB)


  1. Akaike, H. (1973). Information theory and an extension of the maximum likelihood principle. In B. N. Petrov & F. Csaki (Eds.), Second international symposium on information theory (pp. 267–281). Budapest: Akademiai Kiado.
  2. Bock, R. D., Thissen, D., & Zimowski, M. F. (1997). IRT estimation of domain scores. Journal of Educational Measurement, 34, 197–211.
  3. Butcher, J. N., Williams, C. L., Graham, J. R., Archer, R. P., Tellegen, A., Ben-Porath, Y. S., et al. (1992). MMPI-A (Minnesota Multiphasic Personality Inventory-Adolescent): Manual for administration, scoring, and interpretation. Minneapolis: University of Minnesota Press.
  4. Elphinstone, C. D. (1983). A target distribution model for nonparametric density estimation. Communications in Statistics - Theory and Methods, 12, 161–198.
  5. Elphinstone, C. D. (1985). A method of distribution and density estimation (Unpublished dissertation). Pretoria: University of South Africa.
  6. Embretson, S. E. (1996). The new rules of measurement. Psychological Assessment, 8, 341–349.
  7. Falk, C. F., & Cai, L. (2016a). Maximum marginal likelihood estimation of a monotonic polynomial generalized partial credit model with applications to multiple group analysis. Psychometrika, 81, 434–460.
  8. Falk, C. F., & Cai, L. (2016b). Semiparametric item response functions in the context of guessing. Journal of Educational Measurement, 53, 229–247.
  9. Feuerstahler, L. M. (2018). flexmet: Flexible latent trait metrics using the filtered monotonic polynomial item response model. R package version
  10. Haebara, T. (1980). Equating logistic ability scales by a weighted least squares method. Japanese Psychological Research, 22, 144–149.
  11. Hawkins, D. M. (1994). Fitting monotonic polynomials to data. Computational Statistics, 9, 233–247.
  12. Kolen, M. J. (1988). Defining score scales in relation to measurement error. Journal of Educational Measurement, 25, 97–110.
  13. Kolen, M. J., & Brennan, R. L. (2014). Test equating, scaling, and linking (3rd ed.). New York: Springer.
  14. Kyngdon, A. (2008). The Rasch model from the perspective of the representational theory of measurement. Theory & Psychology, 18, 89–109.
  15. Liang, L. (2007). A semi-parametric approach to estimating item response functions (Unpublished doctoral dissertation). Columbus, OH: The Ohio State University.
  16. Liang, L., & Browne, M. W. (2015). A quasi-parametric method for fitting flexible item response functions. Journal of Educational and Behavioral Statistics, 40, 5–34.
  17. Lord, F. M. (1974). The relative efficiency of two tests as a function of ability level. Psychometrika, 39, 351–358.
  18. Lord, F. M. (1975). The ‘ability’ scale in item characteristic curve theory. Psychometrika, 40, 205–217.
  19. Lord, F. M. (1980). Applications of item response theory to practical testing problems. Hillsdale, NJ: Erlbaum.
  20. Mokken, R. J. (1971). A theory and procedure of scale analysis with applications in political research. The Hague: Mouton.
  21. Murray, K., Müller, S., & Turlach, B. A. (2013). Revisiting fitting monotone polynomials to data. Computational Statistics, 28, 1989–2005.
  22. Murray, K., Müller, S., & Turlach, B. A. (2016). Fast and flexible methods for monotone polynomial fitting. Journal of Statistical Computation and Simulation, 86, 2946–2966.
  23. Perline, R., Wright, B. D., & Wainer, H. (1979). The Rasch model as additive conjoint measurement. Applied Psychological Measurement, 3, 237–255.
  24. Ramsay, J. O. (1991). Kernel smoothing approaches to nonparametric item characteristic curve estimation. Psychometrika, 56, 611–630.
  25. Ramsay, J. O., & Wiberg, M. (2017). A strategy for replacing sum scoring. Journal of Educational and Behavioral Statistics, 42, 282–307.
  26. Ramsay, J. O., & Winsberg, S. (1991). Maximum marginal likelihood estimation for semiparametric item analysis. Psychometrika, 56, 365–379.
  27. R Core Team. (2017). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. Retrieved from
  28. Reise, S. P., & Waller, N. G. (2003). How many IRT parameters does it take to model psychopathology items? Psychological Methods, 8, 164–184.
  29. Schulz, E. M., & Nicewander, W. A. (1997). Grade equivalent and IRT representations of growth. Journal of Educational Measurement, 34, 315–331.
  30. Stevens, S. S. (1946). On the theory of scales of measurement. Science, 103, 677–680.
  31. Stocking, M. L. (1996). An alternative method for scoring adaptive tests. Journal of Educational and Behavioral Statistics, 21, 365–389.
  32. Stocking, M. L., & Lord, F. M. (1983). Developing a common metric in item response theory. Applied Psychological Measurement, 7, 201–210.
  33. Tadikamalla, P. R. (1980). On simulating non-normal distributions. Psychometrika, 45, 273–279.
  34. Turlach, B., & Murray, K. (2016). MonoPoly: Functions to fit monotone polynomials. R package version 0.3-8.
  35. van der Linden, W. J., & Barrett, M. D. (2016). Linking item response model parameters. Psychometrika, 81, 650–673.
  36. Weiss, D. J. (1982). Improving measurement quality and efficiency with adaptive testing. Applied Psychological Measurement, 6, 473–492.
  37. Yi, Q., Wang, T., & Ban, J.-C. (2001). Effects of scale transformation and test-termination rule on the precision of ability estimation in computerized adaptive testing. Journal of Educational Measurement, 38, 267–292.
  38. Zwick, R. (1992). Statistical and psychometric issues in the measurement of educational achievement trends: Examples from the National Assessment of Educational Progress. Journal of Educational Statistics, 17, 205–218.

Copyright information

© The Psychometric Society 2018

Authors and Affiliations

  1. Fordham University, New York City, USA
