Testing Equality of Functions Across Multiple Experimental Conditions for Different Ability Levels in the IRT Context: The Case of the IPRASE TLT 2016 Survey

Maturo, Fabrizio; Fortuna, Francesca; Di Battista, Tonio

doi:10.1007/s11205-018-1893-4

Published: 02 April 2018

Volume 146, pages 19–39 (2019)

Social Indicators Research

Fabrizio Maturo¹,
Francesca Fortuna² &
Tonio Di Battista²

Abstract

In the educational field, test data are commonly analyzed through item response theory (IRT) models. In this context, a key role is played by item characteristic curves (ICCs) and item information curves (IICs). In many real cases, practitioners want to understand whether some factors significantly influence the probability of answering items correctly. In the literature, this problem has been addressed by applying the standard analysis of variance model, which is based on the total scores or the proportion of correct responses. However, this method requires some strong assumptions and has limitations because it ignores information typical of IRT, such as the shapes of the ICCs and IICs, which provide insights for different ability levels. To overcome these issues, this research suggests the use of the functional analysis of variance approach and a novel functional tool in the IRT context. The main advantages of this approach are that it is distribution-free and allows us to check the degree of consistency with the hypothesis of equality among mean curves for different ability levels. Specifically, the proposed method is applied to ICCs and IICs to improve existing techniques in educational studies. A real dataset drawn from the IPRASE Trentino Language Testing Survey 2016 is considered. The final purpose of this study is to provide additional tools for scholars and practitioners in defining specific educational plans.
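To make the objects in the abstract concrete, the following is a minimal sketch of ICCs, IICs, and a permutation check of equal mean curves at each ability level. It is an illustration under stated assumptions and not the authors' procedure: the two-parameter logistic (2PL) form, the item parameters, the ability grid, and the function names `icc_2pl`, `iic_2pl`, `pointwise_f`, and `permutation_pvalues` are all hypothetical choices made here, and the pointwise permutation F-test is a generic distribution-free stand-in for the functional analysis of variance described above.

```python
import math
import random

def icc_2pl(theta, a, b):
    """ICC under an assumed 2PL model: probability of a correct answer at ability theta."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def iic_2pl(theta, a, b):
    """IIC under the same assumed 2PL model: a^2 * P * (1 - P)."""
    p = icc_2pl(theta, a, b)
    return a * a * p * (1.0 - p)

def pointwise_f(groups, j):
    """One-way ANOVA F statistic computed from the curve values at grid index j."""
    values = [[curve[j] for curve in g] for g in groups]
    n, k = sum(len(v) for v in values), len(values)
    grand = sum(sum(v) for v in values) / n
    means = [sum(v) / len(v) for v in values]
    between = sum(len(v) * (m - grand) ** 2 for v, m in zip(values, means)) / (k - 1)
    within = sum((x - m) ** 2 for v, m in zip(values, means) for x in v) / (n - k)
    if within == 0:
        return float("inf") if between > 0 else 0.0
    return between / within

def permutation_pvalues(groups, n_perm=200, seed=0):
    """Permutation p-value of the F statistic at every grid point (no normality assumed)."""
    rng = random.Random(seed)
    curves = [c for g in groups for c in g]
    sizes = [len(g) for g in groups]
    grid_len = len(curves[0])
    observed = [pointwise_f(groups, j) for j in range(grid_len)]
    exceed = [0] * grid_len
    for _ in range(n_perm):
        rng.shuffle(curves)  # reassign curves to groups at random
        shuffled, start = [], 0
        for s in sizes:
            shuffled.append(curves[start:start + s])
            start += s
        for j in range(grid_len):
            if pointwise_f(shuffled, j) >= observed[j]:
                exceed[j] += 1
    return [(e + 1) / (n_perm + 1) for e in exceed]

# Hypothetical ability grid and (discrimination, difficulty) pairs for two item groups.
grid = [-3.0 + 0.5 * i for i in range(13)]
params_a = [(1.0, -1.0), (1.2, -0.8), (0.9, -1.1), (1.1, -0.9)]
params_b = [(1.0, 0.8), (1.3, 1.0), (0.8, 1.2), (1.1, 0.9)]
group_a = [[icc_2pl(t, a, b) for t in grid] for a, b in params_a]
group_b = [[icc_2pl(t, a, b) for t in grid] for a, b in params_b]

pvals = permutation_pvalues([group_a, group_b])
for t, p in zip(grid, pvals):
    print(f"theta={t:+.1f}  p-value={p:.3f}")
```

Replacing `icc_2pl` with `iic_2pl` when building the groups gives the same check on information curves. Because the p-values are computed separately at each grid point, they indicate where along the ability scale the group mean curves differ, but they are not adjusted for multiple comparisons.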

Author information

Authors and Affiliations

Department of Management and Business Administration, “G. d’Annunzio” University, Pescara, Italy
Fabrizio Maturo
DISFPEQ, “G. d’Annunzio” University, Pescara, Italy
Francesca Fortuna & Tonio Di Battista

Corresponding author

Correspondence to Fabrizio Maturo.

About this article

Cite this article

Maturo, F., Fortuna, F. & Di Battista, T. Testing Equality of Functions Across Multiple Experimental Conditions for Different Ability Levels in the IRT Context: The Case of the IPRASE TLT 2016 Survey. Soc Indic Res 146, 19–39 (2019). https://doi.org/10.1007/s11205-018-1893-4

Accepted: 28 March 2018
Published: 02 April 2018
Issue Date: November 2019
DOI: https://doi.org/10.1007/s11205-018-1893-4
