An Observed-Score Equating Framework

  • Conference paper
Looking Back

Part of the book series: Lecture Notes in Statistics (LNSP, volume 202)

Abstract

Paul Holland has made remarkable contributions to equating theory and practice and has influenced the work of many researchers and psychometricians. This paper argues that the methodology introduced by Holland and Thayer (1989) and von Davier, Holland, and Thayer (2004b), along with the kernel method of test equating, is more than a continuization method for test score distributions: it introduced a powerful equating framework (see Note 1) for all observed-score equating (OSE) methods. This framework has already proven useful for research purposes beyond Gaussian kernel equating (KE). Referred to in this paper as the OSE framework, it is one example of the application of Holland’s work to the practice of equating.

Notes

  1. “Conceptual frameworks (theoretical frameworks) are a type of intermediate theory that has the potential to connect to all aspects of inquiry (e.g. problem definition, purpose, literature review, methodology, data collection and analysis). Conceptual frameworks act like maps that give coherence to empirical inquiry” (Conceptual framework, 2010, para. 2).

  2. The appendix contains a summary of the three generations of the OSE framework. The first column shows the steps employed within the framework, while additions made in 2004 and in 2009 are included in the next two columns. Examination of the appendix reveals, for example, that Step 2 of the current approach was not added until 2004 (hence the odd numbering in column one of 1, 3, 4, and 5). The table also reveals that another major shift between the 1989 and 2004 versions was the addition of the SEED to the framework.

References

  • Chen, H., & Holland, P. W. (2008, March). True score equating under the KE framework, the associated log-linear model and its relation with Levine equating. Paper presented at the annual meeting of the National Council on Measurement in Education, New York, NY.

  • Chen, H., Yan, D., Hemat, L., Han, N., & von Davier, A. A. (2007). LOGLIN/KE user guide (Version 3.0) [Computer software manual]. Princeton, NJ: ETS.

  • Cid, J., & von Davier, A. A. (2009, April). Examining potential boundary bias effects in kernel smoothing on equating. Paper presented at the annual meeting of the National Council on Measurement in Education, San Diego, CA.

  • Conceptual framework. (2010). In Wikipedia. Retrieved from http://en.wikipedia.org/wiki/Conceptual_framework.

  • Cui, Z., & Kolen, M. (2007, April). An introduction of two new smoothing methods in equating: The cubic b-spline presmoothing method and the direct presmoothing method. Paper presented at the annual meeting of the National Council on Measurement in Education, Chicago.

  • Dorans, N. J., & Holland, P. W. (2000). Population invariance and the equatability of tests: Basic theory and the linear case. Journal of Educational Measurement, 37, 281–306.

  • Duong, M., & von Davier, A. A. (2008, March). Kernel equating with observed mixture distributions in a single-group design. Paper presented at the annual meeting of the National Council on Measurement in Education, New York.

  • ETS. (2006). KE-software (Version 1) [Computer software]. Princeton, NJ: Author.

  • ETS. (2007). KE-software (Version 2) [Computer software]. Princeton, NJ: Author.

  • ETS. (2010). KE-software (Version 3) [Computer software]. Princeton, NJ: Author.

  • Haberman, S. J. (2008). Continuous exponential families: An equating tool (ETS Research Rep. No. RR-08-05). Princeton, NJ: ETS.

  • Holland, P. W., & Hoskens, M. (2003). Classical test theory as a first-order item response theory: Application to true-score prediction from a possibly nonparallel test. Psychometrika, 68, 123–149.

  • Holland, P. W., King, B. F., & Thayer, D. T. (1989). The standard error of equating for the kernel method of equating score distributions (ETS Research Rep. No. RR-89-06). Princeton, NJ: ETS.

  • Holland, P. W., & Moses, T. P. (2007). Kernel and traditional equipercentile equating with degrees of presmoothing (ETS Research Rep. No. RR-07-15). Princeton, NJ: ETS.

  • Holland, P. W., Sinharay, S., von Davier, A. A., & Han, N. (2008). An approach to evaluating the missing data assumptions of the chain and post-stratification equating methods for the NEAT design. Journal of Educational Measurement, 45, 17–43.

  • Holland, P. W., & Thayer, D. T. (1987). Notes on the use of log-linear models for fitting discrete probability distributions (ETS Research Rep. No. RR-87-31). Princeton, NJ: ETS.

  • Holland, P. W., & Thayer, D. T. (1989). The kernel method of equating score distributions (ETS Research Rep. No. RR-89-07). Princeton, NJ: ETS.

  • Holland, P. W., & Thayer, D. T. (2000). Univariate and bivariate log-linear models for discrete test score distributions. Journal of Educational and Behavioral Statistics, 25, 133–183.

  • Holland, P. W., von Davier, A. A., Sinharay, S., & Han, N. (2006). Testing the untestable assumptions of the chain and poststratification equating methods for the NEAT design (ETS Research Rep. No. RR-06-17). Princeton, NJ: ETS.

  • Jiang, Y., von Davier, A. A., & Chen, H. (2011). Evaluating equating results: Percent relative error for chained kernel equating. Manuscript submitted for publication.

  • Kendall, M. G., & Stuart, A. (1977). The advanced theory of statistics (4th ed.). New York, NY: Macmillan.

  • Kolen, M. J., & Brennan, R. L. (2004). Test equating, scaling, and linking (2nd ed.). New York, NY: Springer.

  • Lee, Y.-H., & von Davier, A. A. (2008). Comparing alternative kernels for the kernel method of test equating: Gaussian, logistic and uniform kernels (ETS Research Rep. No. RR-08-12). Princeton, NJ: ETS.

  • Liang, T., & von Davier, A. A. (2009, July). Alternative methods to determine the optimal bandwidth for the kernel equating function. Paper presented at the international meeting of the Psychometric Society, Cambridge, UK.

  • Livingston, S. (1993a). Small-sample equatings with log-linear smoothing. Journal of Educational Measurement, 30, 23–39.

  • Livingston, S. (1993b). An empirical tryout of kernel equating (ETS Research Rep. No. RR-93-33). Princeton, NJ: ETS.

  • Mekhael, M., & von Davier, A. A. (2007, April). The effects of log-linear models on kernel equating results. Paper presented at the meeting of the National Council on Measurement in Education, Chicago.

  • Moses, T. (2008). An evaluation of statistical strategies for making equating decisions (ETS Research Rep. No. RR-08-60). Princeton, NJ: ETS.

  • Moses, T., Deng, W., & Zhang, Y. (2010). The use of two anchors in nonequivalent groups with anchor test (NEAT) equating (ETS Research Rep. No. RR-10-23). Princeton, NJ: ETS.

  • Moses, T. P., & Holland, P. W. (2008). Notes on the general framework for observed score equating (ETS Research Rep. No. RR-08-59). Princeton, NJ: ETS.

  • Moses, T. P., & von Davier, A. A. (2006). A SAS macro for log-linear smoothing: Applications and implications (ETS Research Rep. No. RR-06-05). Princeton, NJ: ETS.

  • Petersen, N. S., Marco, G. L., & Stewart, E. E. (1982). A test of the adequacy of linear score equating models. In P. W. Holland & D. B. Rubin (Eds.), Test equating (pp. 71–135). New York, NY: Academic Press.

  • Puhan, G., von Davier, A. A., & Gupta, S. (2008). Impossible scores resulting in zero frequencies in the anchor test: Impact on smoothing and equating (ETS Research Rep. No. RR-08-10). Princeton, NJ: ETS.

  • Rao, C. R. (1973). Linear statistical inference and its applications (2nd ed.). New York, NY: Wiley.

  • Rijmen, F., Qu, Y., & von Davier, A. A. (2008, March). Hypothesis testing of equating differences in the KE framework. Paper presented at the annual meeting of the National Council on Measurement in Education, New York.

  • Shen, X., & von Davier, A. A. (2007, April). An exploration of constructing criteria equating for simulation studies comparing kernel and IRT equating methods. Paper presented at the annual meeting of the American Educational Research Association, Chicago.

  • von Davier, A. A. (2011). A statistical perspective on equating test scores. In A. A. von Davier (Ed.), Statistical models for test equating, scaling, and linking (pp. 1–17). New York, NY: Springer.

  • von Davier, A. A., Fournier-Zajac, S., & Holland, P. W. (2007). An equipercentile version of the Levine linear observed-score equating function using the methods of kernel equating (ETS Research Rep. No. RR-07-14). Princeton, NJ: ETS.

  • von Davier, A. A., Holland, P. W., Livingston, S. A., Casabianca, J., Grant, M. C., & Martin, K. (2006). An evaluation of the kernel equating method: A special study with pseudotests constructed from real test data (ETS Research Rep. No. RR-06-02). Princeton, NJ: ETS.

  • von Davier, A. A., Holland, P. W., & Thayer, D. T. (2004a). The chain and poststratification methods for observed-score equating: Their relationship to population invariance. Journal of Educational Measurement, 41(1), 15–32.

  • von Davier, A. A., Holland, P. W., & Thayer, D. T. (2004b). The kernel method of test equating. New York, NY: Springer.

  • von Davier, A. A., & Kong, N. (2005). A unified approach to linear equating for the nonequivalent groups design. Journal of Educational and Behavioral Statistics, 30(3), 313–342.

  • Wang, T. (2011). An alternative continuization method: The continuized log-linear method. In A. A. von Davier (Ed.), Statistical models for test equating, scaling, and linking (pp. 141–158). New York, NY: Springer.

Acknowledgments

My thanks go to Dan Eignor and Skip Livingston for their valuable feedback and suggestions on previous versions of the manuscript. I also thank Kim Fryer for her help with the editorial work. Any opinions expressed here are those of the author and not necessarily of Educational Testing Service.

Author information

Corresponding author

Correspondence to Alina A. von Davier.

Copyright information

© 2011 Springer Science+Business Media, LLC

About this paper

Cite this paper

von Davier, A. A. (2011). An Observed-Score Equating Framework. In: Dorans, N., Sinharay, S. (eds) Looking Back. Lecture Notes in Statistics, vol 202. Springer, New York, NY. https://doi.org/10.1007/978-1-4419-9389-2_12
