An Observed-Score Equating Framework

von Davier, Alina A.

doi:10.1007/978-1-4419-9389-2_12

Part of the book series: Lecture Notes in Statistics (LNSP, volume 202)

Abstract

Paul Holland has made remarkable contributions to equating theory and practice and has influenced the work of many researchers and psychometricians. This paper argues that the methodology introduced by Holland and Thayer (1989) and von Davier, Holland, and Thayer (2004b), along with the kernel method of test equating, involves more than simply a continuization method for test score distributions: It has introduced a powerful equating framework (see Note 1) for all observed-score equating (OSE) methods. This framework has already proven useful for various research purposes outside of Gaussian kernel equating (KE). Referred to in this paper as the OSE framework, it is one example of the application of Holland’s work to the practice of equating.

Notes

1. “Conceptual frameworks (theoretical frameworks) are a type of intermediate theory that has the potential to connect to all aspects of inquiry (e.g. problem definition, purpose, literature review, methodology, data collection and analysis). Conceptual frameworks act like maps that give coherence to empirical inquiry” (Conceptual framework, 2010, para. 2).
2. The appendix contains a summary of the three generations of the OSE framework. The first column shows the steps employed within the framework, while additions made in 2004 and in 2009 are included in the next two columns. Examination of the appendix reveals, for example, that Step 2 of the current approach was not added until 2004 (hence the odd numbering in column one of 1, 3, 4, and 5). The table also reveals that another major shift between the 1989 and 2004 versions was the addition of the standard error of equating difference (SEED) to the framework.

References

Chen, H., & Holland, P. W. (2008, March). True score equating under the KE framework, the associated log-linear model and its relation with Levine equating. Paper presented at the annual meeting of the National Council on Measurement in Education, New York, NY.
Chen, H., Yan, D., Hemat, L., Han, N., & von Davier, A. A. (2007). LOGLIN/KE user guide (Version 3.0) [Computer software manual]. Princeton, NJ: ETS.
Cid, J., & von Davier, A. A. (2009, April). Examining potential boundary bias effects in kernel smoothing on equating. Paper presented at the annual meeting of the National Council on Measurement in Education, San Diego, CA.
Conceptual framework. (2010). In Wikipedia. Retrieved from http://en.wikipedia.org/wiki/Conceptual_framework.
Cui, Z., & Kolen, M. (2007, April). An introduction of two new smoothing methods in equating: The cubic b-spline presmoothing method and the direct presmoothing method. Paper presented at the annual meeting of the National Council on Measurement in Education, Chicago.
Dorans, N. J., & Holland, P. W. (2000). Population invariance and the equatability of tests: Basic theory and the linear case. Journal of Educational Measurement, 37, 281–306.
Duong, M., & von Davier, A. A. (2008, March). Kernel equating with observed mixture distributions in a single-group design. Paper presented at the annual meeting of the National Council on Measurement in Education, New York.
ETS. (2006). KE-software (Version 1) [Computer software]. Princeton, NJ: Author.
ETS. (2007). KE-software (Version 2) [Computer software]. Princeton, NJ: Author.
ETS. (2010). KE-software (Version 3) [Computer software]. Princeton, NJ: Author.
Haberman, S. J. (2008). Continuous exponential families: An equating tool (ETS Research Rep. No. RR-08-05). Princeton, NJ: ETS.
Holland, P. W., & Hoskens, M. (2003). Classical test theory as a first-order item response theory: Application to true-score prediction from a possibly nonparallel test. Psychometrika, 68, 123–149.
Holland, P. W., King, B. F., & Thayer, D. T. (1989). The standard error of equating for the kernel method of equating score distributions (ETS Research Rep. No. RR-89-06). Princeton, NJ: ETS.
Holland, P. W., & Moses, T. P. (2007). Kernel and traditional equipercentile equating with degrees of presmoothing (ETS Research Rep. No. RR-07-15). Princeton, NJ: ETS.
Holland, P. W., Sinharay, S., von Davier, A. A., & Han, N. (2008). An approach to evaluating the missing data assumptions of the chain and post-stratification equating methods for the NEAT design. Journal of Educational Measurement, 45, 17–43.
Holland, P. W., & Thayer, D. T. (1987). Notes on the use of log-linear models for fitting discrete probability distributions (ETS Research Rep. No. RR-87-31). Princeton, NJ: ETS.
Holland, P. W., & Thayer, D. T. (1989). The kernel method of equating score distributions (ETS Research Rep. No. RR-89-07). Princeton, NJ: ETS.
Holland, P. W., & Thayer, D. T. (2000). Univariate and bivariate log-linear models for discrete test score distributions. Journal of Educational and Behavioral Statistics, 25, 133–183.
Holland, P. W., von Davier, A. A., Sinharay, S., & Han, N. (2006). Testing the untestable assumptions of the chain and poststratification equating methods for the NEAT design (ETS Research Rep. No. RR-06-17). Princeton, NJ: ETS.
Jiang, Y., von Davier, A. A., & Chen, H. (2011). Evaluating equating results: Percent relative error for chained kernel equating. Manuscript submitted for publication.
Kendall, M. G., & Stuart, A. (1977). The advanced theory of statistics (4th ed.). New York, NY: Macmillan.
Kolen, M. J., & Brennan, R. L. (2004). Test equating, scaling, and linking (2nd ed.). New York, NY: Springer.
Lee, Y.-H., & von Davier, A. A. (2008). Comparing alternative kernels for the kernel method of test equating: Gaussian, logistic and uniform kernels (ETS Research Rep. No. RR-08-12). Princeton, NJ: ETS.
Liang, T., & von Davier, A. A. (2009, July). Alternative methods to determine the optimal bandwidth for the kernel equating function. Paper presented at the international meeting of the Psychometric Society, Cambridge, UK.
Livingston, S. (1993a). Small-sample equatings with log-linear smoothing. Journal of Educational Measurement, 30, 23–39.
Livingston, S. (1993b). An empirical tryout of kernel equating (ETS Research Rep. No. RR-93-33). Princeton, NJ: ETS.
Mekhael, M., & von Davier, A. A. (2007, April). The effects of log-linear models on kernel equating results. Paper presented at the meeting of the National Council on Measurement in Education, Chicago.
Moses, T. (2008). An evaluation of statistical strategies for making equating decisions (ETS Research Rep. No. RR-08-60). Princeton, NJ: ETS.
Moses, T., Deng, W., & Zhang, Y. (2010). The use of two anchors in nonequivalent groups with anchor test (NEAT) equating (ETS Research Rep. No. RR-10-23). Princeton, NJ: ETS.
Moses, T. P., & Holland, P. W. (2008). Notes on the general framework for observed score equating (ETS Research Rep. No. RR-08-59). Princeton, NJ: ETS.
Moses, T. P., & von Davier, A. A. (2006). A SAS macro for log-linear smoothing: Applications and implications (ETS Research Rep. No. RR-06-05). Princeton, NJ: ETS.
Petersen, N. S., Marco, G. L., & Stewart, E. E. (1982). A test of the adequacy of linear score equating models. In P. W. Holland & D. B. Rubin (Eds.), Test equating (pp. 71–135). New York, NY: Academic Press.
Puhan, G., von Davier, A. A., & Gupta, S. (2008). Impossible scores resulting in zero frequencies in the anchor test: Impact on smoothing and equating (ETS Research Rep. No. RR-08-10). Princeton, NJ: ETS.
Rao, C. R. (1973). Linear statistical inference and its applications (2nd ed.). New York, NY: Wiley.
Rijmen, F., Qu, Y., & von Davier, A. A. (2008, March). Hypothesis testing of equating differences in the KE framework. Paper presented at the annual meeting of the National Council on Measurement in Education, New York.
Shen, X., & von Davier, A. A. (2007, April). An exploration of constructing criteria equating for simulation studies comparing kernel and IRT equating methods. Paper presented at the annual meeting of the American Educational Research Association, Chicago.
von Davier, A. A. (2011). A statistical perspective on equating test scores. In A. A. von Davier (Ed.), Statistical models for test equating, scaling, and linking (pp. 1–17). New York, NY: Springer.
von Davier, A. A., Fournier-Zajac, S., & Holland, P. W. (2007). An equipercentile version of the Levine linear observed-score equating function using the methods of kernel equating (ETS Research Rep. No. RR-07-14). Princeton, NJ: ETS.
von Davier, A. A., Holland, P. W., Livingston, S. A., Casabianca, J., Grant, M. C., & Martin, K. (2006). An evaluation of the kernel equating method: A special study with pseudotests constructed from real test data (ETS Research Rep. No. RR-06-02). Princeton, NJ: ETS.
von Davier, A. A., Holland, P. W., & Thayer, D. T. (2004a). The chain and poststratification methods for observed-score equating: Their relationship to population invariance. Journal of Educational Measurement, 41(1), 15–32.
von Davier, A. A., Holland, P. W., & Thayer, D. T. (2004b). The kernel method of test equating. New York, NY: Springer.
von Davier, A. A., & Kong, N. (2005). A unified approach to linear equating for the nonequivalent groups design. Journal of Educational and Behavioral Statistics, 30(3), 313–342.
Wang, T. (2011). An alternative continuization method: The continuized log-linear method. In A. A. von Davier (Ed.), Statistical models for test equating, scaling, and linking (pp. 141–158). New York, NY: Springer.

Acknowledgments

My thanks go to Dan Eignor and Skip Livingston for their valuable feedback and suggestions on previous versions of the manuscript. I also thank Kim Fryer for her help with the editorial work. Any opinions expressed here are those of the author and not necessarily of Educational Testing Service.

Author information

Authors and Affiliations

Educational Testing Service, Rosedale Road, Princeton, NJ, 08541, USA
Alina A. von Davier

Corresponding author

Correspondence to Alina A. von Davier.

Editor information

Editors and Affiliations

Research and Development, Educational Testing Service, MS 12T, Rosedale Road, Princeton, NJ, 08541, USA
Neil J. Dorans
Research and Development, Educational Testing Service, MS 12T, Rosedale Road, Princeton, NJ, 08541, USA
Sandip Sinharay

Cite this paper

von Davier, A. A. (2011). An Observed-Score Equating Framework. In: Dorans, N., Sinharay, S. (eds) Looking Back. Lecture Notes in Statistics, vol 202. Springer, New York, NY. https://doi.org/10.1007/978-1-4419-9389-2_12

DOI: https://doi.org/10.1007/978-1-4419-9389-2_12
Published: 02 June 2011
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4419-9388-5
Online ISBN: 978-1-4419-9389-2
eBook Packages: Mathematics and Statistics (R0)