Abstract
In this study, researchers compared the concordance of teacher-level effectiveness ratings derived via six common generalized value-added model (VAM) approaches: (1) a student growth percentile (SGP) model, (2) a value-added linear regression model (VALRM), (3) a value-added hierarchical linear model (VAHLM), (4) a simple difference (gain) score model, (5) a rubric-based performance level (growth) model, and (6) a simple criterion (percent passing) model. The study sample included fourth- to sixth-grade teachers employed in a large, suburban school district who taught the same sets of students, at the same time, and for whom a consistent set of achievement measures and background variables was available. Findings indicate that ratings differed significantly and substantively depending upon the methodological approach used. These findings accordingly bring into question the validity of inferences based on such estimates, especially when high-stakes decisions are made about teachers based on estimates derived via different, albeit popular, methods across school districts and states.
Notes
VAMs are designed to isolate and measure teachers’ alleged contributions to student achievement on large-scale standardized achievement tests as groups of students move from one grade level to the next. VAMs are, accordingly, used to help objectively compute the differences between students’ composite test scores from year-to-year, with value-added being calculated as the deviations between predicted and actual growth (including random and systematic error). Differences in growth are to be compared to “similar” coefficients of “similar” teachers in “similar” districts at “similar” times, after which teachers are positioned into their respective and descriptive categories of effectiveness (e.g., highly effective, effective, ineffective, highly ineffective).
The main differences between VAMs and growth models are how precisely estimates are made and whether control variables are included. Unlike the typical VAM, for example, the SGP model is more simply intended to measure the growth of similarly matched students to make relativistic comparisons about student growth over time, without any additional statistical controls (e.g., for student background variables). Students are, rather, directly and deliberately measured against, or in reference to, the growth levels of their peers, which de facto controls for these other variables. Thereafter, determinations are made as to whether students increase, maintain, or decrease in growth percentile rankings as compared to their academically similar peers. Accordingly, researchers refer to both models as generalized VAMs throughout the rest of this manuscript unless distinctions between growth models and VAMs are needed.
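The relativistic comparison at the heart of the SGP approach can be illustrated with a minimal sketch. The data, peer-band width, and function name below are hypothetical; operational SGPs are estimated via quantile regression (e.g., with the R `SGP` package per Betebenner 2011), not by the simple banding shown here.

```python
# Minimal illustration of the growth-percentile idea: rank a student's
# current score against "academic peers" with similar prior scores.
def growth_percentile(prior, current, peers, band_width=5):
    """Percentile rank of `current` among peers whose prior score falls
    within `band_width` points of this student's prior score."""
    band = [c for p, c in peers if abs(p - prior) <= band_width]
    if not band:
        return None
    below = sum(1 for c in band if c < current)
    return round(100 * below / len(band))

# Hypothetical (prior, current) scale-score pairs for a peer group
peers = [(400, 410), (402, 395), (398, 420), (401, 405), (399, 415)]
print(growth_percentile(400, 412, peers))  # → 60: grew more than most similar peers
```

The student's standing is expressed entirely relative to similarly matched peers, which is how the approach avoids explicit statistical controls for background variables.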
The SGP model is also used or endorsed statewide in the states of Colorado, Hawaii, Indiana, Massachusetts, Mississippi, Nevada, New Jersey, New York, Rhode Island, Virginia, and West Virginia (Collins and Amrein-Beardsley 2014).
The exact number of students covered by the classroom aggregations differs between the analytic methods. For example, regression techniques use listwise deletion of cases if one or more of the explanatory variables are missing, whereas non-regression techniques only require the presence of two achievement scores in the calculations.
With small enrollments, averaging residual growth scores risks skewing the class aggregate measures; researchers accordingly used medians as the class growth measure.
Researchers’ review of right-hand-side correlations and model diagnostics suggested multicollinearity among the ELL, PHL, and Lunch variables, although researchers placed no burden of precision or interpretation on the estimated parameters of the individual predictor variables, also noting that the use of collinear predictors did not impact overall model performance (Johnston 1972). The outcome of the modeling approach, then, is an estimate of residual achievement expressed in terms of the original scale scores. The model generates an expected score for each student, and the difference between the actual and the expected outcome is the residual value.
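The residual computation described in this note can be sketched as follows. The scores below are hypothetical, and the fit uses only a prior-score predictor; the study's actual models included additional covariates (e.g., ELL, PHL, and lunch status).

```python
import numpy as np

# Fit a simple prior-score model by OLS and express each student's
# residual achievement as actual score minus model-expected score.
prior = np.array([400., 420., 390., 410., 430.])   # hypothetical prior-year scores
actual = np.array([415., 422., 400., 405., 445.])  # hypothetical current-year scores

X = np.column_stack([np.ones_like(prior), prior])  # intercept + prior score
beta, *_ = np.linalg.lstsq(X, actual, rcond=None)  # least-squares fit
expected = X @ beta                                # model-predicted scores
residuals = actual - expected                      # residual value per student

print(np.round(residuals, 1))
```

Because the residuals remain on the original scale-score metric, they can be aggregated (here, via class medians, per the preceding note) into a classroom-level growth measure.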
The contingency table for one grade, one subject, contains 36 cells (6 × 6). Diagonal cells compare identical methods and are therefore excluded. Off-diagonal cells are symmetric, leaving a total of 15 comparative measures per grade per subject.
Fifteen comparative measures per grade per subject, by two subjects, by three grades, for 90 comparisons in total.
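The counts in these two notes follow from simple combinatorics, as this short sketch (with illustrative model labels) confirms: the symmetric, off-diagonal cells of a 6 × 6 contingency table reduce to unordered pairs of methods.

```python
from itertools import combinations

# The six generalized VAM approaches compared in the study
methods = ["SGP", "VALRM", "VAHLM", "Gain", "Rubric", "Criterion"]

# 36 cells minus 6 diagonal cells, halved for symmetry: C(6, 2) = 15
pairs = list(combinations(methods, 2))
print(len(pairs))            # → 15 comparisons per grade per subject

# Across two subjects and three grades
total = len(pairs) * 2 * 3
print(total)                 # → 90 comparisons in total
```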
References
American Educational Research Association (AERA), American Psychological Association (APA), & National Council on Measurement in Education (NCME). (2014). Standards for educational and psychological testing. Washington, DC: American Educational Research Association.
American Statistical Association (ASA). (2014). ASA statement on using value-added models for educational assessment. Alexandria, VA. Retrieved from http://www.amstat.org/policy/pdfs/asa_vam_statement.pdf.
Amrein-Beardsley, A., & Holloway, J. (2017). Value-added models for teacher evaluation and accountability: Commonsense assumptions. Educational Policy, 1–27. https://doi.org/10.1177/0895904817719519.
Anagnostopoulos, D., Rutledge, S. A., & Jacobsen, R. (2013). The infrastructure of accountability: Data use and the transformation of American education. Cambridge: Harvard Education Press.
Arizona Department of Education (ADE) (2009). AIMS math technical report 2009. Retrieved from http://www.azed.gov/standards-development-assessment/files/2011/12/aimsmathfieldtesttechreport2009.pdf.
Arizona Department of Education (ADE) (2011). AIMS 2011 technical report. Retrieved from http://www.azed.gov/standards-development-assessment/files/2011/12/aims_tech_report_2011_final.pdf.
Ball, S. J. (2012). Politics and policy making in education: Explorations in sociology. London: Routledge.
Ballou, D., Sanders, W. L., & Wright, P. (2004). Controlling for student background in value-added assessment of teachers. Journal of Educational and Behavioral Statistics, 29(1), 37–65. https://doi.org/10.3102/10769986029001037.
Banchero, S. & Kesmodel, D. (2011). Teachers are put to the test: more states tie tenure, bonuses to new formulas for measuring test scores. The Wall Street Journal. Retrieved from http://online.wsj.com/article/SB10001424053111903895904576544523666669018.html.
Berliner, D. C. (2014). Exogenous variables and value-added assessments: a fatal flaw. Teachers College Record, 116(1).
Berliner, D. (2018). Between Scylla and Charybdis: reflections on and problems associated with the evaluation of teachers in an era of metrification. Education Policy Analysis Archives, 26(54), 1–29. https://doi.org/10.14507/epaa.26.3820.
Betebenner, D. W. (2009). A primer on student growth percentiles. Dover: The Center for Assessment. Retrieved from http://www.ksde.org/LinkClick.aspx?fileticket=XmFRiNlYbyc%3d&tabid=1646&mid=10217.
Betebenner, D.W. (2011). Package ‘SGP.’ Retrieved from https://cran.r-project.org/web/packages/SGP/SGP.pdf.
Bill & Melinda Gates Foundation. (2010). Learning about teaching: Initial findings from the measures of effective teaching project. Seattle, WA. Retrieved from http://www.gatesfoundation.org/college-ready-education/Documents/preliminary-findings-research-paper.pdf.
Bill & Melinda Gates Foundation (2013). Ensuring fair and reliable measures of effective teaching: Culminating findings from the MET project’s three-year study. Seattle, WA. Retrieved from http://www.gatesfoundation.org/press-releases/Pages/MET-Announcment.aspx.
Braun, H. I. (2005). Using student progress to evaluate teachers: a primer on value-added models. Princeton: Educational Testing Service. Retrieved from http://www.ets.org/Media/Research/pdf/PICVAS.pdf.
Braun, H., Goldschmidt, P., McCaffrey, D., & Lissitz, R. (2012). Graduate student council Division D fireside chat: VA modeling in educational research and evaluation. Paper Presented at Annual Conference of the American Educational Research Association (AERA), Vancouver, Canada.
Briggs, D. C., & Betebenner, D. (2009). Is growth in student achievement scale dependent? Paper presented at the annual meeting of the National Council for Measurement in Education (NCME), San Diego, CA.
Chin, M., & Goldhaber, D. (2015). Exploring explanations for the “weak” relationship between value added and observation-based measures of teacher performance. Cambridge, MA: Center for Education Policy Research (CEPR), Harvard University. Retrieved from http://cepr.harvard.edu/files/cepr/files/sree2015_simulation_working_paper.pdf?m=1436541369.
Close, K., Amrein-Beardsley, A., & Collins, C. (2018). State-level assessments and teacher evaluation systems after the passage of the Every Student Succeeds Act: Some steps in the right direction. Boulder, CO: National Education Policy Center (NEPC). Retrieved from http://nepc.colorado.edu/publication/stateassessment.
Cohen, J., & Cohen, P. (1983). Applied multiple regression/correlation analysis for the behavioral sciences (2nd ed.). Hillsdale: Lawrence Erlbaum Associates.
Collins, C. (2014). Houston, we have a problem: teachers find no value in the SAS Education Value-Added Assessment System (EVAAS®). Education Policy Analysis Archives. Retrieved from http://epaa.asu.edu/ojs/article/view/1594.
Collins, C., & Amrein-Beardsley, A. (2014). Putting growth and value-added models on the map: A national overview. Teachers College Record, 116(1). Retrieved from http://www.tcrecord.org/Content.asp?ContentId=17291.
Corcoran, S. P., Jennings, J. L., & Beveridge, A. A. (2011). Teacher effectiveness on high- and low-stakes tests. New York: New York University. Retrieved from https://files.nyu.edu/sc129/public/papers/corcoran_jennings_beveridge_2011_wkg_teacher_effects.pdf.
Cronbach, L. J., & Meehl, P. E. (1955). Construct validity in psychological tests. Psychological Bulletin, 52, 281–302. https://doi.org/10.1037/h0040957.
Curtis, R. (2011). District of Columbia Public Schools: Defining instructional expectations and aligning accountability and support. Washington, D.C.: The Aspen Institute. Retrieved from www.nctq.org/docs/Impact_1_15579.pdf.
Denby, D. (2012). Public defender: Diane Ravitch takes on a movement. The New Yorker. Retrieved from http://www.newyorker.com/reporting/2012/11/19/121119fa_fact_denby.
Doherty, K. M., & Jacobs, S. (2015). State of the states 2015: Evaluating teaching, leading and learning. Washington DC: National Council on Teacher Quality (NCTQ). Retrieved from http://www.nctq.org/dmsView/StateofStates2015.
Duncan, A. (2009). Teacher preparation: Reforming the uncertain profession. Retrieved from http://www.ed.gov/news/speeches/2009/10/10222009.html.
Duncan, A. (2011). Winning the future with education: Responsibility, reform and results. Testimony given to the U.S. Congress, Washington, DC. Retrieved from http://www.ed.gov/news/speeches/winning-future-education-responsibility-reform-and-results.
Every Student Succeeds Act (ESSA) of 2015, Pub. L. No. 114-95, § 129 Stat. 1802. (2016). Retrieved from https://www.gpo.gov/fdsys/pkg/BILLS-114s1177enr/pdf/BILLS-114s1177enr.pdf.
Felton, E. (2016). Southern lawmakers reconsidering role of test scores in teacher evaluations. Education Week. Retrieved from http://blogs.edweek.org/edweek/teacherbeat/2016/03/reconsidering_test_scores_in_teacher_evaluations.html.
Ferguson, G. A., & Takane, Y. (1989). Statistical analysis in psychology and education (6th ed.). New York: McGraw-Hill.
Freed, M. N., Ryan, J. M., & Hess, R. K. (1991). Handbook of statistical procedures and their computer applications to education and the behavioral sciences. New York: Macmillan Publishing Company.
Gabriel, R., & Lester, J. N. (2013). Sentinels guarding the grail: value-added measurement and the quest for education reform. Education Policy Analysis Archives, 21(9), 1–30. Retrieved from http://epaa.asu.edu/ojs/article/view/1165.
Glazerman, S. M., & Potamites, L. (2011). False performance gains: a critique of successive cohort indicators. Washington, DC: Mathematica Policy Research. Retrieved from www.mathematica-mpr.com/publications/pdfs/.../False_Perf.pdf.
Goldhaber, D., Walch, J., & Gabele, B. (2014). Does the model matter? Exploring the relationship between different student achievement-based teacher assessments. Statistics and Public Policy, 1(1), 28–39. https://doi.org/10.1080/2330443x.2013.856169.
Goldschmidt, P., Choi, K., & Beaudoin, J. B. (2012, February). Growth model comparison study: Practical implications of alternative models for evaluating school performance. Technical Issues in Large-Scale Assessment State Collaborative on Assessment and Student Standards. Council of Chief State School Officers.
Graue, M. E., Delaney, K. K., & Karch, A. S. (2013). Ecologies of education quality. Education Policy Analysis Archives, 21(8), 1–36. Retrieved from http://epaa.asu.edu/ojs/article/view/1163.
Grek, S., & Ozga, J. (2010). Re-inventing public education: the new role of knowledge in education policy making. Public Policy and Administration, 25(3), 271–288. https://doi.org/10.1177/0952076709356870.
Grossman, P., Cohen, J., Ronfeldt, M., & Brown, L. (2014). The test matters: the relationship between classroom observation scores and teacher value added on multiple types of assessment. Educational Researcher, 43(6), 293–303. https://doi.org/10.3102/0013189X14544542.
Harris, D. N. (2011). Value-added measures in education: What every educator needs to know. Cambridge: Harvard Education Press.
Harris, D. N., & Sass, T. R. (2006). Value-added models and the measurement of teacher quality. Tallahassee: Florida Department of Education. Retrieved from http://itp.wceruw.org/vam/IES_Harris_Sass_EPF_Value-added_14_Stanford.pdf.
Hill, H. C., Kapitula, L., & Umlan, K. (2011). A validity argument approach to evaluating teacher value-added scores. American Educational Research Journal, 48(3), 794–831. https://doi.org/10.3102/0002831210387916.
Ho, A. D. (2009). The dependence of growth model results on proficiency cut scores. Educational Measurement Issues and Practice, 28(4), 15–26. https://doi.org/10.1111/j.1745-3992.2009.00159.x.
Hursh, D. (2007). Assessing No Child Left Behind and the rise of neoliberal education policies. American Educational Research Journal, 44(3), 493–518. https://doi.org/10.3102/0002831207306764.
Jacob, B. A., & Lefgren, L. (2005). Principals as agents: Subjective performance measurement in education. Cambridge: National Bureau of Economic Research (NBER). Retrieved from www.nber.org/papers/w11463.
Johnson, M., Lipscomb, S., & Gill, B. (2013). Sensitivity of teacher value-added estimates to student and peer control variables. Journal of Research on Educational Effectiveness, 8(1), 60–83. https://doi.org/10.1080/19345747.2014.967898.
Johnston, J. (1972). Econometric methods (2nd ed.). New York: McGraw-Hill.
Kane, M. T. (2013). Validating the interpretations and uses of test scores. Journal of Educational Measurement, 50(1), 1–73. https://doi.org/10.1111/jedm.12000.
Kennedy, M. M. (2010). Attribution error and the quest for teacher quality. Educational Researcher, 39(8), 591–598. https://doi.org/10.3102/0013189X10390804.
Kersting, N. B., Chen, M., & Stigler, J. W. (2013). Value-added teacher estimates as part of teacher evaluations: exploring the effects of data and model specifications on the stability of teacher value-added scores. Education Policy Analysis Archives, 21(7), 1–39. Retrieved from http://epaa.asu.edu/ojs/article/view/1167.
Kimball, S. M., White, B., Milanowski, A. T., & Borman, G. (2004). Examining the relationship between teacher evaluation and student assessment results in Washoe County. Peabody Journal of Education, 79(4), 54–78. https://doi.org/10.1207/s15327930pje7904_4.
Kupermintz, H. (2003). Teacher effects and teacher effectiveness: a validity investigation of the Tennessee Value-Added Assessment System. Educational Evaluation and Policy Analysis, 25, 287–298. https://doi.org/10.3102/01623737025003287.
Kyriakides, L. (2005). Drawing from teacher effectiveness research and research into teacher interpersonal behaviour to establish a teacher evaluation system: a study on the use of student ratings to evaluate teacher behaviour. Journal of Classroom Instruction, 40(2), 44–66.
Landis, J. R., & Koch, G. G. (1977). The measurement of observer agreement for categorical data. Biometrics, 33(1), 159–174. https://doi.org/10.2307/2529310.
Lingard, B. (2011). Policy as numbers: ac/counting for educational research. The Australian Educational Researcher, 38(4), 355–382.
Lingard, B., Martino, W., & Rezai-Rashti, G. (2013). Testing regimes, accountabilities and education policy: commensurate global and national developments. Journal of Education Policy, 28(5), 539–556. https://doi.org/10.1080/02680939.2013.820042.
Lockwood, J., McCaffrey, D., Hamilton, L., Stetcher, B., Le, V. N., & Martinez, J. (2007). The sensitivity of value-added teacher effect estimates to different mathematics achievement measures. Journal of Educational Measurement, 44(1), 47–67. https://doi.org/10.1111/j.1745-3984.2007.00026.x.
Loeb, S., Soland, J., & Fox, J. (2015). Is a good teacher a good teacher for all? Comparing value-added of teachers with English learners and non-English learners. Educational Evaluation and Policy Analysis, 36(4), 457–475. https://doi.org/10.3102/0162373714527788.
Mathews, J. (2013). Hidden power of teacher awards. The Washington Post. Retrieved from http://www.washingtonpost.com/blogs/class-struggle/post/hidden-power-of-teacher-awards/2013/04/08/15b7afcc-9e66-11e2-9a79-eb5280c81c63_blog.html.
Mathis, W. (2011). Review of “Florida Formula for Student Achievement: Lessons for the Nation.” Boulder: National Education Policy Center. Retrieved from http://nepc.colorado.edu/thinktank/review-florida-formula.
McCaffrey, D. F., Lockwood, J. R., Koretz, D. M., & Hamilton, L. S. (2003). Evaluating value-added models for teacher accountability. Santa Monica: Rand Corporation.
McCaffrey, D. F., Lockwood, J. R., Koretz, D., Louis, T. A., & Hamilton, L. (2004). Models for value-added modeling of teacher effects. Journal of Educational and Behavioral Statistics, 29(1), 67–101. RAND reprint available at http://www.rand.org/pubs/reprints/2005/RAND_RP1165.pdf.
Messick, S. (1975). The standard problem: meaning and values in measurement and evaluation. American Psychologist, 30, 955–966. https://doi.org/10.1037//0003-066x.30.10.955.
Messick, S. (1980). Test validity and the ethics of assessment. American Psychologist, 35, 1012–1027. https://doi.org/10.1037//0003-066x.35.11.1012.
Messick, S. (1989). Validity. In R. L. Linn (Ed.), Educational measurement (3rd ed., pp. 13–103). New York: American Council on Education and Macmillan.
Messick, S. (1995). Validity of psychological assessment: validation of inferences from persons’ responses and performances as scientific inquiry into score meaning. American Psychologist, 50, 741–749.
Milanowski, A., Kimball, S. M., & White, B. (2004). The relationship between standards-based teacher evaluation scores and student achievement: Replication and extensions at three sites. Madison: University of Wisconsin-Madison, Center for Education Research.
Newton, X., Darling-Hammond, L., Haertel, E., & Thomas, E. (2010). Value-added modeling of teacher effectiveness: An exploration of stability across models and contexts. Educational Policy Analysis Archives, 18(23), 1–27. Retrieved from http://epaa.asu.edu/ojs/article/view/810.
Nichols, S. L., & Berliner, D. C. (2007). Collateral damage: How high-stakes testing corrupts America’s schools. Cambridge: Harvard Education Press.
Nye, B., Konstantopoulos, S., & Hedges, L. V. (2004). How large are teacher effects? Educational Evaluation and Policy Analysis, 26(3), 237–257. https://doi.org/10.3102/01623737026003237.
Ozga, J. (2016). Trust in numbers? Digital education governance and the inspection process. European Educational Research Journal, 15(1), 69–81. https://doi.org/10.1177/1474904115616629.
Papay, J. P. (2010). Different tests, different answers: The stability of teacher value-added estimates across outcome measures. American Educational Research Journal, 48(1), 163–193. https://doi.org/10.3102/0002831210362589.
Pauken, T. (2013). Texas vs. No Child Left Behind. The American Conservative. Retrieved from http://www.theamericanconservative.com/articles/texas-vs-no-child-left-behind/
Polikoff, M. S., & Porter, A. C. (2014). Instructional alignment as a measure of teaching quality. Education Evaluation and Policy Analysis, 36(4), 399–416. https://doi.org/10.3102/0162373714531851.
Porter, T. M. (1996). Trust in numbers: The pursuit of objectivity in science and public life. Princeton: Princeton University Press.
Raudenbush, S. W., & Bryk, A. S. (2002). Hierarchical linear models: Application and data analysis methods (2nd ed.). Thousand Oaks: Sage Publications, Inc.
Reynolds, C. R., Livingston, R. B., & Wilson, V. (2009). Measurement and assessment in education (2nd ed.). Upper Saddle River: Pearson Education, Inc.
Rhee, M. (2011). The evidence is clear: Test scores must accurately reflect students' learning. The Huffington Post. Retrieved from http://www.huffingtonpost.com/michelle-rhee/michelle-rhee-dc-schools_b_845286.html.
Rizvi, F., & Lingard, B. (2010). Globalizing education policy. London: Routledge.
Rothstein, J., & Mathis, W. J. (2013). Review of two culminating reports from the MET Project. Boulder: National Education Policy Center (NEPC). Retrieved from http://nepc.colorado.edu/thinktank/review-MET-final-2013.
Schafer, W. D., Lissitz, R. W., Zhu, X., Zhang, Y., Hou, X., & Li, Y. (2012). Evaluating teachers and schools using student growth models. Practical Assessment, Research & Evaluation, 17(17). Retrieved from pareonline.net/getvn.asp?v=17&n=17.
Smith, W. C. (2016). The global testing culture: Shaping education policy, perceptions, and practice. Oxford: Symposium Books.
Smith, W. C., & Kubacka, K. (2017). The emphasis of student test scores in teacher appraisal systems. Education Policy Analysis Archives, 25(86). https://doi.org/10.14507/epaa.25.2889.
Sørensen, T. B. (2016). Value-added measurement or modelling (VAM). Education International Discussion Paper. Retrieved from http://download.eiie.org/Docs/WebDepot/2016_EI_VAM_EN_final_Web.pdf.
Stevens, J. (1996). Applied multivariate statistics for the social sciences. Mahwah: Lawrence Erlbaum Associates, Inc.
Tekwe, C. D., Carter, R. L., Ma, C., Algina, J., Lucas, M. E., Roth, J., Arite, M., Fisher, T., & Resnick, M. B. (2004). An empirical comparison of statistical models for value-added assessment of school performance. Journal of Educational and Behavioral Statistics, 29(1), 11–36. https://doi.org/10.3102/10769986029001011.
Timar, T. B., & Maxwell-Jolly, J. (Eds.). (2012). Narrowing the achievement gap: Perspectives and strategies for challenging times. Cambridge: Harvard Education Press.
Verger, A., & Parcerisa, L. (2017). A difficult relationship. Accountability policies and teachers: International evidence and key premises for future research. In M. Akiba & G. LeTendre (Eds.), International handbook of teacher quality and policy (pp. 241–254). New York: Routledge.
Weisberg, D., Sexton, S., Mulhern, J., & Keeling, D. (2009). The widget effect: Our national failure to acknowledge and act on differences in teacher effectiveness. New York: The New Teacher Project (TNTP). Retrieved from http://tntp.org/assets/documents/TheWidgetEffect_2nd_ed.pdf.
Cite this article
Sloat, E., Amrein-Beardsley, A. & Holloway, J. Different teacher-level effectiveness estimates, different results: inter-model concordance across six generalized value-added models (VAMs). Educ Asse Eval Acc 30, 367–397 (2018). https://doi.org/10.1007/s11092-018-9283-7