Mediating Assessment Innovation: Why Stakeholder Perspectives Matter

  • Martin East
Part of the Educational Linguistics book series (EDUL, volume 26)


As part of recent curriculum and assessment reforms in New Zealand, the assessment of foreign language (FL) students’ spoken communicative proficiency has undergone a major shift. A summative teacher-led interview test has been replaced by the collection of learner-focused peer-to-peer interactions that take place in the context of learning programmes throughout the year. The innovation brings with it significant changes to practice, and initially provoked strong reactions from teachers. This chapter sets the scene for a 2-year study focused on stakeholder views about the new assessment in comparison with the former assessment. The chapter interweaves the New Zealand case with global arguments about teaching, learning and assessment in order to situate the case in question within ongoing international debates. The chapter outlines the essence of the reforms. It articulates the centrality of assessment to effective teaching and learning and describes the evidence that assessment developers would normatively draw on to determine validity. A broader approach to validation is proposed. The chapter concludes with an overview of the study in question.


Keywords: Foreign Language · Target Language · Test Taker · Assessment Task · Validity Argument


Copyright information

© Springer Science+Business Media Singapore 2016

Authors and Affiliations

  • Martin East
    Faculty of Education and Social Work, The University of Auckland, Auckland, New Zealand
