Abstract

We begin by examining the history of language testing and assessment in parallel with the development of large-scale, high-stakes language proficiency tests (e.g., TOEFL) used primarily for admission into institutions of higher learning. We then discuss core concepts in the field and provide an overview of the most commonly used research methods. Lastly, we address a number of challenges and concerns arising from tensions between those who see the growing emphasis on testing as a way to ensure fairness and accountability and those who believe it results in bias and inequality. Consequential validity, assessment literacy, and world Englishes/English as a lingua franca are discussed in relation to language tests and assessments used for decision-making purposes in various domains.

Author information

Corresponding author

Correspondence to April Ginther.

Copyright information

© 2018 The Author(s)

About this chapter

Cite this chapter

Ginther, A., McIntosh, K. (2018). Language Testing and Assessment. In: Phakiti, A., De Costa, P., Plonsky, L., Starfield, S. (eds) The Palgrave Handbook of Applied Linguistics Research Methodology. Palgrave Macmillan, London. https://doi.org/10.1057/978-1-137-59900-1_39

  • DOI: https://doi.org/10.1057/978-1-137-59900-1_39

  • Publisher Name: Palgrave Macmillan, London

  • Print ISBN: 978-1-137-59899-8

  • Online ISBN: 978-1-137-59900-1

  • eBook Packages: Social Sciences (R0)
