Abstract
We begin by examining the history of language testing and assessment as parallel to the development of large-scale, high-stakes language proficiency tests (e.g., TOEFL) used primarily for admission into institutions of higher learning. We then discuss core concepts in the field and provide an overview of the most commonly used research methods. Lastly, we address a number of challenges and concerns arising from tensions between those who see the growing emphasis on testing as a way to ensure fairness and accountability and those who believe it results in bias and inequality. Consequential validity, assessment literacy, and world Englishes/English as a lingua franca are discussed in relation to language tests and assessments as used for decision-making purposes in various domains.
Copyright information
© 2018 The Author(s)
About this chapter
Cite this chapter
Ginther, A., McIntosh, K. (2018). Language Testing and Assessment. In: Phakiti, A., De Costa, P., Plonsky, L., Starfield, S. (eds) The Palgrave Handbook of Applied Linguistics Research Methodology. Palgrave Macmillan, London. https://doi.org/10.1057/978-1-137-59900-1_39
DOI: https://doi.org/10.1057/978-1-137-59900-1_39
Publisher Name: Palgrave Macmillan, London
Print ISBN: 978-1-137-59899-8
Online ISBN: 978-1-137-59900-1
eBook Packages: Social Sciences, Social Sciences (R0)