Language Testing and Assessment

Ginther, April; McIntosh, Kyle

doi:10.1057/978-1-137-59900-1_39



Abstract

We begin by examining the history of language testing and assessment as it parallels the development of large-scale, high-stakes language proficiency tests (e.g., TOEFL) used primarily for admission into institutions of higher learning. We then discuss core concepts in the field and provide an overview of the most commonly used research methods. Lastly, we address challenges and concerns arising from tensions between those who see the growing emphasis on testing as a way to ensure fairness and accountability and those who believe it results in bias and inequality. Consequential validity, assessment literacy, and world Englishes/English as a lingua franca are discussed in relation to language tests and assessments used for decision-making purposes in various domains.



Author information

Authors and Affiliations

Department of English, Purdue University, West Lafayette, IN, USA
April Ginther
Department of English and Writing, University of Tampa, Tampa, FL, USA
Kyle McIntosh


Corresponding author

Correspondence to April Ginther.

Editor information

Editors and Affiliations

Sydney School of Education and Social Work, University of Sydney, Sydney, NSW, Australia
Aek Phakiti
Department of Linguistics, Germanic, Slavic, Asian and African Languages, Michigan State University, East Lansing, MI, USA
Peter De Costa
Applied Linguistics, Northern Arizona University, Flagstaff, AZ, USA
Luke Plonsky
School of Education, UNSW Sydney, Sydney, NSW, Australia
Sue Starfield


About this chapter

Cite this chapter

Ginther, A., McIntosh, K. (2018). Language Testing and Assessment. In: Phakiti, A., De Costa, P., Plonsky, L., Starfield, S. (eds) The Palgrave Handbook of Applied Linguistics Research Methodology. Palgrave Macmillan, London. https://doi.org/10.1057/978-1-137-59900-1_39


DOI: https://doi.org/10.1057/978-1-137-59900-1_39
Publisher Name: Palgrave Macmillan, London
Print ISBN: 978-1-137-59899-8
Online ISBN: 978-1-137-59900-1
eBook Packages: Social Sciences (R0)
