Abstract
High-stakes standardized English proficiency tests are widely used to measure the language proficiency of candidates who wish to study, work in, or immigrate to environments where English is the dominant language. The purpose of this chapter is to shed light on the apparent disconnect among equity, fairness, and justice in standardized language proficiency tests, and the integrity issues that can arise as a result. We outline some of these disconnects, both pre- and post-COVID-19, and offer potential solutions by drawing on the literature, on our first-hand experience as test examiners and designers, and on our experience as academics in Canadian institutions. We hope that by unveiling issues rooted in socio-cultural and socio-economic injustices, our potential solutions will encourage stakeholders to work in tandem towards more equitable and inclusive language assessment.
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this chapter
Sabbaghan, S., & Fazel, I. (2023). None of the above: Integrity concerns of standardized English proficiency tests. In S. E. Eaton, J. J. Carmichael, & H. Pethrick (Eds.), Fake degrees and fraudulent credentials in higher education (Ethics and Integrity in Educational Contexts, Vol. 5). Springer, Cham. https://doi.org/10.1007/978-3-031-21796-8_8
Print ISBN: 978-3-031-21795-1
Online ISBN: 978-3-031-21796-8