Abstract
This chapter reviews recent trends in the conceptualizations and formats of tests used to determine whether non-native speakers of English have sufficient proficiency in English to study at English-medium universities in English-dominant countries. The review focuses on published research informing a new version of the Test of English as a Foreign Language (TOEFL), but a range of similar tests internationally is also considered. Prominent among the issues guiding research and development on these tests are the following: construct validation, particularly refinements in the description of testing purposes, evaluations of the discourse produced in the contexts of testing, and surveys of relevant domains and score users; consistency, including fairness in opportunities for test performance across differing populations, reliability through field-testing and equating of test forms, and sampling of multiple, comparable performances from examinees; and innovations in the media of test administration, including various forms of computer and other technological adaptations.
© 2007 Springer Science+Business Media, LLC.
Cite this chapter
Cumming, A. (2007). New Directions in Testing English Language Proficiency for University Entrance. In: Cummins, J., Davison, C. (eds) International Handbook of English Language Teaching. Springer International Handbooks of Education, vol 15. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-46301-8_34
DOI: https://doi.org/10.1007/978-0-387-46301-8_34
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-46300-1
Online ISBN: 978-0-387-46301-8