New Directions in Testing English Language Proficiency for University Entrance

Chapter in International Handbook of English Language Teaching

Part of the book series: Springer International Handbooks of Education (SIHE, volume 15)

Abstract

This chapter reviews recent trends in the conceptualizations and formats of tests used to determine whether non-native speakers of English have sufficient proficiency in English to study at English-medium universities in English-dominant countries. The review focuses on published research informing a new version of the Test of English as a Foreign Language (TOEFL), although a range of similar tests used internationally is also considered. Prominent among the issues guiding research and development on these tests are the following: construct validation, particularly refinements in the description of testing purposes, evaluations of the discourse produced in the contexts of testing, and surveys of relevant domains and score users; consistency, including fairness in opportunities for test performance across differing populations, reliability through field-testing and equating of test forms, and sampling of multiple, comparable performances from examinees; and innovations in the media of test administration, including various forms of computer-based and other technological adaptations.

Copyright information

© 2007 Springer Science+Business Media, LLC.

About this chapter

Cite this chapter

Cumming, A. (2007). New Directions in Testing English Language Proficiency for University Entrance. In: Cummins, J., Davison, C. (eds) International Handbook of English Language Teaching. Springer International Handbooks of Education, vol 15. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-46301-8_34
