A Ubiquitous Agent for Unrestricted Vocabulary Learning in Noisy Digital Environments

  • David Wible
  • Chin-Hwa Kuo
  • Meng-Chang Chen
  • Nai-Lung Tsao
  • Chong-Fu Hong
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4053)


One of the most persistently difficult aspects of vocabulary for foreign language learners is collocation. This paper describes a browser-based agent that helps learners acquire collocations in context during their unrestricted Web browsing. The agent overcomes the limitations imposed by learner models in traditional intelligent tutoring systems (ITS). Its capacity to function in noisy, unscripted contexts derives from a well-understood theory of lexical knowledge that attributes a word’s identity to its contextual features. Collocations constitute a central feature type, and we extract these features computationally from a 20-million-word portion of the British National Corpus (BNC). We are able to detect and highlight these collocations in real time for learners in the noisy Web environments they freely browse. Our learner model, derived by semi-automatic techniques from our 3-million-word corpus of learner English, maps detected collocations onto the corresponding collocation errors produced by this learner population, alerting learners to the non-substitutability of words within the target collocations. A notebook offers a push function for individualized, repeated exposure to examples of these collocations in context.
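The extraction and highlighting steps described in the abstract can be illustrated with a minimal sketch. The abstract does not name the association measure or the matching procedure, so this example assumes pointwise mutual information (PMI) over adjacent word pairs and simple bracket highlighting. The function names `pmi_scores` and `highlight`, the `min_count` threshold, and the toy corpus are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def pmi_scores(tokens, min_count=2):
    """Score adjacent word pairs by pointwise mutual information.

    Assumption: PMI over adjacent bigrams stands in for whatever
    measure the authors actually used on the BNC.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = max(len(tokens) - 1, 1)
    scores = {}
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue  # rare pairs give unreliable PMI estimates
        p_pair = count / n_bi
        p_w1 = unigrams[w1] / n_uni
        p_w2 = unigrams[w2] / n_uni
        scores[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))
    return scores


def highlight(text, collocations):
    """Wrap each known adjacent collocation in square brackets."""
    words = text.split()
    out = []
    i = 0
    while i < len(words):
        pair = None
        if i + 1 < len(words):
            pair = (words[i].lower(), words[i + 1].lower())
        if pair in collocations:
            out.append("[" + words[i] + " " + words[i + 1] + "]")
            i += 2
        else:
            out.append(words[i])
            i += 1
    return " ".join(out)


# Toy corpus (hypothetical); a real system would use a large reference corpus.
corpus = "strong tea and strong tea but powerful computer and weak tea".split()
collocations = pmi_scores(corpus)
print(highlight("She likes strong tea daily", collocations))
```

In this toy setting only the pair seen at least twice is kept as a candidate collocation. A practical system would also need tokenization, part-of-speech filtering, and non-adjacent pairs, none of which this sketch attempts.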


Keywords: Mutual Information, Learner Model, Target Language, Contextual Feature, British National Corpus





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • David Wible (1)
  • Chin-Hwa Kuo (2)
  • Meng-Chang Chen (3)
  • Nai-Lung Tsao (3)
  • Chong-Fu Hong (2)
  1. Department of English, Tamkang University, Taipei County, Taiwan
  2. Computer and Network Lab, Tamkang University, Taipei County, Taiwan
  3. Institute of Information Science, Academia Sinica, Nankang, Taipei, Taiwan
