Document Ranking Applied to Second Language Learning

  • Conference paper
Advances in Information Retrieval (ECIR 2018)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10772)

Abstract

This paper addresses the needs of language learners and teachers by combining keyword-based search with language-level information in an algorithm that ranks documents by relevance to the requested topic (keywords) and by suitability for the user’s language level. We conducted several experiments on the EF-CAMDAT corpus (annotated for topic and level) and observed that the best ranking results came from averaging BM25 with linguistic information. We also found that C1-level grammar is the best indicator of level. Finally, we propose a customization that prioritizes beginner or intermediate levels at the top of the ranking.
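
The abstract does not state the exact combination formula, so the following is only a minimal sketch of the general idea of averaging a topical score with a level score. It assumes a textbook Okapi BM25 with common default parameters (`k1=1.2`, `b=0.75`), min-max normalization of the BM25 scores, and a hypothetical per-document `level_scores` list in [0, 1] that says how well each document fits the learner's level. The function names and the toy documents are illustrative and are not taken from the paper.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenized document against the query with Okapi BM25."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            length_norm = k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores.append(score)
    return scores


def min_max(values):
    """Rescale values to [0, 1]; all-equal inputs map to 0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def rank_documents(query, docs, level_scores):
    """Return document indices sorted by the mean of topical and level scores.

    level_scores is an assumed input: one value in [0, 1] per document,
    higher meaning a better fit for the learner's level.
    """
    topical = min_max(bm25_scores(query, docs))
    combined = [(t + l) / 2 for t, l in zip(topical, level_scores)]
    return sorted(range(len(docs)), key=lambda i: combined[i], reverse=True)


# Toy example: document 1 matches the query best but is a poor level fit.
docs = [
    ["travel", "by", "train", "is", "cheap"],
    ["travel", "travel", "train", "notwithstanding", "exorbitant", "tariffs"],
    ["my", "cat", "is", "small"],
]
print(rank_documents(["travel", "train"], docs, [0.9, 0.1, 0.9]))
```

In this toy setup, BM25 alone would put the second document first, while the averaged score favors the first document because it is both on topic and closer to the assumed learner level.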

Notes

  1. For the grammar and vocabulary of level C1, we used the inverted score, because we expect an A-level document to exhibit them to a lesser degree.

  2. In this work we did not address advanced-level learners, because they are expected to use traditional search engines.

Acknowledgements

We would like to thank the Walloon Region (Projects BEWARE n. 1510637 and 1610378) for support, and Altissia International for research collaboration.

Author information

Corresponding author

Correspondence to Rodrigo Wilkens.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper

Cite this paper

Wilkens, R., Zilio, L., Fairon, C. (2018). Document Ranking Applied to Second Language Learning. In: Pasi, G., Piwowarski, B., Azzopardi, L., Hanbury, A. (eds) Advances in Information Retrieval. ECIR 2018. Lecture Notes in Computer Science, vol 10772. Springer, Cham. https://doi.org/10.1007/978-3-319-76941-7_53

  • DOI: https://doi.org/10.1007/978-3-319-76941-7_53

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-76940-0

  • Online ISBN: 978-3-319-76941-7

  • eBook Packages: Computer Science, Computer Science (R0)
