CityU corpus of essay drafts of English language learners: a corpus of textual revision in second language writing

Original Paper · Language Resources and Evaluation

Abstract

Learner corpora consist of texts produced by non-native speakers. In addition to these texts, some learner corpora also contain error annotations, which can reveal common errors made by language learners, and provide training material for automatic error correction. We present a novel type of error-annotated learner corpus containing sequences of revised essay drafts written by non-native speakers of English. Sentences in these drafts are annotated with comments by language tutors, and are aligned to sentences in subsequent drafts. We describe the compilation process of our corpus, present its encoding in TEI XML, and report agreement levels on the error annotations. Further, we demonstrate the potential of the corpus to facilitate research on textual revision in L2 writing, by conducting a case study on verb tenses using ANNIS, a corpus search and visualization platform.
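
As a concrete illustration of this encoding, the fragment below sketches how a commented sentence in draft 1 might be aligned to its revision in draft 2. The elements used (s, note, linkGrp, link) are standard TEI devices, but the sentence, the comment label, and the exact attribute layout are our assumptions for illustration, not necessarily the corpus's actual schema:

    <!-- Hypothetical TEI sketch: a draft-1 sentence with a tutor
         comment, aligned to its revised counterpart in draft 2. -->
    <div type="draft" n="1">
      <s xml:id="d1s4">Yesterday he go to the library.
        <note type="comment" resp="#tutor1">Verb tense</note>
      </s>
    </div>
    <div type="draft" n="2">
      <s xml:id="d2s4">Yesterday he went to the library.</s>
    </div>
    <linkGrp type="alignment">
      <link target="#d1s4 #d2s4"/>
    </linkGrp>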

Notes

  1. Examples include the Cambridge Learner Corpus (Nicholls 2003), the International Corpus of Learner English (ICLE) (Granger et al. 2009), and the National University of Singapore Corpus of Learner English (Dahlmeier et al. 2013), among many others.

  2. Target hypotheses are costly to produce and often overlooked, but they are nevertheless crucial, since any form of error annotation implies a comparison with what the annotator believes the learner was trying to express. For example, the learner sentence “She enjoys to swim” could be corrected to “She enjoys swimming” (a verb complementation error) or to “She wants to swim” (a word choice error); each target hypothesis implies a different error annotation. Failing to explicitly document target hypotheses can lead to error annotations that are inconsistent and difficult to rationalize. For extensive discussion, see Lüdeling and Hirschmann (to appear).

  3. This corpus is available for research purposes through arrangement with the Halliday Centre for Intelligent Applications of Language Studies at City University of Hong Kong (hcls@cityu.edu.hk).

  4. For lab reports, we include only the discussion section since other sections contain many equations, numbers and sentence fragments.

  5. Whether the text span contains the specified error is a separate question that will be addressed in Sect. 3.3.

  6. We omitted the “Delete this” category, since it can be applied to any kind of word and is therefore always valid by definition.

  7. This level of disagreement means that the measured precision of the error annotations can differ by as much as 10 %, depending on the annotator (Tetreault and Chodorow 2008).

  8. Two experts, both professors of linguistics, participated in this evaluation. One was a native speaker of English; the other was a near-native speaker who had studied in an English-speaking country for 15 years, from high school onward.

  9. Our evaluation does not estimate the coverage, or recall, of the tutor comments, i.e. the proportion of errors in the learner text that were annotated. Since the tutors were not asked to exhaustively annotate all errors in the text, this figure would not be meaningful.

  10. For example, using XQuery, a generic query language for XML documents (see http://www.w3.org/TR/xquery-30/); a sketch of such a query is given after these notes.

  11. In this study, we do not consider open-ended comments on verb tense errors, since they vary in terms of the explicitness of the feedback, making it difficult to compare their impact. Furthermore, among comments leading to verb tense revision, open-ended comments (16 %) are much less frequent than error categories (84 %).

  12. The interested reader is referred to http://www.sfb632.uni-potsdam.de/annis/ and to Krause and Zeldes (2014) for more detail on how the interface supports sophisticated queries, allowing research questions to be answered flexibly and without programming skills; an example query is sketched after these notes.

  13. Due to revisions over the course of the LCC project, the comment bank differed slightly for each semester; in particular, a few categories were annotated at different levels of granularity. For example, “Verb needed”, “Noun needed”, “Adjective needed”, and “Adverb needed” from one semester are subsumed by “Part-of-speech incorrect” from another semester. The more fine-grained categories are considered subcategories in our corpus.
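
As promised in footnote 10, the following sketch shows how XQuery could be used to search the TEI files, here retrieving sentences whose tutor comment mentions tense. The file name, element structure, and comment wording are illustrative assumptions rather than the corpus's documented layout:

    (: Hypothetical sketch: list each sentence whose tutor comment
       mentions "tense", together with its sentence identifier. :)
    declare default element namespace "http://www.tei-c.org/ns/1.0";

    for $s in doc("essay.xml")//s
    where some $c in $s/note[@type = "comment"]
          satisfies contains(lower-case($c), "tense")
    return <hit>{ $s/@xml:id, normalize-space($s) }</hit>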
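
Similarly, for footnote 12, a query over the corpus in ANNIS might be phrased in the ANNIS Query Language (AQL) along the following lines. The layer names (comment, pos) and the category label are assumptions about how the annotations could surface in ANNIS, not the corpus's confirmed configuration:

    comment="Verb tense" & pos="VBD" & #1 _i_ #2

Here #1 and #2 refer back to the two search terms, and the _i_ operator requires the commented span (#1) to include the past-tense token (#2, Penn Treebank tag VBD); ANNIS then displays and counts the matches without any programming on the user's part.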

References

  • Andreu Andrés, M. A., Guardiola, A. A., Matarredona, M. B., MacDonald, P., Fleta, B. M., & Pérez Sabater, C. (2010). Analysing EFL learner output in the MiLC Project: An error *it's, but which tag? In M. C. Campoy-Cubillo, B. Bellés-Fortuño, & M. L. Gea-Valor (Eds.), Corpus-based approaches to English language teaching (pp. 167–179). London: Continuum.

  • Ashwell, T. (2000). Patterns of teacher response to student writing in a multiple-draft composition classroom: Is content feedback followed by form feedback the best method? Journal of Second Language Writing, 9(3), 227–257.

  • Barzilay, R., & Elhadad, N. (2003). Sentence alignment for monolingual comparable corpora. In Proceedings of the 2003 conference on empirical methods in natural language processing. Sapporo, Japan, pp. 25–32.

  • Biber, D., Nekrasova, T., & Horn, B. (2011). The effectiveness of feedback for L1-English and L2-writing development: A meta-analysis. TOEFL iBT research report.

  • Bitchener, J., & Ferris, D. R. (2012). Written corrective feedback in Second Language Acquisition and Writing. New York, NY: Routledge.

  • Burstein, J., Chodorow, M., & Leacock, C. (2004). Automated essay evaluation: The criterion online writing service. AI Magazine, 25(3), 27–36.

  • Chandler, J. (2003). The efficacy of various kinds of error feedback for improvement in the accuracy and fluency of L2 student writing. Journal of Second Language Writing, 12(3), 267–296.

  • Dahlmeier, D., & Ng, H. T. (2011). Grammatical error correction with alternating structure optimization. Proceedings of the 49th annual meeting of The Association for Computational Linguistics (pp. 915–923). Stroudsburg, PA: ACL.

  • Dahlmeier, D., Ng, H. T., & Wu, S. M. (2013). Building a large annotated corpus of learner English: The NUS corpus of learner English. In Proceedings of the eighth workshop on innovative use of NLP for building educational applications, pp. 22–31.

  • Dale, R., & Kilgarriff, A. (2011). Helping our own: The HOO 2011 pilot shared task. In Proceedings of the 13th European workshop on natural language generation (ENLG). Nancy, France, pp. 242–249.

  • Dipper, S. (2005). XML-based stand-off representation and exploitation of multi-level linguistic annotation. In Proceedings of Berliner XML Tage 2005 (BXML 2005). Berlin, Germany, pp. 39–50.

  • Eriksson, A., Finnegan, D., Kauppinen, A., Wiktorsson, M., Wärnsby, A., & Withers, P. (2012). MUCH: The Malmö University-Chalmers Corpus of Academic Writing as a Process. In Proceedings of the 10th teaching and language corpora conference.

  • Fathman, A. K. & Whalley, E. (1990). Teacher response to student writing: Focus on form versus content. In Kroll, B. (ed.) Second language writing: Research insights for the classroom, pp. 178–190.

  • Ferris, D. R. (1997). The influence of teacher commentary on student revision. TESOL Quarterly, 31(2), 315–339.

  • Ferris, D. R. (2006). Does error feedback help student writers? New evidence on the short-and long-term effects of written error correction. In K. Hyland & F. Hyland (Eds.), Feedback in second language writing: Contexts and issues (pp. 81–104). Cambridge: Cambridge University Press.

  • Ferris, D. R., & Roberts, B. (2001). Error feedback in L2 writing classes: How explicit does it need to be? Journal of Second Language Writing, 10, 161–184.

  • Foster, J., Wagner, J., & van Genabith, J. (2008). Adapting a WSJ-trained parser to grammatically noisy text. In Proceedings of ACL.

  • Graham, S., & Perin, D. (2007). A meta-analysis of writing instruction for adolescent students. Journal of Educational Psychology, 99(3), 445–476.

  • Granger, S. (1999). Use of tenses by advanced EFL learners: Evidence from error-tagged computer corpus. In H. Hasselgård (Ed.), Out of Corpora—Studies in Honour of Stig Johansson (pp. 191–202). Amsterdam, Atlanta: Rodopi.

  • Granger, S. (2004). Computer learner corpus research: Current status and future prospects. Language and Computers, 23, 123–145.

  • Granger, S. (2008). Learner corpora. In A. Lüdeling & M. Kyto (Eds.), Corpus linguistics: An international handbook (Vol. 1). Berlin: Mouton de Gruyter.

  • Granger, S., Dagneaux, E., Meunier, F., & Paquot, M. (2009). The international corpus of learner English. Version 2. Handbook and CD-ROM. Louvain-la-Neuve: Presses universitaires de Louvain.

  • Han, N.-R., Chodorow, M., & Leacock, C. (2006). Detecting errors in English article usage by non-native speakers. Natural Language Engineering, 12(2), 115–129.

  • Ide, N., Bonhomme, P., & Romary, L. (2000). XCES: An XML-based encoding standard for linguistic corpora. Proceedings of the second international language resources and evaluation conference (pp. 825–830). Paris: ELRA.

  • Krause, T. & Zeldes, A. (2014). ANNIS3: A new architecture for generic corpus query and visualization. To appear in Literary and Linguistic Computing. http://dsh.oxfordjournals.org/content/early/2014/12/02/llc.fqu057

  • Landis, J. R., & Koch, G. G. (1977). The measurement of observer agreement for categorical data. Biometrics, 33, 159–174.

  • Lee, J., & Seneff, S. (2008). An analysis of grammatical errors in nonnative speech in English. In Proceedings of the IEEE Workshop on Spoken Language Technology 2008. pp. 89–92.

  • Lee, J., Tetreault, J., & Chodorow, M. (2009). Human evaluation of article and noun number usage: Influences of context and construction variability. In Proceedings of the Third Linguistic Annotation Workshop, pp. 60–63.

  • Lipnevich, A. A., & Smith, J. K. (2009). “I really need feedback to learn:” Students’ perspectives on the effectiveness of the differential feedback messages. Educational Assessment, Evaluation and Accountability, 21(4), 347–367.

  • Lüdeling, A., Doolittle, S., Hirschmann, H., Schmidt, K., & Walter, M. (2008). Das Lernerkorpus Falko. Deutsch als Fremdsprache, 2, 67–73.

  • Lüdeling, A., Walter, M., Kroymann, E., & Adolphs, P. (2005). Multi-level error annotation in learner corpora. In Proceedings of Corpus Linguistics 2005. Birmingham.

  • Lüdeling, A., & Hirschmann, H. (to appear). Error Annotation. In Granger, S., Gilquin, G., & Meunier, F. (eds.), The Cambridge Handbook of Learner Corpus Research. Cambridge: Cambridge University Press.

  • Marcus, M. P., Santorini, B., & Marcinkiewicz, M. A. (1993). Building a large annotated corpus of English: The Penn Treebank. Special Issue on Using Large Corpora, Computational Linguistics, 19(2), 313–330.

  • Nagata, R., Whittaker, E., & Sheinman, V. (2011). Creating a manually error-tagged and shallow-parsed learner corpus. Proceedings of the 49th annual meeting of the association for computational linguistics (pp. 1210–1219). Stroudsburg, PA: ACL.

  • Nesi, H., Sharpling, G., & Ganobcsik-Williams, L. (2004). Student papers across the curriculum: designing and developing a corpus of British student writing. Computers and Composition, 21(4), 439–450.

  • Nguyen, N. L. T. & Miyao, Y. (2013). Alignment-based annotation of proofreading texts toward professional writing assistance. In Proceedings of the international joint conference on natural language processing, pp. 753–759.

  • Nicholls, D. (2003). The Cambridge learner corpus: Error coding and analysis for lexicography and ELT. In Proceedings of the corpus linguistics 2003 conference.

  • Paulus, T. M. (1999). The effect of peer and teacher feedback on student writing. Journal of Second Language Writing, 8(3), 265–289.

  • Polio, C., & Fleck, C. (1998). “If I only had more time:” ESL learners’ changes in linguistic accuracy on essay revisions. Journal of Second Language Writing, 7(1), 43–68.

  • Reznicek, M., Lüdeling, A., & Hirschmann, H. (2013). Competing target hypotheses in the Falko corpus: A flexible multi-layer corpus architecture. In A. Díaz-Negrillo, N. Ballier, & P. Thompson (Eds.), Automatic treatment and analysis of learner corpus data (pp. 101–124). Amsterdam: John Benjamins.

  • Rosen, A., Hana, J., Stindlova, B., & Feldman, A. (2014). Evaluating and automating the annotation of a learner corpus. Language Resources and Evaluation, 48, 65–92.

  • Rozovskaya, A., & Roth, D. (2010). Annotating ESL errors: Challenges and rewards. In: Proceedings of NAACL’10 workshop on innovative use of NLP for building educational applications.

  • Russell, J., & Spada, N. (2006). The effectiveness of corrective feedback for the acquisition of L2 grammar: A meta-analysis of the research. In J. Norris & L. Ortega (Eds.), Synthesizing research on language learning and teaching (Language learning and language teaching 13) (pp. 133–164). Amsterdam and Philadelphia: John Benjamins.

  • Shemtov, H. (1993). Text alignment in a tool for translating revised documents. In Proceedings of the sixth conference of the European chapter of the Association for Computational Linguistics (EACL-93) (pp. 449–453). Stroudsburg, PA: ACL.

  • Snover, M., Dorr, B., Schwartz, R., Micciulla, L., & Makhoul, J. (2006). A study of translation edit rate with targeted human annotation. In Proceedings of the 7th conference of the association for machine translation in the Americas. Cambridge, MA, pp. 223–231.

  • Tetreault, J. R., & Chodorow, M. (2008). Native judgments of non-native usage: Experiments in preposition error detection. In Proceedings of the workshop on human judgements in computational linguistics, pp. 24–32.

  • Toutanova, K., Klein, D., Manning, C. D., & Singer, Y. (2003). Feature-rich part-of-speech tagging with a cyclic dependency network. Proceedings of the 2003 conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology (NAACL-HLT 2003) (pp. 252–259). Stroudsburg, PA: ACL.

  • Toutanova, K., & Manning, C. D. (2000). Enriching the knowledge sources used in a maximum entropy part-of-speech tagger. In Proceedings of the 2000 joint SIGDAT conference on empirical methods in natural language processing and very large corpora. Hong Kong, pp. 63–70.

  • Truscott, J. (1996). The case against grammar correction in L2 writing classes. Language Learning, 46(2), 327–369.

  • Truscott, J., & Hsu, A. Y.-P. (2008). Error correction, revision, and learning. Journal of Second Language Writing, 17(4), 292–305.

  • Webster, J., Chan, A., & Lee, J. (2011). Introducing an online language learning environment and its corpus of tertiary student writing. Asia Pacific World, 2(2), 44–65.

  • Wible, D., Kuo, C.-H., Chien, F.-L., Liu, A., & Tsao, N.-L. (2001). A web-based EFL writing environment: Integrating information for learners, teachers, and researchers. Computers & Education, 37(3–4), 297–315.

  • Zeldes, A., Ritz, J., Lüdeling, A., & Chiarcos, C. (2009). ANNIS: A search tool for multi-layer annotated corpora. In Proceedings of corpus linguistics 2009. Liverpool, UK.

  • Zipser, F., & Romary, L. (2010). A model oriented approach to the mapping of annotation formats using standards. In Proceedings of the workshop on language resource and language technology standards, LREC-2010 (pp. 7–18). Valletta, Malta.


Acknowledgments

The work described in this article was supported by a grant from the Germany/Hong Kong Joint Research Scheme sponsored by the Research Grants Council of Hong Kong and the German Academic Exchange Service (Reference No. G_HK013/11).

Author information

Corresponding author

Correspondence to John Lee.

Appendix: Error categories

The complete list of the error categories used in our corpus (see footnote 13), with example sentences, is shown in Table 10. The text span addressed by the error category is enclosed in square brackets. For some categories, we provide an explanation rather than an example because of space constraints.

Table 10 Complete list of the error categories used in our corpus, with example sentences

Cite this article

Lee, J., Yeung, C.Y., Zeldes, A. et al. CityU corpus of essay drafts of English language learners: a corpus of textual revision in second language writing. Lang Resources & Evaluation 49, 659–683 (2015). https://doi.org/10.1007/s10579-015-9301-z
