On Morphological Analysis for Learner Language, Focusing on Russian

Research on Language and Computation

Abstract

We describe a framework for performing morphological analysis to account for learner language, focusing on Russian as an example of an inflecting language. Because a set of linguistic analyses is needed to provide feedback on potentially noisy data, there is a large amount of ambiguity for even well-formed words. Using a segmented POS lexicon as a test case, we show how to analyze subparts of words, in order to analyze variations. After describing and implementing this framework for Russian, we focus on removing undesirable analyses to keep the task feasible. This is essentially an investigation of how much overgeneration of analyses is a problem and under what assumptions it can be reduced.
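
As a rough illustration of analyzing subparts of words and of how relaxing constraints leads to overgeneration, the following minimal sketch splits a word into a stem and a suffix and looks each part up separately. The lexicon entries, the feature labels, the function name `analyze`, and the `require_pos_match` switch are all hypothetical and chosen only for this example; they are not taken from the article's implementation.

```python
# Toy, hypothetical lexicon: stems and suffixes are stored separately,
# each with the part(s) of speech they can belong to.
STEMS = {"книг": {"noun"}, "чита": {"verb"}}
SUFFIXES = {
    "а": [("noun", "nom.sg")],
    "у": [("noun", "acc.sg"), ("verb", "1sg")],
    "ю": [("verb", "1sg")],
    "ет": [("verb", "3sg")],
}


def analyze(word, require_pos_match=True):
    """Return every stem+suffix analysis licensed by the toy lexicon.

    With require_pos_match=False, a suffix may attach to a stem of a
    different part of speech, which admits extra (overgenerated) analyses.
    """
    analyses = []
    for i in range(1, len(word)):
        stem, suffix = word[:i], word[i:]
        if stem not in STEMS or suffix not in SUFFIXES:
            continue
        for stem_pos in sorted(STEMS[stem]):
            for suffix_pos, features in SUFFIXES[suffix]:
                if require_pos_match and stem_pos != suffix_pos:
                    continue  # discard analyses whose parts disagree on POS
                analyses.append((stem, stem_pos, suffix, suffix_pos, features))
    return analyses


strict = analyze("книгу")
relaxed = analyze("книгу", require_pos_match=False)
print(len(strict), len(relaxed))
```

In this sketch the strict setting keeps only the analysis in which stem and suffix agree on part of speech, while the relaxed setting also returns a noun stem combined with a verbal ending. That extra reading is the kind of analysis one might want available for learner data, and it also shows how quickly the candidate set grows.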



Author information

Corresponding author

Correspondence to Markus Dickinson.

About this article

Cite this article

Dickinson, M. On Morphological Analysis for Learner Language, Focusing on Russian. Res on Lang and Comput 8, 273–298 (2010). https://doi.org/10.1007/s11168-011-9079-0

  • DOI: https://doi.org/10.1007/s11168-011-9079-0
