Reducing Overdetections in a French Symbolic Grammar Checker by Classification

  • Fabrizio Gotti
  • Philippe Langlais
  • Guy Lapalme
  • Simon Charest
  • Éric Brunelle
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6609)

Abstract

We describe the development of an “overdetection” identifier, a system for filtering detections erroneously flagged by a grammar checker. Various families of classifiers were trained in a supervised way for 14 types of detections made by a commercial French grammar checker. Eight of these were integrated in the most recent commercial version of the system. This illustrates how a machine learning component can be successfully embedded in Antidote, a robust and popular commercial natural language application.
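
The filtering idea in the abstract can be sketched roughly as follows: one supervised classifier per detection type decides whether a flagged detection is an overdetection and should be suppressed. This is only an illustrative sketch. The abstract does not say which features or classifier families were used, so the `Detection` fields, the toy `MajorityClassifier`, and the function names below are all hypothetical.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    kind: str     # hypothetical detection type label (e.g. "agreement")
    context: str  # hypothetical single categorical feature of the detection


class MajorityClassifier:
    """Toy stand-in for a real classifier: majority label per feature value."""

    def __init__(self):
        self.counts = defaultdict(Counter)

    def fit(self, examples):
        # examples: iterable of (feature, is_overdetection) pairs
        for feature, is_overdetection in examples:
            self.counts[feature][is_overdetection] += 1
        return self

    def predict(self, feature):
        votes = self.counts.get(feature)
        if not votes:
            return False  # unseen feature value: keep the detection
        return votes[True] > votes[False]


def train_filters(labelled):
    """Train one classifier per detection type from (Detection, label) pairs."""
    by_kind = defaultdict(list)
    for detection, is_overdetection in labelled:
        by_kind[detection.kind].append((detection.context, is_overdetection))
    return {kind: MajorityClassifier().fit(ex) for kind, ex in by_kind.items()}


def filter_detections(detections, filters):
    """Drop detections that the classifier for their type marks as overdetections."""
    kept = []
    for detection in detections:
        classifier = filters.get(detection.kind)
        if classifier is not None and classifier.predict(detection.context):
            continue  # predicted overdetection: do not show it to the user
        kept.append(detection)
    return kept
```

Detection types without a trained classifier pass through unfiltered, which mirrors the abstract's point that only some of the trained classifiers were integrated into the product.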

Keywords

Word Sense Disambiguation; Computational Linguistics; Past Participle; Correcteurs Orthographiques; Prepositional Object
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Fabrizio Gotti (1)
  • Philippe Langlais (1)
  • Guy Lapalme (1)
  • Simon Charest (2)
  • Éric Brunelle (2)
  1. DIRO, Univ. de Montréal, Montréal, Canada
  2. Druide Informatique, Montréal, Canada
