Using Tree Transducers for Detecting Errors in a Treebank of Polish

  • Katarzyna Krasnowska
  • Witold Kieraś
  • Marcin Woliński
  • Adam Przepiórkowski
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7499)

Abstract

The paper presents a modification, aimed at highly inflectional languages, of a recently proposed error detection method for syntactically annotated corpora. The technique is based on Synchronous Tree Substitution Grammar (STSG), a kind of tree transducer grammar. The method induces STSG rules from a treebank and then applies the subset of those rules that meets a certain criterion to the same treebank. The results show that the proposed modification can be used successfully to detect errors in a treebank of an inflectional language such as Polish.
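
The induce-then-apply loop summarised in the abstract can be illustrated with a toy sketch. This is not the authors' implementation, and it is not full STSG: real STSG rules pair elementary trees that may contain substitution sites, whereas this sketch only pairs fully lexicalised subtrees that share a root label and a word sequence. The tuple encoding of trees, the function names, and the `min_ratio` selection criterion are all assumptions made for illustration, and the modification for inflectional languages is not attempted because the abstract does not specify it.

```python
from collections import Counter, defaultdict

# Hypothetical encoding: a tree is (label, child, child, ...);
# a leaf is a plain string (a word form).

def leaves(tree):
    """Return the word sequence (yield) of a tree as a tuple."""
    if isinstance(tree, str):
        return (tree,)
    return tuple(w for child in tree[1:] for w in leaves(child))

def subtrees(tree):
    """Yield every non-leaf subtree, root first."""
    if isinstance(tree, str):
        return
    yield tree
    for child in tree[1:]:
        yield from subtrees(child)

def induce_rules(treebank, min_ratio=0.75):
    """Induce rewrite rules from structures that compete for the same words.

    Subtrees are grouped by (root label, yield). When one structure accounts
    for at least min_ratio of a group (an assumed criterion), every other
    structure in the group is mapped to it.
    """
    groups = defaultdict(Counter)
    for tree in treebank:
        for sub in subtrees(tree):
            groups[(sub[0], leaves(sub))][sub] += 1
    rules = {}
    for structures in groups.values():
        if len(structures) < 2:
            continue  # no competing annotation, nothing to learn
        (best, best_n), *rest = structures.most_common()
        if best_n / sum(structures.values()) >= min_ratio:
            for rare, _count in rest:
                rules[rare] = best
    return rules

def detect(treebank, rules):
    """Apply the rules to the same treebank; return (index, found, suggested)."""
    return [(i, sub, rules[sub])
            for i, tree in enumerate(treebank)
            for sub in subtrees(tree)
            if sub in rules]

# Toy data: "duży dom" is tagged ADJ + N three times and N + N once.
good_np = ("NP", ("ADJ", "duży"), ("N", "dom"))
bad_np = ("NP", ("N", "duży"), ("N", "dom"))
treebank = [
    ("S", good_np, ("V", "stoi")),
    ("S", good_np, ("V", "płonie")),
    ("S", good_np, ("V", "runął")),
    ("S", bad_np, ("V", "znika")),
]
rules = induce_rules(treebank)
hits = detect(treebank, rules)
```

On this toy data the only induced rule maps the minority N + N analysis to the majority ADJ + N analysis, so `hits` flags the fourth tree. Grouping by exact word forms is also what makes such a scheme sparse for a language with rich inflection, which is one plausible reason a modification is needed.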

Keywords

  • Error Detection
  • Computational Linguistics
  • Annotate Corpus
  • Tree Transducer
  • Syntactic Tree
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Katarzyna Krasnowska (1)
  • Witold Kieraś (1)
  • Marcin Woliński (1)
  • Adam Przepiórkowski (1)
  1. Institute of Computer Science, Polish Academy of Sciences, Warsaw, Poland
