Intraclausal Coordination and Clause Detection as a Preprocessing Step to Dependency Parsing

  • Domen Marinčič
  • Matjaž Gams
  • Tomaž Šef
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5729)

Abstract

The impact of clause detection and intraclausal coordination detection on dependency parsing of Slovene is examined. New methods based on machine learning and heuristic rules are proposed for both detection tasks. They are incorporated into a new dependency parsing algorithm, PACID. The Slovene Dependency Treebank was used for evaluation. In parsing, relative error reductions of 6.4% and 9.2% were achieved compared to the dependency parsers MSTP and Malt, respectively.

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Domen Marinčič
    • 1
  • Matjaž Gams
    • 1
  • Tomaž Šef
    • 1
  1. Jozef Stefan Institute, Ljubljana, Slovenia