Automatic Processing of Linguistic Data as a Feedback for Linguistic Theory

  • Conference paper
Advances in Artificial Intelligence and Its Applications (MICAI 2013)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8265)

Abstract

The paper describes a method for identifying a set of interesting constructions in a syntactically annotated corpus of Czech, the Prague Dependency Treebank, by applying an automatic procedure of analysis by reduction to the trees in the treebank. The procedure clearly reveals certain linguistic phenomena that go beyond ‘dependency nature’ (and thus generally pose a problem for dependency-based formalisms). Moreover, it provides feedback indicating that the annotation of a particular phenomenon might be inconsistent.

The paper discusses and analyzes the individual phenomena and quantifies the results of the automatic procedure on a subset of the treebank. The results show that the vast majority of sentences in the subset used in these experiments can be analyzed automatically, and they confirm that most of the problematic phenomena belong to the language periphery.
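
To make the idea of reduction on dependency trees concrete, the following is a minimal sketch and not the authors' procedure. It assumes that the simplest reduction step deletes a word with no dependents of its own, and it ignores constraints such as valency and word-order shifts that a real analysis by reduction has to respect. The function names (`leaves`, `reduce_stepwise`, `is_projective`), the toy sentence, and the head-map encoding are all hypothetical choices made for illustration. The projectivity test is included only because crossing dependencies are a familiar example of structure that is awkward for purely dependency-based treatments.

```python
def leaves(heads):
    """Return nodes with no dependents. `heads` maps node -> head, 0 = artificial root."""
    governing = set(heads.values())
    return sorted(n for n in heads if n not in governing)


def reduce_stepwise(words, heads):
    """Delete one leaf per step until a single word remains.

    Returns the list of intermediate (reduced) sentences.
    Assumption: the rightmost leaf is deleted at each step, an arbitrary choice.
    """
    heads = dict(heads)  # work on a copy
    steps = []
    while len(heads) > 1:
        leaf = leaves(heads)[-1]
        del heads[leaf]
        steps.append(" ".join(words[i] for i in sorted(heads)))
    return steps


def is_projective(heads):
    """True if every word lying between a head and its dependent is dominated by that head."""
    for dep, head in heads.items():
        if head == 0:
            continue
        lo, hi = sorted((dep, head))
        for k in range(lo + 1, hi):
            ancestor = k
            while ancestor != 0 and ancestor != head:
                ancestor = heads[ancestor]
            if ancestor != head:
                return False
    return True


# Toy example (hypothetical data): word positions are 1-based.
words = {1: "Petr", 2: "often", 3: "reads", 4: "long", 5: "books"}
heads = {1: 3, 2: 3, 3: 0, 4: 5, 5: 3}

for sentence in reduce_stepwise(words, heads):
    print(sentence)
print(is_projective(heads))
```

In this toy run the sentence shrinks one word at a time. The last step leaves a bare verb, which would be acceptable in a pro-drop language such as Czech but shows why a realistic procedure needs more than leaf deletion.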

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kuboň, V., Lopatková, M., Mírovský, J. (2013). Automatic Processing of Linguistic Data as a Feedback for Linguistic Theory. In: Castro, F., Gelbukh, A., González, M. (eds) Advances in Artificial Intelligence and Its Applications. MICAI 2013. Lecture Notes in Computer Science, vol 8265. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-45114-0_20

  • DOI: https://doi.org/10.1007/978-3-642-45114-0_20

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-45113-3

  • Online ISBN: 978-3-642-45114-0

  • eBook Packages: Computer Science, Computer Science (R0)