Learning Syntactic Patterns Using Boosting and Other Classifier Combination Schemas

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3658)

Abstract

This paper presents a method for the syntactic parsing of Hungarian natural language texts using a machine learning approach. This method learns tree patterns with various phrase types described by regular expressions from an annotated corpus. The PGS algorithm, an improved version of the RGLearn method, is developed and applied as a classifier in classifier combination schemas. Experiments show that classifier combinations, especially the Boosting algorithm, can effectively improve the recognition accuracy of the syntactic parser.
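To illustrate how boosting can combine weak classifiers into a stronger one, here is a minimal sketch of discrete AdaBoost, one well-known boosting algorithm. It is not the paper's PGS learner or its experimental setup: the threshold "stump" weak learner, the toy one-dimensional data, and all function names are assumptions made only for this example.

```python
import math

# Toy 1-D data (hypothetical, for illustration only): no single
# threshold rule separates the +1 and -1 labels.
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]


def stump_predict(x, threshold, polarity):
    """Weak classifier: predict `polarity` if x >= threshold, else the opposite."""
    return polarity if x >= threshold else -polarity


def train_stump(xs, ys, weights):
    """Return (weighted_error, threshold, polarity) of the best stump."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            err = sum(
                w
                for w, x, y in zip(weights, xs, ys)
                if stump_predict(x, threshold, polarity) != y
            )
            if best is None or err < best[0]:
                best = (err, threshold, polarity)
    return best


def adaboost(xs, ys, rounds):
    """Train `rounds` stumps, reweighting the examples after each round."""
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        err, threshold, polarity = train_stump(xs, ys, weights)
        # Clamp the error to avoid division by zero or log(0).
        err = min(max(err, 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, threshold, polarity))
        # Increase the weight of misclassified examples, decrease the rest.
        weights = [
            w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
            for w, x, y in zip(weights, xs, ys)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, x):
    """Weighted vote of all stumps in the ensemble."""
    score = sum(a * stump_predict(x, t, p) for a, t, p in ensemble)
    return 1 if score >= 0 else -1


def count_errors(ensemble, xs, ys):
    """Number of examples the ensemble misclassifies."""
    return sum(1 for x, y in zip(xs, ys) if predict(ensemble, x) != y)
```

On this toy data a single stump misclassifies some examples, while a weighted vote of three stumps classifies all ten correctly. In the paper's setting, the weak learner would be the pattern-learning classifier rather than a threshold rule.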

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Hócza, A., Felföldi, L., Kocsor, A. (2005). Learning Syntactic Patterns Using Boosting and Other Classifier Combination Schemas. In: Matoušek, V., Mautner, P., Pavelka, T. (eds) Text, Speech and Dialogue. TSD 2005. Lecture Notes in Computer Science, vol 3658. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11551874_9

  • DOI: https://doi.org/10.1007/11551874_9

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-28789-6

  • Online ISBN: 978-3-540-31817-0

  • eBook Packages: Computer Science, Computer Science (R0)
