Part of the book series: SpringerBriefs in Computer Science ((BRIEFSCOMPUTER))

Abstract

This chapter presents the application of Entropy Guided Transformation Learning (ETL) to language-independent phrase chunking (PCK). The PCK task consists of dividing a text into non-overlapping phrases. We apply ETL and ETL committee to four corpora in three languages: Portuguese, English and Hindi. On all four corpora, the ETL system achieves very competitive results, and on two of them it achieves state-of-the-art results. ETL committee significantly improves on the ETL results for all four corpora. This chapter is organized as follows. In Sect. 6.1, we describe the task and the selected corpora. In Sect. 6.2, we detail some modeling configurations used in our PCK system. In Sect. 6.3, we show some configurations used in the machine learning algorithms. Section 6.4 presents the application of ETL to the SNR-CLIC Corpus. In Sect. 6.5, we detail the application of ETL to the Ramshaw and Marcus Corpus. Section 6.6 presents the application of ETL to the CoNLL-2000 Corpus. In Sect. 6.7, we present the application of ETL to the SPSAL-2007 Corpus. Finally, Sect. 6.8 presents some concluding remarks.
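To make the idea of non-overlapping phrases concrete, the following minimal sketch groups tokens into chunks from per-token tags. It assumes an IOB2-style encoding (`B-X` opens a phrase of type `X`, `I-X` continues it, `O` marks a token outside any phrase), which is a common convention for chunking but is not stated in the abstract. The function name `iob_to_chunks` and the example sentence and tags are hypothetical illustrations, not data from the chapter's corpora.

```python
from typing import List, Tuple


def iob_to_chunks(tokens: List[str], tags: List[str]) -> List[Tuple[str, List[str]]]:
    """Group tokens into non-overlapping (label, words) chunks.

    Assumes IOB2-style tags; this encoding is an illustrative assumption.
    """
    chunks: List[Tuple[str, List[str]]] = []
    current_label = None
    current_words: List[str] = []
    for token, tag in zip(tokens, tags):
        # A B- tag, or an I- tag whose label differs from the open chunk,
        # starts a new chunk; an O tag only closes the open one.
        starts_new = tag.startswith("B-") or (
            tag.startswith("I-") and tag[2:] != current_label
        )
        if tag == "O" or starts_new:
            if current_label is not None:
                chunks.append((current_label, current_words))
            current_label, current_words = None, []
        if tag != "O":
            if current_label is None:
                current_label = tag[2:]
            current_words.append(token)
    # Flush the chunk that is still open at the end of the sentence.
    if current_label is not None:
        chunks.append((current_label, current_words))
    return chunks


# Hypothetical example sentence with hand-assigned tags.
tokens = ["He", "reckons", "the", "current", "account", "deficit", "will", "narrow"]
tags = ["B-NP", "B-VP", "B-NP", "I-NP", "I-NP", "I-NP", "B-VP", "I-VP"]
chunks = iob_to_chunks(tokens, tags)
```

Because every token receives exactly one tag, each token belongs to at most one chunk, so the resulting phrases cannot overlap.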

Copyright information

© 2012 The Author(s)

About this chapter

Cite this chapter

dos Santos, C.N., Milidiú, R.L. (2012). Phrase Chunking. In: Entropy Guided Transformation Learning: Algorithms and Applications. SpringerBriefs in Computer Science. Springer, London. https://doi.org/10.1007/978-1-4471-2978-3_6

  • DOI: https://doi.org/10.1007/978-1-4471-2978-3_6

  • Publisher Name: Springer, London

  • Print ISBN: 978-1-4471-2977-6

  • Online ISBN: 978-1-4471-2978-3

  • eBook Packages: Computer Science, Computer Science (R0)
