Introduction

dos Santos, Cícero Nogueira; Milidiú, Ruy Luiz

doi:10.1007/978-1-4471-2978-3_1

Cícero Nogueira dos Santos & Ruy Luiz Milidiú

Part of the book series: SpringerBriefs in Computer Science (BRIEFSCOMPUTER)

Abstract

This chapter briefly introduces entropy guided transformation learning (ETL), a machine learning algorithm for classification tasks. ETL generalizes transformation based learning (TBL) by automatically solving the TBL bottleneck: the construction of good template sets. The main advantage of ETL is that it is easy to apply to natural language processing (NLP) tasks. The chapter presents the motivation behind ETL and summarizes our experimental results. Sect. 1.1 briefly describes TBL and explains its bottleneck, then presents ETL and lists some of its advantages. Sect. 1.2 lists related work on the use of ETL for different NLP tasks, then summarizes our experimental results on applying ETL to four language-independent NLP tasks: part-of-speech tagging, phrase chunking, named entity recognition, and semantic role labeling. Finally, Sect. 1.3 details the structure of the book.

Author information

Authors and Affiliations

Research, IBM Research Brazil, Av. Pasteur 146, Rio de Janeiro, RJ, 22296-903, Brazil
Cícero Nogueira dos Santos
Departamento de Informática (DI), Pontifícia Universidade Católica do Rio de Janeiro (PUC-Rio), Rio de Janeiro, RJ, Brazil
Ruy Luiz Milidiú

About this chapter

Cite this chapter

dos Santos, C.N., Milidiú, R.L. (2012). Introduction. In: Entropy Guided Transformation Learning: Algorithms and Applications. SpringerBriefs in Computer Science. Springer, London. https://doi.org/10.1007/978-1-4471-2978-3_1

DOI: https://doi.org/10.1007/978-1-4471-2978-3_1
Published: 15 March 2012
Publisher Name: Springer, London
Print ISBN: 978-1-4471-2977-6
Online ISBN: 978-1-4471-2978-3
eBook Packages: Computer Science, Computer Science (R0)
