Introduction

  • Cícero Nogueira dos Santos
  • Ruy Luiz Milidiú
Chapter
Part of the SpringerBriefs in Computer Science book series (BRIEFSCOMPUTER)

Abstract

This chapter briefly introduces entropy guided transformation learning (ETL), a machine learning algorithm for classification tasks. ETL generalizes transformation based learning (TBL) by automatically solving the TBL bottleneck: the construction of good template sets. The main advantage of ETL is that it is easy to apply to natural language processing (NLP) tasks. The chapter presents the motivation behind ETL and summarizes our experimental results. Sect. 1.1 briefly describes TBL and explains its bottleneck, then presents ETL and lists some of its advantages. Sect. 1.2 lists related work on the use of ETL for different NLP tasks, then summarizes our experimental results on applying ETL to four language independent NLP tasks: part-of-speech tagging, phrase chunking, named entity recognition, and semantic role labeling. Finally, Sect. 1.3 describes the structure of the book.

Keywords

Machine learning; Entropy guided transformation learning; ETL committee; Transformation based learning; Natural language processing; Part-of-speech tagging; Phrase chunking; Named entity recognition; Semantic role labeling

Copyright information

© The Author(s) 2012

Authors and Affiliations

  • Cícero Nogueira dos Santos (1)
  • Ruy Luiz Milidiú (2)
  1. Research, IBM Research Brazil, Rio de Janeiro, Brazil
  2. Departamento de Informática (DI), Pontifícia Universidade Católica do Rio de Janeiro (PUC-Rio), Rio de Janeiro, Brazil
