Entropy Guided Transformation Learning

dos Santos, Cícero Nogueira; Milidiú, Ruy Luiz

doi:10.1007/978-1-4471-2978-3_2


Chapter
First Online: 01 January 2012

Part of the book series: SpringerBriefs in Computer Science (BRIEFSCOMPUTER)

Abstract

This chapter details the entropy guided transformation learning (ETL) algorithm [8, 23]. ETL is an effective way to overcome the main bottleneck of transformation based learning (TBL): the construction of good template sets. To motivate and describe ETL, we first provide an overview of the TBL algorithm in Sect. 2.1. Next, in Sect. 2.2, we explain why the manual construction of template sets is a bottleneck for TBL. Then, in Sect. 2.3, we detail the entropy guided template generation strategy employed by ETL. In the same section, we also present strategies to handle high dimensional features and to include the current classification feature in the generated templates. In Sects. 2.4–2.6, we present some variations on the basic ETL strategy. Finally, in Sect. 2.7, we discuss related work.
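
The abstract does not spell out how entropy guides template generation, so the following is only a minimal sketch of the general idea, assuming that features are scored by information gain, the entropy-based criterion used in decision tree induction. The feature names (`word`, `prev_tag`), the toy data, and the function names are hypothetical and are not taken from the chapter.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, feature):
    """Entropy reduction obtained by splitting the examples on one feature."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    # Weighted average entropy of the label groups produced by the split.
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def rank_features(rows, labels):
    """Order features from most to least informative about the labels."""
    return sorted(
        rows[0].keys(),
        key=lambda f: information_gain(rows, labels, f),
        reverse=True,
    )


# Hypothetical toy data: each row describes one token in context.
rows = [
    {"word": "run", "prev_tag": "DT"},
    {"word": "run", "prev_tag": "TO"},
    {"word": "walk", "prev_tag": "DT"},
    {"word": "walk", "prev_tag": "TO"},
]
labels = ["NN", "VB", "NN", "VB"]

ranking = rank_features(rows, labels)
```

In this toy data the previous tag fully determines the label while the word alone does not, so `prev_tag` is ranked first. Under the stated assumption, a template generator could favour combinations of highly ranked features over manually chosen ones.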

References

1. Banfield, R.E., Hall, L.O., Bowyer, K.W., Kegelmeyer, W.P.: Ensemble diversity measures and their application to thinning. Inf. Fusion 6(1), 49–62 (2005)
2. Breiman, L.: Random forests. Mach. Learn. 45(1), 5–32 (2001). doi:10.1023/A:1010933404324
3. Brill, E.: Transformation-based error-driven learning and natural language processing: a case study in part-of-speech tagging. Comput. Linguist. 21(4), 543–565 (1995)
4. Carberry, S., Vijay-Shanker, K., Wilson, A., Samuel, K.: Randomized rule selection in transformation-based learning: a comparative study. Nat. Lang. Eng. 7(2), 99–116 (2001). doi:10.1017/S1351324901002662
5. Corston-Oliver, S., Gamon, M.: Combining decision trees and transformation-based learning to correct transferred linguistic representations. In: Proceedings of the Ninth Machine Translation Summit, pp. 55–62. Association for Machine Translation in the Americas, New Orleans (2003)
6. Curran, J.R., Wong, R.K.: Formalisation of transformation-based learning. In: Proceedings of the ACSC, pp. 51–57, Canberra (2000)
7. Dash, M., Liu, H.: Feature selection for classification. Intell. Data Anal. 1, 131–156 (1997)
8. dos Santos, C.N., Milidiú, R.L.: Entropy guided transformation learning. Technical Report 29/07, Departamento de Informática, PUC-Rio (2007). http://bib-di.inf.puc-rio.br/techreports/2007.htm
9. dos Santos, C.N., Milidiú, R.L.: Probabilistic classifications with TBL. In: Proceedings of Eighth International Conference on Intelligent Text Processing and Computational Linguistics—CICLing, pp. 196–207, Mexico (2007)
10. dos Santos, C.N., Oliveira, C.: Constrained atomic term: widening the reach of rule templates in transformation based learning. In: Portuguese Conference on Artificial Intelligence—EPIA, pp. 622–633 (2005)
11. Elming, J.: Transformation-based corrections of rule-based MT. In: Proceedings of the EAMT 11th Annual Conference, Oslo (2006)
12. Florian, R.: Named entity recognition as a house of cards: classifier stacking. In: Proceedings of CoNLL-2002, pp. 175–178, Taipei (2002)
13. Florian, R.: Transformation based learning and data-driven lexical disambiguation: syntactic and semantic ambiguity resolution. Ph.D. Thesis, The Johns Hopkins University (2002)
14. Florian, R., Henderson, J.C., Ngai, G.: Coaxing confidences from an old friend: probabilistic classifications from transformation rule lists. In: Proceedings of Joint Sigdat Conference on Empirical Methods in NLP and Very Large Corpora. Hong Kong University of Science and Technology, Kowloon (2000)
15. Forman, G., Guyon, I., Elisseeff, A.: An extensive empirical study of feature selection metrics for text classification. J. Mach. Learn. Res. 3, 1289–1305 (2003)
16. Higgins, D.: A transformation-based approach to argument labeling. In: Ng, H.T., Riloff, E. (eds.) HLT-NAACL 2004 Workshop: Eighth Conference on Computational Natural Language Learning (CoNLL-2004), pp. 114–117. Association for Computational Linguistics, Boston (2004)
17. Ho, T.K.: The random subspace method for constructing decision forests. IEEE Trans. Pattern Anal. Mach. Intell. 20(8), 832–844 (1998). doi:10.1109/34.709601
18. Hwang, Y.S., Chung, H.J., Rim, H.C.: Weighted probabilistic sum model based on decision tree decomposition for text chunking. Int. J. Comput. Process. Orient. Lang. 16(1), 1–20 (2003)
19. Kudo, T., Matsumoto, Y.: Chunking with support vector machines. In: Proceedings of the NAACL-2001 (2001)
20. Liu, F., Shi, Q., Tao, J.: Tree-guided transformation-based homograph disambiguation in Mandarin TTS system. In: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 4657–4660, Cambridge (2008)
21. Milidiú, R.L., Duarte, J.C., dos Santos, C.N.: Evolutionary TBL template generation. J. Braz. Comput. Soc. 13(4), 39–50 (2007)
22. Milidiú, R.L., Duarte, J.C., dos Santos, C.N.: TBL template selection: an evolutionary approach. In: Proceedings of Conference of the Spanish Association for Artificial Intelligence—CAEPIA, Salamanca (2007)
23. Milidiú, R.L., dos Santos, C.N., Duarte, J.C.: Phrase chunking using entropy guided transformation learning. In: Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies—ACL-08: HLT, Columbus (2008)
24. Mitchell, T.M.: Machine Learning. McGraw-Hill, New York (1997)
25. Ngai, G., Florian, R.: Transformation-based learning in the fast lane. In: Proceedings of North American ACL, pp. 40–47 (2001)
26. Quinlan, J.R.: Induction of decision trees. Mach. Learn. 1(1), 81–106 (1986). doi:10.1023/A:1022643204877
27. Quinlan, J.R.: C4.5: Programs for Machine Learning. Morgan Kaufmann, San Francisco (1993)
28. Ramshaw, L., Marcus, M.: Exploring the statistical derivation of transformational rule sequences for part-of-speech tagging. In: Proceedings of the Balancing Act-Workshop on Combining Symbolic and Statistical Approaches to Language, pp. 86–95. Association for Computational Linguistics, Toulouse (1994). http://www.citeseer.ist.psu.edu/article/ramshaw94exploring.html
29. Ramshaw, L., Marcus, M.: Text chunking using transformation-based learning. In: Armstrong, S., Church, K., Isabelle, P., Manzi, S., Tzoukermann, E., Yarowsky, D. (eds.) Natural Language Processing Using Very Large Corpora. Kluwer, Dordrecht (1999)
30. Su, J., Zhang, H.: A fast decision tree learning algorithm. In: Proceedings of the Twenty-First National Conference on Artificial Intelligence—AAAI (2006)
31. Surdeanu, M., Johansson, R., Meyers, A., Màrquez, L., Nivre, J.: The CoNLL 2008 shared task on joint parsing of syntactic and semantic dependencies. In: CoNLL 2008: Proceedings of the Twelfth Conference on Computational Natural Language Learning, pp. 159–177. Coling 2008 Organizing Committee, Manchester (2008). http://www.aclweb.org/anthology/W08-2121

Author information

Authors and Affiliations

Research, IBM Research Brazil, Av. Pasteur 146, Rio de Janeiro, RJ, 22296-903, Brazil
Cícero Nogueira dos Santos
Departamento de Informática (DI), Pontifícia Universidade Católica do Rio de Janeiro (PUC-Rio), Rio de Janeiro, RJ, Brazil
Ruy Luiz Milidiú

About this chapter

Cite this chapter

dos Santos, C.N., Milidiú, R.L. (2012). Entropy Guided Transformation Learning. In: Entropy Guided Transformation Learning: Algorithms and Applications. SpringerBriefs in Computer Science. Springer, London. https://doi.org/10.1007/978-1-4471-2978-3_2

DOI: https://doi.org/10.1007/978-1-4471-2978-3_2
Published: 15 March 2012
Publisher Name: Springer, London
Print ISBN: 978-1-4471-2977-6
Online ISBN: 978-1-4471-2978-3
eBook Packages: Computer Science, Computer Science (R0)