Abstract
This chapter details the entropy guided transformation learning (ETL) algorithm [8, 23]. ETL is an effective way to overcome the main bottleneck of transformation based learning (TBL): the construction of good template sets. To better motivate and describe ETL, we first provide an overview of the TBL algorithm in Sect. 2.1. Next, in Sect. 2.2, we explain why the manual construction of template sets is a bottleneck for TBL. Then, in Sect. 2.3, we detail the entropy guided template generation strategy employed by ETL, and also present strategies to handle high dimensional features and to include the current classification feature in the generated templates. In Sects. 2.4–2.6, we present some variations on the basic ETL strategy. Finally, in Sect. 2.7, we discuss related work.
References
Banfield, R.E., Hall, L.O., Bowyer, K.W., Kegelmeyer, W.P.: Ensemble diversity measures and their application to thinning. Inf. Fusion 6(1), 49–62 (2005)
Breiman, L.: Random forests. Mach. Learn. 45(1), 5–32 (2001). doi:10.1023/A:1010933404324
Brill, E.: Transformation-based error-driven learning and natural language processing: a case study in part-of-speech tagging. Comput. Linguist. 21(4), 543–565 (1995)
Carberry, S., Vijay-Shanker, K., Wilson, A., Samuel, K.: Randomized rule selection in transformation-based learning: a comparative study. Nat. Lang. Eng. 7(2), 99–116 (2001). doi:10.1017/S1351324901002662
Corston-Oliver, S., Gamon, M.: Combining decision trees and transformation-based learning to correct transferred linguistic representations. In: Proceedings of the Ninth Machine Translation Summit, pp. 55–62. Association for Machine Translation in the Americas, New Orleans (2003)
Curran, J.R., Wong, R.K.: Formalisation of transformation-based learning. In: Proceedings of the ACSC, pp. 51–57, Canberra (2000)
Dash, M., Liu, H.: Feature selection for classification. Intell. Data Anal. 1, 131–156 (1997)
dos Santos, C.N., Milidiú, R.L.: Entropy guided transformation learning. Technical Report 29/07, Departamento de Informática, PUC-Rio (2007). http://bib-di.inf.puc-rio.br/techreports/2007.htm
dos Santos, C.N., Milidiú, R.L.: Probabilistic classifications with TBL. In: Proceedings of Eighth International Conference on Intelligent Text Processing and Computational Linguistics—CICLing, pp. 196–207, Mexico (2007)
dos Santos, C.N., Oliveira, C.: Constrained atomic term: widening the reach of rule templates in transformation based learning. In: Portuguese Conference on Artificial Intelligence—EPIA, pp. 622–633 (2005)
Elming, J.: Transformation-based corrections of rule-based MT. In: Proceedings of the EAMT 11th Annual Conference, Oslo (2006)
Florian, R.: Named entity recognition as a house of cards: classifier stacking. In: Proceedings of CoNLL-2002, pp. 175–178, Taipei (2002)
Florian, R.: Transformation based learning and data-driven lexical disambiguation: syntactic and semantic ambiguity resolution. Ph.D. Thesis, The Johns Hopkins University (2002)
Florian, R., Henderson, J.C., Ngai, G.: Coaxing confidences from an old friend: probabilistic classifications from transformation rule lists. In: Proceedings of Joint Sigdat Conference on Empirical Methods in NLP and Very Large Corpora. Hong Kong University of Science and Technology, Kowloon (2000)
Forman, G., Guyon, I., Elisseeff, A.: An extensive empirical study of feature selection metrics for text classification. J. Mach. Learn. Res. 3, 1289–1305 (2003)
Higgins, D.: A transformation-based approach to argument labeling. In: Ng, H.T., Riloff, E. (eds.) HLT-NAACL 2004 Workshop: Eighth Conference on Computational Natural Language Learning (CoNLL-2004), pp. 114–117. Association for Computational Linguistics, Boston (2004)
Ho, T.K.: The random subspace method for constructing decision forests. IEEE Trans. Pattern Anal. Mach. Intell. 20(8), 832–844 (1998). doi:10.1109/34.709601
Hwang, Y.S., Chung, H.J., Rim, H.C.: Weighted probabilistic sum model based on decision tree decomposition for text chunking. Int. J. Comput. Process. Orient. Lang. 16(1), 1–20 (2003)
Kudo, T., Matsumoto, Y.: Chunking with support vector machines. In: Proceedings of the NAACL-2001 (2001)
Liu, F., Shi, Q., Tao, J.: Tree-guided transformation-based homograph disambiguation in Mandarin TTS system. In: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 4657–4660, Cambridge (2008)
Milidiú, R.L., Duarte, J.C., dos Santos, C.N.: Evolutionary TBL template generation. J. Braz. Comput. Soc. 13(4), 39–50 (2007)
Milidiú, R.L., Duarte, J.C., dos Santos, C.N.: TBL template selection: an evolutionary approach. In: Proceedings of Conference of the Spanish Association for Artificial Intelligence—CAEPIA, Salamanca (2007)
Milidiú, R.L., dos Santos, C.N., Duarte, J.C.: Phrase chunking using entropy guided transformation learning. In: Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies—ACL-08: HLT, Columbus (2008)
Mitchell, T.M.: Machine Learning. McGraw-Hill, New York (1997)
Ngai, G., Florian, R.: Transformation-based learning in the fast lane. In: Proceedings of North American ACL, pp. 40–47 (2001)
Quinlan, J.R.: Induction of decision trees. Mach. Learn. 1(1), 81–106 (1986). doi:10.1023/A:1022643204877
Quinlan, J.R.: C4.5: Programs for Machine Learning. Morgan Kaufmann, San Francisco (1993)
Ramshaw, L., Marcus, M.: Exploring the statistical derivation of transformational rule sequences for part-of-speech tagging. In: Proceedings of the Balancing Act-Workshop on Combining Symbolic and Statistical Approaches to Language, pp. 86–95. Association for Computational Linguistics, Toulouse (1994). http://www.citeseer.ist.psu.edu/article/ramshaw94exploring.html
Ramshaw, L., Marcus, M.: Text chunking using transformation-based learning. In: Armstrong, S., Church, K., Isabelle, P., Manzi, S., Tzoukermann, E., Yarowsky, D. (eds.) Natural Language Processing Using Very Large Corpora. Kluwer, Dordrecht (1999)
Su, J., Zhang, H.: A fast decision tree learning algorithm. In: Proceedings of the Twenty-First National Conference on Artificial Intelligence—AAAI (2006)
Surdeanu, M., Johansson, R., Meyers, A., Màrquez, L., Nivre, J.: The CoNLL 2008 shared task on joint parsing of syntactic and semantic dependencies. In: CoNLL 2008: Proceedings of the Twelfth Conference on Computational Natural Language Learning, pp. 159–177. Coling 2008 Organizing Committee, Manchester (2008). http://www.aclweb.org/anthology/W08-2121
© 2012 The Author(s)
About this chapter
Cite this chapter
dos Santos, C.N., Milidiú, R.L. (2012). Entropy Guided Transformation Learning. In: Entropy Guided Transformation Learning: Algorithms and Applications. SpringerBriefs in Computer Science. Springer, London. https://doi.org/10.1007/978-1-4471-2978-3_2
DOI: https://doi.org/10.1007/978-1-4471-2978-3_2
Publisher Name: Springer, London
Print ISBN: 978-1-4471-2977-6
Online ISBN: 978-1-4471-2978-3
eBook Packages: Computer Science, Computer Science (R0)