Abstract
In recent years, the taxonomy integration problem has received much attention in the research literature. Many kinds of implicit information embedded in the source taxonomy have been explored to improve integration performance. However, the semantic information embedded in the source taxonomy has not been discussed in past research. In this paper, an enhanced integration approach called SFE (Semantic Feature Expansion) is proposed to exploit the semantic information of category-specific terms. Experiments on two hierarchical Web taxonomies show that integration performance can be further improved with the SFE scheme.
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Yang, CZ., Chen, IX., Hung, CT., Wu, PJ. (2008). Improving Hierarchical Taxonomy Integration with Semantic Feature Expansion on Category-Specific Terms. In: Li, H., Liu, T., Ma, WY., Sakai, T., Wong, KF., Zhou, G. (eds) Information Retrieval Technology. AIRS 2008. Lecture Notes in Computer Science, vol 4993. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-68636-1_22
DOI: https://doi.org/10.1007/978-3-540-68636-1_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-68633-0
Online ISBN: 978-3-540-68636-1
eBook Packages: Computer Science, Computer Science (R0)