Abstract
Abbreviations are common in everyday Chinese. For applications such as information retrieval, we want not only to recognize abbreviations but also to know what they stand for. To cope with the continual emergence of new abbreviations, this paper proposes a novel method that expands an abbreviation to its full name, using the Web as the main information source. Snippets containing full names of an abbreviation are retrieved through a search engine using learned "help words". The snippets are then examined using linguistic heuristics to generate a list of candidates, and the optimal candidate is selected by a kNN-based ranking mechanism. Experiments show that this method achieves satisfactory results.
This paper is supported in part by the Chinese 863 Project (No. 2009AA01Z334) and the Shanghai Municipal Education Commission Foundation for Excellent Young University Teachers.
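The candidate-ranking step summarized in the abstract can be illustrated with a minimal Python sketch. The abstract does not state which features, distance measure, or value of k the authors use, so everything here is an assumption made for illustration: the two-dimensional feature vectors (relative snippet frequency, abbreviation-to-candidate length ratio), the Euclidean distance, the toy labeled examples, and the function names `is_subsequence`, `knn_score`, and `rank_candidates`. The subsequence filter is likewise only one plausible linguistic heuristic, not necessarily the one in the paper.

```python
import math


def is_subsequence(abbr, full):
    """Return True if the characters of abbr occur in full in the same order."""
    it = iter(full)
    return all(ch in it for ch in abbr)


def knn_score(features, labeled, k=3):
    """Fraction of the k nearest labeled examples that are positive (label 1)."""
    nearest = sorted(labeled, key=lambda ex: math.dist(features, ex[0]))[:k]
    return sum(label for _, label in nearest) / len(nearest)


def rank_candidates(abbr, candidates, labeled, k=3):
    """Drop candidates that fail the subsequence heuristic, then sort by kNN score."""
    kept = [(name, feats) for name, feats in candidates if is_subsequence(abbr, name)]
    return sorted(kept, key=lambda c: knn_score(c[1], labeled, k), reverse=True)


# Hypothetical labeled examples: ((snippet frequency, length ratio), label).
demo_labeled = [
    ((0.9, 0.5), 1),
    ((0.8, 0.4), 1),
    ((0.1, 0.9), 0),
    ((0.2, 0.2), 0),
    ((0.15, 0.6), 0),
]

# Hypothetical candidates for the abbreviation "北大", with made-up features.
demo_candidates = [
    ("北方大厦", (0.1, 0.5)),
    ("清华大学", (0.9, 0.5)),  # fails the subsequence filter: no "北"
    ("北京大学", (0.85, 0.5)),
]

if __name__ == "__main__":
    for name, feats in rank_candidates("北大", demo_candidates, demo_labeled):
        print(name, knn_score(feats, demo_labeled))
```

In this sketch the score of a candidate is simply the share of positive examples among its k nearest neighbors; a real system would need features learned from annotated abbreviation and full-name pairs.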
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Liu, H., Chen, Y., Liu, L. (2009). Automatic Expansion of Chinese Abbreviations by Web Mining. In: Deng, H., Wang, L., Wang, F.L., Lei, J. (eds) Artificial Intelligence and Computational Intelligence. AICI 2009. Lecture Notes in Computer Science, vol 5855. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-05253-8_45
DOI: https://doi.org/10.1007/978-3-642-05253-8_45
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-05252-1
Online ISBN: 978-3-642-05253-8
eBook Packages: Computer Science, Computer Science (R0)