Abstract
Representing words as vectors in a low-dimensional space has proved very effective in many NLP tasks. However, such representations perform poorly on rare and unseen words. In this paper, we propose to leverage the knowledge in a semantic dictionary, combined with morphological information, to build an enhanced vector space. We achieve an improvement of 2.3% over the state-of-the-art HeidelTime system in temporal expression recognition, and obtain large gains in other named entity recognition (NER) tasks. The semantic dictionary HowNet alone also shows promising results in computing lexical similarity.
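The core idea of backing off to dictionary knowledge for rare words can be illustrated with a minimal sketch. All names, vectors, and dictionary entries below are hypothetical toy data, not the paper's actual model: the sketch simply shows one plausible scheme in which a word missing from the trained embeddings receives the average of the embeddings of its HowNet-style sememes (primitive senses).

```python
# Hedged sketch: falling back to semantic-dictionary (sememe) information
# for a word with no trained embedding. Toy data throughout; the actual
# model in the paper may combine these signals differently.

def average(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / n for i in range(dim)]

# Hypothetical 3-d sememe embeddings.
sememe_vecs = {
    "human":      [0.9, 0.1, 0.0],
    "occupation": [0.2, 0.8, 0.1],
}

# Hypothetical HowNet-style dictionary: word -> list of sememes.
hownet = {"lawyer": ["human", "occupation"]}

# Trained word embeddings; "lawyer" is treated as unseen here.
word_vecs = {}

def enhanced_vector(word):
    """Use the trained vector if available; otherwise average the
    vectors of the word's dictionary sememes as a back-off."""
    if word in word_vecs:
        return word_vecs[word]
    return average([sememe_vecs[s] for s in hownet[word]])

vec = enhanced_vector("lawyer")  # -> [0.55, 0.45, 0.05]
```

In practice the sememe vectors would themselves be learned jointly with the word vectors (as in the joint character/word and morpheme co-learning approaches cited below), rather than fixed by hand.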
References
Chen, X., Xu, L., Liu, Z., Sun, M., Luan, H.: Joint learning of character and word embeddings. In: Proceedings of IJCAI, pp. 1236–1242 (2015)
Cui, Q., Gao, B., Bian, J., Qiu, S., Dai, H., Liu, T.-Y.: KNET: a general framework for learning word embedding using morphological knowledge. ACM Trans. Inf. Syst. (TOIS) 34(1), 4 (2015)
Dong, Z., Dong, Q.: HowNet and the Computation of Meaning. World Scientific, Beijing (2006)
Fan, R.-E., Chang, K.-W., Hsieh, C.-J., Wang, X.-R., Lin, C.-J.: LIBLINEAR: a library for large linear classification. J. Mach. Learn. Res. 9, 1871–1874 (2008)
Li, H., Strötgen, J., Zell, J., Gertz, M.: Chinese temporal tagging with HeidelTime. In: EACL, pp. 133–137 (2014)
Li, Y., Li, W., Sun, F., Li, S.: Component-enhanced Chinese character embeddings (2015). arXiv preprint: arXiv:1508.06669
Liu, Q., Li, S.: Word similarity computing based on How-net. Comput. Linguist. Chin. Lang. Process. 7(2), 59–76 (2002)
Luong, T., Socher, R., Manning, C.D.: Better word representations with recursive neural networks for morphology. In: CoNLL, pp. 104–113. Citeseer (2013)
Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. In: Advances in Neural Information Processing Systems, pp. 3111–3119 (2013)
Peng, N., Dredze, M.: Named entity recognition for Chinese social media with jointly trained embeddings. In: Proceedings of EMNLP (2015)
Qiu, S., Cui, Q., Bian, J., Gao, B., Liu, T.-Y.: Co-learning of word representations and morpheme representations. In: COLING, pp. 141–150 (2014)
Sun, Y., Lin, L., Yang, N., Ji, Z., Wang, X.: Radical-enhanced Chinese character embedding. In: Loo, C.K., Yap, K.S., Wong, K.W., Teoh, A., Huang, K. (eds.) ICONIP 2014. LNCS, vol. 8835, pp. 279–286. Springer, Heidelberg (2014). doi:10.1007/978-3-319-12640-1_34
Verhagen, M., Sauri, R., Caselli, T., Pustejovsky, J.: SemEval-2010 task 13: TempEval-2. In: Proceedings of the 5th International Workshop on Semantic Evaluation, pp. 57–62. Association for Computational Linguistics (2010)
Acknowledgement
This work is supported by the National Natural Science Foundation of China (61371129), the National Key Basic Research Program of China (2014CB340504), the Key Program of the Social Science Foundation of China (12&ZD227), and the Opening Project of the Beijing Key Laboratory of Internet Culture and Digital Dissemination Research (ICDD201402).
Copyright information
© 2016 Springer International Publishing AG
Cite this paper
Li, W., Wu, Y., Lv, X. (2016). Improving Word Vector with Prior Knowledge in Semantic Dictionary. In: Lin, C.-Y., Xue, N., Zhao, D., Huang, X., Feng, Y. (eds.) Natural Language Understanding and Intelligent Applications (ICCPOL/NLPCC 2016). Lecture Notes in Computer Science, vol. 10102. Springer, Cham. https://doi.org/10.1007/978-3-319-50496-4_38
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-50495-7
Online ISBN: 978-3-319-50496-4