Abstract
Text classification has emerged as an important research area in natural language processing (NLP) over the past few years. Unlike formal documents and paragraphs, short texts are more ambiguous because they lack contextual information and suffer from data sparsity, which poses a great challenge to traditional classification methods. To address this problem, conceptual knowledge has been introduced to enrich short texts. However, existing methods assume that all knowledge is equally important, which hinders discrimination between short texts, and they also introduce knowledge noise that degrades classification performance. To measure the importance of concepts to short texts, this paper introduces an attention mechanism. Text-Relevant-Concept (T-RC) attention resolves the ambiguity of concepts and chooses the most appropriate sense to align with the short text, while Concept-Relevant-Concept (C-RC) attention handles the conceptual hierarchy and the relative importance of each concept. We propose a model Combining Knowledge with Attention Neural Networks (CK-ANN). Experiments show that CK-ANN outperforms state-of-the-art methods on text classification benchmarks, demonstrating the effectiveness of our approach.
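The two attention signals described above can be illustrated with a minimal sketch. This is not the paper's implementation; it only assumes that a short text and its retrieved concepts are available as dense embeddings, and approximates T-RC as text-to-concept dot-product attention and C-RC as a concept's mean similarity to the other concepts.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def concept_attention(text_vec, concept_vecs):
    """Weight retrieved concepts by relevance to the short text (T-RC)
    and by their importance within the concept set (C-RC), then pool.

    text_vec:     (d,)   embedding of the short text
    concept_vecs: (k, d) embeddings of k candidate concepts
    Returns the text embedding concatenated with a concept summary, (2d,).
    """
    # T-RC: how relevant each concept is to this particular text
    trc = softmax(concept_vecs @ text_vec)
    # C-RC: a concept's relative importance, approximated here by its
    # mean similarity to the other retrieved concepts
    sim = concept_vecs @ concept_vecs.T
    crc = softmax(sim.mean(axis=1))
    # combine both attention signals and pool the concepts
    alpha = softmax(trc + crc)
    concept_summary = alpha @ concept_vecs  # (d,)
    # enrich the text representation with the weighted concept knowledge
    return np.concatenate([text_vec, concept_summary])
```

Down-weighting irrelevant concepts this way is what lets the model suppress knowledge noise instead of treating every retrieved concept as equally informative.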
Acknowledgement
This research was supported by NSFC (Grant No. 61877051). Li Li is the corresponding author.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Li, W., Li, L. (2021). Combining Knowledge with Attention Neural Networks for Short Text Classification. In: Qiu, H., Zhang, C., Fei, Z., Qiu, M., Kung, S.Y. (eds) Knowledge Science, Engineering and Management. KSEM 2021. Lecture Notes in Computer Science, vol 12816. Springer, Cham. https://doi.org/10.1007/978-3-030-82147-0_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-82146-3
Online ISBN: 978-3-030-82147-0
eBook Packages: Computer Science (R0)