
Short text matching model with multiway semantic interaction based on multi-granularity semantic embedding

Abstract

Short text matching is a fundamental task in natural language processing, playing an important role in information retrieval, question answering, paraphrase identification, and related applications. However, because little usable information remains after Chinese short texts are segmented into words, the available text must be exploited as fully as possible. In this paper, we propose a sentence matching model with multiway semantic interaction based on multi-granularity semantic embedding (MSIM) to address the problem of Chinese short text matching. First, each sentence pair is represented by multi-granularity embeddings: character embeddings based on one-hot vectors and word embeddings obtained from a pre-trained model. We then apply an attention mechanism to the character embeddings to weight individual characters. To capture sufficient semantic features, we process the short sentence pairs in three ways: we match each time step of the two encoded sentences, apply average and maximum pooling to the matching results, and perform deep interaction between each time-step representation and the attention representation. Finally, a BiLSTM aggregates the matching results into a fixed-length matching vector, and a fully connected layer makes the final decision. Our method is evaluated on the Chinese datasets CCKS and ATEC. Experimental results demonstrate that it makes full use of Chinese short text information and outperforms competing methods.
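The pipeline summarized above can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the dimensions, the function names (`attention_weight`, `multiway_match`), and the additive attention scoring are all illustrative assumptions, and the BiLSTM aggregation and fully connected classifier are omitted. It only shows the general shape of attention-weighted character embeddings followed by cross-sentence matching with average and maximum pooling.

```python
import numpy as np

# Hypothetical dimensions: a sentence of n characters, embedding size d.
rng = np.random.default_rng(0)
n, d = 6, 8
chars_a = rng.standard_normal((n, d))  # character embeddings of sentence A
chars_b = rng.standard_normal((n, d))  # character embeddings of sentence B

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_weight(chars, w):
    """Weight each character embedding by a learned scoring vector w."""
    scores = softmax(chars @ w)          # (n,) importance of each character
    return chars * scores[:, None]       # re-weighted character embeddings

def multiway_match(a, b):
    """Match each time step of A against B, then pool to a fixed-size vector."""
    sim = softmax(a @ b.T, axis=1)       # soft alignment of A's steps to B's
    aligned = sim @ b                    # B summarized for each time step of A
    # Interaction features per time step: raw, aligned, difference, product.
    interact = np.concatenate([a, aligned, a - aligned, a * aligned], axis=1)
    avg_pool = interact.mean(axis=0)     # average pooling over time steps
    max_pool = interact.max(axis=0)      # maximum pooling over time steps
    return np.concatenate([avg_pool, max_pool])

w = rng.standard_normal(d)               # stand-in for a learned attention vector
vec = multiway_match(attention_weight(chars_a, w), attention_weight(chars_b, w))
print(vec.shape)  # fixed-length matching vector: (4*d*2,) = (64,)
```

In the model described by the paper, a vector like `vec` would be further aggregated by a BiLSTM and passed through a fully connected layer to produce the match decision.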



Acknowledgements

This work is supported by the National Natural Science Foundation of China under Project 61673079, cooperation projects between universities in Chongqing and institutes affiliated with the Chinese Academy of Sciences (HZ2021018), and the Innovation Research Group of Universities in Chongqing (CXQT20016).

Author information

Corresponding author

Correspondence to Yang Luo.



About this article


Cite this article

Tang, X., Luo, Y., Xiong, D. et al. Short text matching model with multiway semantic interaction based on multi-granularity semantic embedding. Appl Intell (2022). https://doi.org/10.1007/s10489-022-03410-w


Keywords

  • Natural language processing
  • Chinese short text matching
  • Deep learning
  • Semantic interaction