Abstract
In Chinese semantic sentence matching, existing models use a single architecture both to distinguish semantic differences and to extract interaction information. This not only introduces a large amount of redundant information but also makes the model heavier and more complicated. To address this, this paper presents SNMA, a deep architecture in which the comparison and interaction modules are separated. SNMA uses a Siamese network to extract context information and, separately, a multi-head attention mechanism to extract interaction information from sentence pairs. Experimental results on four recent Chinese sentence matching datasets demonstrate the effectiveness of our approach.
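The separation described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify layer types, pooling, or how the two modules are combined, so the single tanh layer used as the shared encoder, the max pooling, the choice to run attention on the raw embeddings, and all names (`siamese_encode`, `multi_head_attention`, `match_features`, `params`) are assumptions made for illustration only.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def siamese_encode(tokens, W_enc):
    """Shared-weight encoder: the same W_enc is applied to either sentence.

    Assumption: one tanh layer stands in for whatever encoder SNMA uses.
    """
    return np.tanh(tokens @ W_enc)


def multi_head_attention(query, context, Wq, Wk, Wv, num_heads):
    """Scaled dot-product attention from `query` positions over `context`."""
    d_model = Wq.shape[1]
    d_head = d_model // num_heads

    def split(x):
        # (length, d_model) -> (heads, length, d_head)
        return x.reshape(x.shape[0], num_heads, d_head).transpose(1, 0, 2)

    q, k, v = split(query @ Wq), split(context @ Wk), split(context @ Wv)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)  # (heads, len_q, len_c)
    weights = softmax(scores, axis=-1)
    heads = weights @ v                                   # (heads, len_q, d_head)
    out = heads.transpose(1, 0, 2).reshape(query.shape[0], d_model)
    return out, weights


def match_features(a, b, params, num_heads=2):
    """Build comparison and interaction features in two separate branches."""
    # Comparison branch (assumed form): pooled Siamese encodings,
    # compared by absolute difference and element-wise product.
    va = siamese_encode(a, params["W_enc"]).max(axis=0)
    vb = siamese_encode(b, params["W_enc"]).max(axis=0)
    comparison = np.concatenate([np.abs(va - vb), va * vb])

    # Interaction branch (assumed form): each sentence attends over the other.
    attn = (params["Wq"], params["Wk"], params["Wv"], num_heads)
    a2b, _ = multi_head_attention(a, b, *attn)
    b2a, _ = multi_head_attention(b, a, *attn)
    interaction = np.concatenate([a2b.mean(axis=0), b2a.mean(axis=0)])

    return np.concatenate([comparison, interaction])


# Usage with random stand-ins for word embeddings (5 and 7 tokens, 8 dims).
rng = np.random.default_rng(0)
d_emb, d_model = 8, 4
params = {name: rng.normal(size=(d_emb, d_model))
          for name in ("W_enc", "Wq", "Wk", "Wv")}
sent_a = rng.normal(size=(5, d_emb))
sent_b = rng.normal(size=(7, d_emb))
features = match_features(sent_a, sent_b, params)
print(features.shape)
```

In this sketch the two branches share no parameters, so either one could be resized or replaced without touching the other. A classifier over the concatenated features would complete the model and is omitted here.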
Acknowledgements
Thanks to Yinxiang Xu for valuable discussion and to Xiaoning Song for helping us to polish the paper during the revision process.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Chen, Q., Sun, J. & Zhao, Y. A Novel Architecture with Separate Comparison and Interaction Modules for Chinese Semantic Sentence Matching. Neural Process Lett 53, 3677–3692 (2021). https://doi.org/10.1007/s11063-021-10561-3
DOI: https://doi.org/10.1007/s11063-021-10561-3
Keywords
- Chinese sentence matching
- Multi-head attention mechanism
- Comparison and interaction modules
- Siamese network