Abstract
Semantic sentence matching, also known as text-similarity computation, is one of the most important problems in natural language processing. Most existing deep models rely on neural networks equipped with attention mechanisms. In this paper, we present a deep architecture for matching two Chinese sentences that relies only on alignment rather than a long short-term memory network: once an attention mechanism is employed to obtain interaction information between sentence pairs, the model becomes more lightweight and simple. Meanwhile, to capture sufficient semantic features, in addition to max pooling and average pooling we employ a pooling operation named attention-pooling to aggregate information from the whole sentence; the final matching score is then produced by a multilayer perceptron classifier. Experiments on the ATEC-NLP dataset demonstrate the effectiveness of our approach.
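To make the attention-pooling step concrete, here is a minimal NumPy sketch of one common formulation: each hidden state is scored through a learned projection, the scores are normalized with a softmax, and the sentence vector is the resulting weighted average. The shapes, the `tanh` scoring function, and the parameter names `W` and `w` are illustrative assumptions, not the paper's exact parameterization.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over the last axis.
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def attention_pooling(H, W, w):
    """Aggregate a sequence of hidden states into one sentence vector.

    H : (T, d) hidden states for T tokens
    W : (d, d) projection matrix (learned during training in practice)
    w : (d,)   scoring vector (learned during training in practice)
    Returns a (d,) pooled representation.
    """
    scores = np.tanh(H @ W) @ w      # (T,) unnormalized attention scores
    alpha = softmax(scores)          # (T,) attention weights summing to 1
    return alpha @ H                 # (d,) attention-weighted average

rng = np.random.default_rng(0)
T, d = 5, 8
H = rng.standard_normal((T, d))
v = attention_pooling(H, rng.standard_normal((d, d)), rng.standard_normal(d))
```

In a full model, this pooled vector would typically be concatenated with the max-pooled and average-pooled vectors before the multilayer perceptron classifier.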
Acknowledgments
We thank Ant Financial for allowing us to use the dataset from Ant Financial Artificial Competition for experiments. An earlier version of this paper was presented at the International Conference on International Symposium on Artificial Intelligence and Robotics.
Cite this article
Lai, H., Tao, Y., Wang, C. et al. Bi-directional attention comparison for semantic sentence matching. Multimed Tools Appl 79, 14609–14624 (2020). https://doi.org/10.1007/s11042-018-7063-5