
Recurrent networks with attention and convolutional networks for sentence representation and classification

Abstract

In this paper, we propose a bi-attention mechanism, a multi-layer attention mechanism, and a text representation and classification model that combines attention with a convolutional neural network (ACNN). The bi-attention uses two attention mechanisms to learn two context vectors: a forward RNN with attention learns the forward context vector \(\overrightarrow {\mathbf {c}}\), a backward RNN with attention learns the backward context vector \(\overleftarrow {\mathbf {c}}\), and the two are concatenated to form the context vector c. The multi-layer attention is a stack of bi-attention layers. In the ACNN, the context vector c is obtained by the bi-attention, a convolution operation is then performed on c, and a max-pooling operation reduces the dimensionality, converting the text into a low-dimensional sentence vector m. Finally, a softmax classifier is used for text classification. We evaluate our model on 8 benchmark text classification datasets, and it achieves better or comparable performance relative to state-of-the-art methods.
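To make the pipeline described above concrete, the following is a minimal sketch of how the bi-attention and the ACNN classifier could be wired together in PyTorch. The layer sizes, the additive attention scoring, the choice of GRU cells, and all module names are illustrative assumptions rather than the authors' exact configuration.

```python
# Minimal sketch of the ACNN pipeline from the abstract (assumed configuration,
# not the authors' exact model): bi-attention -> context vector c ->
# convolution -> max-pooling -> sentence vector m -> softmax classifier.
import torch
import torch.nn as nn
import torch.nn.functional as F

class BiAttention(nn.Module):
    """Forward and backward RNNs, each with its own attention over hidden states."""
    def __init__(self, embed_dim, hidden_dim):
        super().__init__()
        self.fwd_rnn = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        self.bwd_rnn = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        self.fwd_attn = nn.Linear(hidden_dim, 1)
        self.bwd_attn = nn.Linear(hidden_dim, 1)

    @staticmethod
    def _attend(states, scorer):
        # states: (batch, seq_len, hidden); attention weights sum to 1 over seq_len
        weights = F.softmax(scorer(states), dim=1)            # (batch, seq_len, 1)
        return (weights * states).sum(dim=1)                  # weighted context vector

    def forward(self, x):
        # x: (batch, seq_len, embed_dim)
        fwd_states, _ = self.fwd_rnn(x)                       # forward pass over the sequence
        bwd_states, _ = self.bwd_rnn(torch.flip(x, dims=[1])) # backward pass (reversed input)
        c_fwd = self._attend(fwd_states, self.fwd_attn)       # forward context vector
        c_bwd = self._attend(bwd_states, self.bwd_attn)       # backward context vector
        return torch.cat([c_fwd, c_bwd], dim=-1)              # concatenated context vector c

class ACNN(nn.Module):
    def __init__(self, vocab_size, embed_dim=300, hidden_dim=128,
                 num_filters=100, num_classes=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.bi_attention = BiAttention(embed_dim, hidden_dim)
        # Convolution over the context vector c, then max-pooling down to
        # a low-dimensional sentence vector m.
        self.conv = nn.Conv1d(1, num_filters, kernel_size=3, padding=1)
        self.classifier = nn.Linear(num_filters, num_classes)

    def forward(self, tokens):
        x = self.embedding(tokens)                            # (batch, seq_len, embed_dim)
        c = self.bi_attention(x)                              # (batch, 2 * hidden_dim)
        features = self.conv(c.unsqueeze(1))                  # (batch, num_filters, 2 * hidden_dim)
        m = F.max_pool1d(features, features.size(-1)).squeeze(-1)  # sentence vector m
        return F.log_softmax(self.classifier(m), dim=-1)      # softmax classification
```

A forward pass on a batch of token-id tensors of shape (batch, seq_len) returns class log-probabilities, and training would proceed with a negative log-likelihood loss, mirroring the softmax classification step described in the abstract.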



Acknowledgements

This research was supported by the National Natural Science Foundation of China [NSFC61572005], the Fundamental Research Funds for the Central Universities [2016JBM080], and Key Projects of Science and Technology Research of Hebei Province Higher Education [ZD2-017304].

Author information


Corresponding author

Correspondence to Tengfei Liu.


About this article

Cite this article

Liu, T., Yu, S., Xu, B. et al. Recurrent networks with attention and convolutional networks for sentence representation and classification. Appl Intell 48, 3797–3806 (2018). https://doi.org/10.1007/s10489-018-1176-4
