Attentional Transformer Networks for Target-Oriented Sentiment Classification

  • Jianing Tong
  • Wei Chen
  • Zhihua Wei
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 1120)


Text classification covers both sentence-level sentiment classification and target-based sentiment classification. Target-based sentiment classification aims to determine the sentiment polarity of a given sentence toward each opinion target (aspect) it mentions. Recurrent neural networks (RNNs) are well suited to this task and have achieved state-of-the-art (SOTA) performance so far; most previous work models the target and context words with an RNN combined with an attention mechanism. However, RNNs are hard to parallelize during training and consume a large amount of memory. Moreover, for this task, long-term memory can cause confusion. For example, in "the food is delicious but the service is frustrating", the model should judge the food as good and the service as bad, but long-range memory may attach a sentiment word to the wrong target. A convolutional neural network (CNN) is valuable in this situation because it learns local n-gram information, which an RNN does not capture well. To address these issues, this paper proposes an Attention Transformer Network (ATNet). Our model uses an attention mechanism and a transformer component to generate target-oriented representations, along with CNN layers to extract n-gram information. On open benchmark datasets, the proposed models achieve state-of-the-art results of 70.3%, 72.1% and 83.4% on three benchmarks. We also apply pretrained BERT in the encoder and obtain SOTA results. Extensive comparison experiments demonstrate the effectiveness of our method.
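To make the two ingredients named in the abstract concrete, the following is a minimal sketch, not the authors' ATNet implementation. It assumes toy word vectors given as plain Python lists, and the function names (`target_attention`, `ngram_conv_maxpool`) are hypothetical. It shows scaled dot-product attention of a single target vector over the context words, and one convolution filter slid over n-gram windows followed by max-pooling.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def target_attention(target, context):
    """Scaled dot-product attention with the target as the only query.

    Returns the attention weights over the context words and the
    weighted sum of the context vectors (a target-oriented summary).
    """
    scale = math.sqrt(len(target))
    weights = softmax([dot(target, w) / scale for w in context])
    dim = len(context[0])
    pooled = [sum(wt * w[i] for wt, w in zip(weights, context))
              for i in range(dim)]
    return weights, pooled


def ngram_conv_maxpool(context, kernel, n):
    """Apply one filter to every window of n consecutive words, then max-pool.

    `kernel` is a list of n vectors, one per position in the window.
    """
    feats = []
    for start in range(len(context) - n + 1):
        window = context[start:start + n]
        feats.append(sum(dot(k, w) for k, w in zip(kernel, window)))
    return max(feats)
```

In this sketch the attention weights indicate which context words matter for a given target, while the n-gram filter responds only to local word windows, which is the property the abstract attributes to the CNN layers.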


Target-based sentimental analysis · Attention mechanism · Transformer component



This work is sponsored by the National Key Research and Development Project (No. 213), the National Natural Science Foundation of China (No. 61573259, No. 61673299, No. 61673301, No. 61573255) and the Special Project of the Ministry of Public Safety (No. 20170004). It is supported by the Key Laboratory of Information Network Safety, Ministry of Public Safety (No. C18608), and by the Shanghai Health and Family Planning Commission Chinese Medicine Science and Technology Innovation Project (ZYKC201702005).



Copyright information

© Springer Nature Singapore Pte Ltd. 2019

Authors and Affiliations

  1. Tongji University, Shanghai, China
  2. Shanghai Institute of Criminal Science and Technology, Shanghai, China
