Learning to share by masking the non-shared for multi-domain sentiment classification

  • Original Article
  • International Journal of Machine Learning and Cybernetics

Abstract

Multi-domain sentiment classification deals with the scenario where labeled data exists for multiple domains but is insufficient for training effective sentiment classifiers that work across domains. Fully exploiting the sentiment knowledge shared across domains is therefore crucial for real-world applications. While many existing works try to extract domain-invariant features in high-dimensional space, such models do not explicitly distinguish between shared and private features at the text level, which limits their interpretability. Based on the assumption that removing domain-related tokens from texts helps improve their domain invariance, we instead first transform the original sentences to be domain-agnostic. To this end, we propose the BERTMasker model, which explicitly masks domain-related words in texts, learns domain-invariant sentiment features from the resulting domain-agnostic texts, and uses the masked words to form domain-aware sentence representations. Experiments on the benchmark multi-domain sentiment classification datasets demonstrate the effectiveness of the proposed model, which improves accuracy in the multi-domain and cross-domain settings by 1.91% and 3.31%, respectively. Further analysis of the masking shows that removing domain-related, sentiment-irrelevant tokens decreases the domain separability of the texts, degrading the performance of a BERT-based domain classifier by over 12%.
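The masking idea in the abstract can be illustrated with a small sketch. This is not the BERTMasker implementation: the abstract indicates that the model itself decides which words to mask, whereas the sketch below assumes a fixed, hand-written set of domain-related words. The function name `split_views`, the example sentence, and the word list are all hypothetical.

```python
from typing import List, Set, Tuple

MASK = "[MASK]"  # BERT's mask token, used here as a plain string


def split_views(tokens: List[str], domain_words: Set[str]) -> Tuple[List[str], List[str]]:
    """Split a sentence into a domain-agnostic view and its masked words.

    Assumption: domain_words is a fixed lexicon. In the paper the
    mask is produced by the model, not looked up.
    """
    shared: List[str] = []   # sentence with domain-related words masked
    masked: List[str] = []   # the words that were masked out
    for tok in tokens:
        if tok.lower() in domain_words:
            shared.append(MASK)
            masked.append(tok)
        else:
            shared.append(tok)
    return shared, masked


# Hypothetical review sentence from an electronics domain.
tokens = "the battery life of this laptop is great".split()
shared, masked = split_views(tokens, {"battery", "laptop"})
print(" ".join(shared))
print(masked)
```

In this toy setting, the masked sentence would feed the shared (domain-invariant) sentiment encoder, while the list of masked words would be the input from which a domain-aware representation is built.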

Notes

  1. http://pfliu.com/paper/adv-mtl.html.

  2. https://github.com/huggingface/transformers.

  3. https://github.com/amueller/word_cloud.

Author information

Corresponding author

Correspondence to Bing Qin.

About this article

Cite this article

Yuan, J., Zhao, Y. & Qin, B. Learning to share by masking the non-shared for multi-domain sentiment classification. Int. J. Mach. Learn. & Cyber. 13, 2711–2724 (2022). https://doi.org/10.1007/s13042-022-01556-0

  • DOI: https://doi.org/10.1007/s13042-022-01556-0
