Abstract
Multi-domain sentiment classification deals with the scenario where labeled data exists for multiple domains but is insufficient for training effective sentiment classifiers that work across domains. Fully exploiting the sentiment knowledge shared across domains is therefore crucial for real-world applications. Many existing works try to extract domain-invariant features in a high-dimensional space, but such models do not explicitly distinguish between shared and private features at the text level, which limits their interpretability. Based on the assumption that removing domain-related tokens from texts improves their domain invariance, we instead first transform the original sentences to be domain-agnostic. To this end, we propose the BERTMasker model, which explicitly masks domain-related words in texts, learns domain-invariant sentiment features from the resulting domain-agnostic texts, and uses the masked words to form domain-aware sentence representations. Experiments on benchmark multi-domain sentiment classification datasets demonstrate the effectiveness of the proposed model, which improves accuracy in the multi-domain and cross-domain settings by 1.91% and 3.31%, respectively. Further analysis of the masking shows that removing domain-related, sentiment-irrelevant tokens decreases the domain separability of texts, degrading the performance of a BERT-based domain classifier by over 12%.
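The masking idea can be illustrated with a minimal sketch. The abstract does not specify how BERTMasker decides which words are domain-related, so the fixed word set below is a hypothetical stand-in for the learned masking component, and the function name `mask_domain_words` is likewise an assumption made for illustration only. The sketch shows only the split of a sentence into a domain-agnostic view and the list of masked words.

```python
from typing import List, Set, Tuple

MASK_TOKEN = "[MASK]"  # mask token used by BERT vocabularies


def mask_domain_words(
    tokens: List[str], domain_words: Set[str]
) -> Tuple[List[str], List[str]]:
    """Return (domain-agnostic tokens, masked words).

    `domain_words` is a hypothetical, hand-written set; in the paper the
    choice of words to mask is learned by the model, not looked up.
    """
    agnostic_tokens: List[str] = []
    masked_words: List[str] = []
    for token in tokens:
        if token.lower() in domain_words:
            # Hide the domain-related word from the shared view...
            agnostic_tokens.append(MASK_TOKEN)
            # ...but keep it for a separate, domain-aware view.
            masked_words.append(token)
        else:
            agnostic_tokens.append(token)
    return agnostic_tokens, masked_words


tokens = "the battery life of this laptop is great".split()
agnostic, removed = mask_domain_words(tokens, {"battery", "laptop"})
print(" ".join(agnostic))  # the [MASK] life of this [MASK] is great
print(removed)             # ['battery', 'laptop']
```

In this toy example the sentiment word "great" survives in the domain-agnostic view, while the electronics-specific words are set aside for the domain-aware view.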
Cite this article
Yuan, J., Zhao, Y. & Qin, B. Learning to share by masking the non-shared for multi-domain sentiment classification. Int. J. Mach. Learn. & Cyber. 13, 2711–2724 (2022). https://doi.org/10.1007/s13042-022-01556-0
DOI: https://doi.org/10.1007/s13042-022-01556-0