Abstract
More and more people and organizations now post their opinions on web-based platforms. As the volume of online opinion content grows, processing it manually has become practically impossible, so automated systems for opinion and sentiment analysis are required. Sentiment analysis builds systems that automatically extract useful information from massive numbers of online reviews. Conventional sentiment analysis simply classifies a document or text as positive, negative, or neutral on the basis of its overall sentiment polarity. This assumes that the whole text carries a single sentiment, which is often not the case in practice. A single text, especially a long one, may contain more than one aspect, opinion, and sentiment. This motivates fine-grained sentiment analysis, which identifies what a person or entity is talking about and how they feel about each aspect or target entity. Aspect-opinion term extraction is a crucial task in aspect-based sentiment analysis (ABSA), a granular form of sentiment analysis. In this study, our target is aspect-opinion term extraction in the Bangla language. However, no annotated Bangla dataset is available for this task. Hence, we obtained textual data from a publicly available Bangla restaurant dataset, annotated it, and prepared a novel dataset. We then treated the task as a sequence labeling problem using our dataset. For the experiments, we employed vanilla transformer-based models, namely mBERT, BanglaBERT, and BanglishBERT, with linear layers and a conditional random field (CRF) on top of them. To enhance the models' performance, we combined the feature embeddings of mBERT, BanglaBERT, and BanglishBERT and again applied linear layers and a CRF on top of the combined embeddings. The experimental results indicated that combining feature embeddings significantly improves the models' performance, and that the combination of mBERT and BanglishBERT with a linear layer performed best among all tested models, with an F1 score of 0.5840.
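The sequence labeling formulation mentioned in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the BIO tag names (`ASP` for aspect terms, `OPN` for opinion terms), the English example sentence, and the exact-match span-level F1. The abstract does not state which tagging scheme was used or whether the reported F1 is computed over spans or tokens.

```python
from typing import List, Tuple

# A span is (start token index, end index exclusive, label).
Span = Tuple[int, int, str]


def spans_to_bio(n_tokens: int, spans: List[Span]) -> List[str]:
    """Encode labeled spans as one BIO tag per token."""
    tags = ["O"] * n_tokens
    for start, end, label in spans:
        tags[start] = f"B-{label}"
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"
    return tags


def bio_to_spans(tags: List[str]) -> List[Span]:
    """Decode BIO tags back into labeled spans."""
    spans: List[Span] = []
    start, label = None, None
    # The trailing "O" is a sentinel that closes a span ending at the last token.
    for i, tag in enumerate(tags + ["O"]):
        continues = tag.startswith("I-") and label == tag[2:]
        if start is not None and not continues:
            spans.append((start, i, label))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return spans


def span_f1(gold: List[Span], pred: List[Span]) -> float:
    """Exact-match span-level F1 (an assumed metric, see the note above)."""
    gold_set, pred_set = set(gold), set(pred)
    true_pos = len(gold_set & pred_set)
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(pred_set)
    recall = true_pos / len(gold_set)
    return 2 * precision * recall / (precision + recall)


# Hypothetical example sentence (English stand-in for a Bangla review).
tokens = ["The", "biryani", "was", "really", "tasty"]
gold_spans = [(1, 2, "ASP"), (3, 5, "OPN")]
gold_tags = spans_to_bio(len(tokens), gold_spans)

# Hypothetical model output that finds the aspect but truncates the opinion.
pred_tags = ["O", "B-ASP", "O", "O", "B-OPN"]
pred_spans = bio_to_spans(pred_tags)
score = span_f1(gold_spans, pred_spans)
```

In this sketch the predicted opinion span covers only "tasty" rather than "really tasty", so under exact matching it counts as an error. A token-level metric would give partial credit, which is one reason the choice of metric matters when comparing scores.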
Acknowledgements
This work was supported by JST SPRING, Grant Number JPMJSP2154.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Al-Mahmud, Shimada, K. (2024). Dataset Construction and Evaluation for Aspect-Opinion Extraction in Bangla Fine-Grained Sentiment Analysis. In: Nanda, S.J., Yadav, R.P., Gandomi, A.H., Saraswat, M. (eds) Data Science and Applications. ICDSA 2023. Lecture Notes in Networks and Systems, vol 818. Springer, Singapore. https://doi.org/10.1007/978-981-99-7862-5_33
DOI: https://doi.org/10.1007/978-981-99-7862-5_33
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-7861-8
Online ISBN: 978-981-99-7862-5
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)