Dataset Construction and Evaluation for Aspect-Opinion Extraction in Bangla Fine-Grained Sentiment Analysis

  • Conference paper
Data Science and Applications (ICDSA 2023)

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 818)

Abstract

More and more people and organizations post their opinions on web-based platforms, and the volume of this content has grown beyond what can be processed manually. An automated system for opinion and sentiment analysis is therefore required. Sentiment analysis builds systems that automatically extract useful information from massive collections of online reviews. Conventional sentiment analysis simply classifies a document or text as positive, negative, or neutral on the basis of its overall sentiment polarity. This assumes that the whole text carries a single sentiment, which may not hold in practice: a single text, especially a long one, can contain more than one aspect, opinion, and sentiment. This motivates fine-grained sentiment analysis, which identifies what a person or entity is talking about and how they feel about each aspect or target entity. Aspect-opinion term extraction is a crucial subtask of aspect-based sentiment analysis (ABSA), a granular-level form of sentiment analysis. In this study, we target aspect-opinion term extraction in the Bangla language. However, no annotated Bangla dataset is available for this task. Hence, we obtained textual data from a publicly available Bangla restaurant dataset, annotated it, and prepared a novel dataset. We then formulated the task as sequence labeling on our dataset. For the experiments, we employed vanilla transformer-based models, namely mBERT, BanglaBERT, and BanglishBERT, with linear layers and a conditional random field (CRF) on top of them. To enhance the models' performance, we combined different feature embeddings of mBERT, BanglaBERT, and BanglishBERT and again applied linear layers and a CRF on top of them. The experimental results indicated that the combined feature embeddings technique significantly improves the models' performance, and the combination of mBERT and BanglishBERT with a linear layer performed best among all the tested models, with an F1 score of 0.5840.
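
To make the sequence labeling formulation concrete, the following is a minimal sketch of how token-level tags can be decoded into aspect and opinion spans and scored with a span-level F1. It is an illustration under stated assumptions, not the authors' implementation: the BIO scheme, the label names `ASP` and `OPI`, the exact-match scoring, and the English example tokens are all hypothetical choices that the abstract does not specify.

```python
from typing import List, Tuple

# (label, start index, exclusive end index); an assumed span representation.
Span = Tuple[str, int, int]


def bio_to_spans(tags: List[str]) -> List[Span]:
    """Decode a BIO tag sequence into labeled spans."""
    spans: List[Span] = []
    label, start = None, 0
    # A sentinel "O" closes any span that runs to the end of the sentence.
    for i, tag in enumerate(tags + ["O"]):
        if tag.startswith("I-") and label == tag[2:]:
            continue  # still inside the current span
        if label is not None:
            spans.append((label, start, i))
            label = None
        if tag.startswith("B-") or tag.startswith("I-"):
            # A stray I- tag without a matching B- is treated as a span start.
            label, start = tag[2:], i
    return spans


def span_f1(gold: List[Span], pred: List[Span]) -> float:
    """Exact-match span-level F1 (an assumed evaluation protocol)."""
    gold_set, pred_set = set(gold), set(pred)
    true_pos = len(gold_set & pred_set)
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(pred_set)
    recall = true_pos / len(gold_set)
    return 2 * precision * recall / (precision + recall)


# Hypothetical English stand-in for a Bangla restaurant review.
tokens = ["The", "biryani", "was", "really", "delicious", "but", "service", "slow"]
gold_tags = ["O", "B-ASP", "O", "B-OPI", "I-OPI", "O", "B-ASP", "B-OPI"]
pred_tags = ["O", "B-ASP", "O", "O", "B-OPI", "O", "B-ASP", "B-OPI"]

gold_spans = bio_to_spans(gold_tags)
pred_spans = bio_to_spans(pred_tags)
for label, start, end in gold_spans:
    print(label, " ".join(tokens[start:end]))
print("span F1:", span_f1(gold_spans, pred_spans))
```

In this sketch a predicted span counts as correct only if both its label and its boundaries match the gold annotation, so the shortened opinion span "delicious" is scored as an error even though it overlaps "really delicious". A token-level or partial-match metric would be more lenient; which one underlies the reported F1 is not stated in the abstract.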

Notes

  1. https://github.com/atik-05/Bangla_ABSA_Datasets.

  2. See Footnote 1.

  3. https://github.com/al-mahmud28/bangla-aspect-opinion/.

Acknowledgements

This work was supported by JST SPRING, Grant Number JPMJSP2154.

Author information

Corresponding author

Correspondence to Al-Mahmud.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Al-Mahmud, Shimada, K. (2024). Dataset Construction and Evaluation for Aspect-Opinion Extraction in Bangla Fine-Grained Sentiment Analysis. In: Nanda, S.J., Yadav, R.P., Gandomi, A.H., Saraswat, M. (eds) Data Science and Applications. ICDSA 2023. Lecture Notes in Networks and Systems, vol 818. Springer, Singapore. https://doi.org/10.1007/978-981-99-7862-5_33
