Abstract
Sentiment analysis is a widely studied area of natural language processing with a broad range of applications. This research addresses three key limitations of existing sentiment analysis studies: contextual ambiguity, difficulty in handling sarcasm, and a shortage of labeled data across varied domains. To address them, this study introduces LexiSNTAGMM, a framework designed for sentiment analysis in social networks that integrates a dictionary-based approach with an unsupervised machine-learning model. The framework is evaluated on six datasets, STStest, SentiStrength, HCR, Sanders, OMD, and Strict OMD, where it achieves F1 scores of 84.12, 76.90, 73.83, 86.12, 81.06, and 86.01, respectively. These results indicate that LexiSNTAGMM outperforms conventional supervised methods and compares favorably with contemporary approaches.
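The general idea of pairing a dictionary-based scorer with an unsupervised model can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the unsupervised model, so the two-component Gaussian mixture used here is an assumption suggested only by the "GMM" in the framework's name. The toy lexicon and the function names (`lexicon_score`, `fit_gmm_1d`, `classify`) are likewise hypothetical.

```python
import math
import re

# Hypothetical toy lexicon; the dictionary used by LexiSNTAGMM is not given here.
LEXICON = {"good": 1.0, "great": 2.0, "love": 2.0, "happy": 1.5,
           "bad": -1.0, "awful": -2.0, "hate": -2.0, "sad": -1.5}


def lexicon_score(text):
    """Sum the polarity of known words; unknown words contribute 0."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return sum(LEXICON.get(tok, 0.0) for tok in tokens)


def log_weighted_pdf(x, mu, var, weight):
    """Log of weight * N(x | mu, var) for a 1-D Gaussian."""
    return (math.log(weight) - 0.5 * math.log(2 * math.pi * var)
            - (x - mu) ** 2 / (2 * var))


def fit_gmm_1d(xs, iters=50):
    """Fit a two-component 1-D Gaussian mixture with EM (assumed model)."""
    mu = [min(xs), max(xs)]   # start the components at the extremes
    var = [1.0, 1.0]
    weight = [0.5, 0.5]
    for _ in range(iters):
        # E-step: responsibilities, computed in log space for stability.
        resp = []
        for x in xs:
            logs = [log_weighted_pdf(x, mu[k], var[k], weight[k]) for k in range(2)]
            top = max(logs)
            probs = [math.exp(l - top) for l in logs]
            total = sum(probs)
            resp.append([p / total for p in probs])
        # M-step: re-estimate mean, variance, and weight per component.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue  # leave an empty component unchanged
            mu[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            spread = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, xs)) / nk
            var[k] = max(spread, 1e-2)  # floor avoids a collapsed variance
            weight[k] = nk / len(xs)
    return mu, var, weight


def classify(texts):
    """Label each text by the mixture component most likely to have produced its score."""
    scores = [lexicon_score(t) for t in texts]
    mu, var, weight = fit_gmm_1d(scores)
    positive = 0 if mu[0] > mu[1] else 1  # the higher-mean component is "positive"
    labels = []
    for x in scores:
        logs = [log_weighted_pdf(x, mu[k], var[k], weight[k]) for k in range(2)]
        best = 0 if logs[0] > logs[1] else 1
        labels.append("positive" if best == positive else "negative")
    return labels


if __name__ == "__main__":
    tweets = ["I love this great phone", "awful service, I hate it",
              "happy with the good support", "sad and bad experience"]
    print(classify(tweets))
```

No labeled examples are needed in this sketch: the lexicon supplies a score for each text, and the mixture model decides where the boundary between the two groups falls.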
Data availability
The authors pledge to uphold complete transparency regarding the data and materials utilized in this research, ensuring their proper documentation and accessibility, as applicable.
Acknowledgements
Not applicable.
Funding
No funding was received for conducting this study.
Author information
Contributions
All authors contributed to the design and implementation of the research, analysis of the results and to the writing of the manuscript.
Ethics declarations
Conflicts of interest
The authors declare that they have no competing interests.
Ethics approval
Not applicable.
Consent to participate
Not applicable.
Consent for publication
Not applicable.
About this article
Cite this article
Bashiri, H., Naderi, H. LexiSNTAGMM: an unsupervised framework for sentiment classification in data from distinct domains, synergistically integrating dictionary-based and machine learning approaches. Soc. Netw. Anal. Min. 14, 102 (2024). https://doi.org/10.1007/s13278-024-01268-z
DOI: https://doi.org/10.1007/s13278-024-01268-z