A multilingual fuzzy approach for classifying Twitter data using fuzzy logic and semantic similarity

  • Original Article
Neural Computing and Applications

Abstract

In recent years, the classification of social network data has attracted increasing interest. It aims to extract opinions, emotions and attitudes from social network data such as Facebook comments or tweets. This research area is called sentiment analysis, or sometimes opinion mining. In this article, we propose a new method to classify tweets into three classes: positive, negative or neutral. The proposed method is a hybrid approach that combines fuzzy logic, with its three main steps (fuzzification, rule inference/aggregation and defuzzification), and concepts from information retrieval systems (IRS): it computes the semantic similarity between the tweet to classify and two opinion documents (one containing positive opinion words and one containing negative opinion words) using the WordNet dictionary. To reduce computation time on large tweet datasets, we parallelize our work using the Hadoop framework with its distributed file system (HDFS) and the MapReduce programming model. The experimental results show that our approach outperforms some other methods from the literature, and that using fuzzy logic improves the classification results.
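
As a rough illustration of the three fuzzy-logic steps named in the abstract, the following Python sketch classifies a tweet from two similarity scores in [0, 1], one against the positive opinion document and one against the negative one. It assumes these scores have already been computed; the WordNet-based similarity step and the Hadoop parallelization are not reproduced here. The triangular membership functions, the rule table, the polarity axis and the 0.2 decision threshold are all hypothetical choices made for this example and are not the authors' parameters.

```python
def tri(x, a, b, c):
    """Triangular membership: 0 outside (a, c), 1 at the peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Hypothetical input fuzzy sets over a similarity score in [0, 1].
INPUT_SETS = {
    "low": (-0.5, 0.0, 0.5),
    "medium": (0.0, 0.5, 1.0),
    "high": (0.5, 1.0, 1.5),
}

# Hypothetical output fuzzy sets over a polarity axis in [-1, 1].
OUTPUT_SETS = {
    "negative": (-2.0, -1.0, 0.0),
    "neutral": (-1.0, 0.0, 1.0),
    "positive": (0.0, 1.0, 2.0),
}

# Hypothetical rule table:
# (positive-similarity label, negative-similarity label) -> output class.
RULES = {
    ("high", "low"): "positive",
    ("high", "medium"): "positive",
    ("medium", "low"): "positive",
    ("low", "high"): "negative",
    ("medium", "high"): "negative",
    ("low", "medium"): "negative",
    ("low", "low"): "neutral",
    ("medium", "medium"): "neutral",
    ("high", "high"): "neutral",
}


def fuzzify(score):
    """Step 1: map a crisp similarity score to membership degrees."""
    return {label: tri(score, *abc) for label, abc in INPUT_SETS.items()}


def infer(pos_sim, neg_sim):
    """Step 2: fire every rule (AND as min) and aggregate per class (max)."""
    pos, neg = fuzzify(pos_sim), fuzzify(neg_sim)
    strengths = {label: 0.0 for label in OUTPUT_SETS}
    for (p_label, n_label), out in RULES.items():
        firing = min(pos[p_label], neg[n_label])
        strengths[out] = max(strengths[out], firing)
    return strengths


def defuzzify(strengths, steps=200):
    """Step 3: centroid of the clipped, aggregated output sets on [-1, 1]."""
    num = den = 0.0
    for i in range(steps + 1):
        y = -1.0 + 2.0 * i / steps
        mu = max(
            min(s, tri(y, *OUTPUT_SETS[label]))
            for label, s in strengths.items()
        )
        num += y * mu
        den += mu
    return num / den if den > 0 else 0.0


def classify(pos_sim, neg_sim, threshold=0.2):
    """Turn the crisp polarity into one of the three classes."""
    polarity = defuzzify(infer(pos_sim, neg_sim))
    if polarity > threshold:
        return "positive"
    if polarity < -threshold:
        return "negative"
    return "neutral"
```

With these assumed parameters, `classify(0.9, 0.1)` returns `"positive"`, `classify(0.1, 0.9)` returns `"negative"`, and `classify(0.5, 0.5)` returns `"neutral"`. A Mamdani-style min/max inference with centroid defuzzification is used here only because it is a common textbook choice; the abstract does not say which operators the authors use.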

Notes

  1. https://about.twitter.com/fr/company.

  2. AFINN is a dictionary of words, each weighted between −5 and 5 to express the word's degree of sentiment.

  3. SentiWordNet (http://sentiwordnet.isti.cnr.it/) is a lexical resource for opinion mining. It assigns to each synset of WordNet three sentiment scores: positivity, negativity and objectivity.

  4. SenticNet (http://sentic.net/) supports concept-level sentiment analysis, that is, performing tasks such as polarity detection and emotion recognition by leveraging semantics and linguistics instead of relying solely on word co-occurrence frequencies.

  5. Apache Sqoop (TM) is a tool designed for efficiently transferring bulk data between Apache Hadoop and structured datastores such as relational databases.

  6. The Apache OpenNLP library is a machine learning based toolkit for the processing of natural language text.

  7. Twitter4J (twitter4j.org/) is an unofficial Java library for the Twitter API. With Twitter4J, you can easily integrate your Java application with the Twitter service.

  8. Flume (https://flume.apache.org/) is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of log data. It has a simple and flexible architecture based on streaming data flows. It is robust and fault tolerant, with tunable reliability mechanisms and many failover and recovery mechanisms. It uses a simple extensible data model that allows for online analytic applications.

  9. http://api.whatsmate.net/.

Author information

Contributions

In this article, we present a new approach for classifying Twitter data based on fuzzy logic, semantic similarity, the notions of IRS and big data. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Youness Madani.

Ethics declarations

Availability of data and material

Not applicable.

Conflict of interest

The authors declare that there is no conflict of interest regarding the publication of this article.

Funding

Not applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Madani, Y., Erritali, M., Bengourram, J. et al. A multilingual fuzzy approach for classifying Twitter data using fuzzy logic and semantic similarity. Neural Comput & Applic 32, 8655–8673 (2020). https://doi.org/10.1007/s00521-019-04357-9

  • DOI: https://doi.org/10.1007/s00521-019-04357-9
