RIFT: A Rule Induction Framework for Twitter Sentiment Analysis

Asghar, Muhammad Zubair; Khan, Aurangzeb; Khan, Furqan; Kundi, Fazal Masud

doi:10.1007/s13369-017-2770-1

Research Article - Computer Engineering and Computer Science
Published: 17 August 2017

Volume 43, pages 857–877 (2018)

Arabian Journal for Science and Engineering

Muhammad Zubair Asghar ORCID: orcid.org/0000-0003-3320-2074¹,
Aurangzeb Khan²,
Furqan Khan¹ &
…
Fazal Masud Kundi¹

Abstract

The rapid evolution of microblogging and the emergence of sites such as Twitter have allowed online communities to flourish by enabling people to create, share and disseminate free-flowing messages and information globally. The exponential growth of product-based user reviews has made them an ever-increasing resource that plays a key role in emerging Twitter-based sentiment analysis (SA) techniques and applications for collecting and analysing customer trends and reviews. Existing supervised black-box sentiment analysis systems do not provide adequate information, in the form of rules, about why a given review was assigned to a particular class. Their accuracy can also fall short of human judgement. To address these shortcomings, alternative approaches, such as supervised white-box classification algorithms, need to be developed to improve the classification of Twitter-based microblogs. The purpose of this study was to develop a supervised white-box microblogging SA system that analyses user reviews of certain products using rough set theory (RST)-based rule induction algorithms. The system classifies microblogging reviews of products as positive, negative or neutral using rules extracted from training decision tables by RST-centric rule induction algorithms. A further focus of the study is the sentiment classification of product-review microblogs (tweets) using both conventional and RST-based rule induction algorithms. Two RST-centric rule induction algorithms are used: Learning from Examples Module, version 2 (LEM2), and the proposed LEM2 + Corpus-based rules (LEM2 + CBR), an extension of the traditional LEM2 algorithm. Corpus-based rules are generated from the tweets that the conventional LEM2 rules leave unclassified. Experimental results show that the proposed method compares favourably with baseline methods in accuracy, coverage and the number of rules employed. It achieves an average accuracy of 92.57% and an average coverage of 100%, with an average of 19.14 rules.
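To make the reported metrics concrete, the following is a minimal Python sketch of how a white-box rule classifier with a fallback rule set might be evaluated. It is not the authors' implementation: the attribute names (`pos_words`, `neg_words`), the rules, and the toy tweets are hypothetical, and the metric definitions are an assumption (coverage as the share of tweets matched by at least one rule, accuracy as the share of matched tweets labelled correctly), since the abstract does not define them.

```python
from typing import Dict, List, Optional, Tuple

Tweet = Dict[str, str]             # attribute name -> discrete value (hypothetical attributes)
Rule = Tuple[Dict[str, str], str]  # (conditions, decision class)


def first_match(rules: List[Rule], tweet: Tweet) -> Optional[str]:
    """Return the class of the first rule whose conditions all hold, else None."""
    for conditions, label in rules:
        if all(tweet.get(attr) == value for attr, value in conditions.items()):
            return label
    return None


def classify(tweet: Tweet, induced: List[Rule], fallback: List[Rule]) -> Optional[str]:
    """Try the induced rules first; use the fallback rules only if none fires."""
    label = first_match(induced, tweet)
    return label if label is not None else first_match(fallback, tweet)


def evaluate(tweets: List[Tweet], gold: List[str],
             induced: List[Rule], fallback: List[Rule]) -> Tuple[float, float]:
    """Return (accuracy, coverage) under the assumed definitions above."""
    predictions = [classify(t, induced, fallback) for t in tweets]
    covered = [(p, g) for p, g in zip(predictions, gold) if p is not None]
    coverage = len(covered) / len(tweets)
    accuracy = sum(p == g for p, g in covered) / len(covered) if covered else 0.0
    return accuracy, coverage


# Hypothetical stand-ins for induced rules and corpus-based fallback rules.
induced_rules: List[Rule] = [
    ({"pos_words": "yes", "neg_words": "no"}, "positive"),
    ({"pos_words": "no", "neg_words": "yes"}, "negative"),
]
fallback_rules: List[Rule] = [
    ({"pos_words": "no", "neg_words": "no"}, "neutral"),
    ({"pos_words": "yes", "neg_words": "yes"}, "neutral"),
]

tweets: List[Tweet] = [
    {"pos_words": "yes", "neg_words": "no"},
    {"pos_words": "no", "neg_words": "yes"},
    {"pos_words": "no", "neg_words": "no"},
    {"pos_words": "yes", "neg_words": "yes"},
]
gold = ["positive", "negative", "neutral", "negative"]

print(evaluate(tweets, gold, induced_rules, []))              # induced rules only
print(evaluate(tweets, gold, induced_rules, fallback_rules))  # with fallback rules
```

In this toy setup the fallback rules raise coverage from half of the tweets to all of them, which mirrors the role the abstract assigns to corpus-based rules: handling tweets that the LEM2 rules leave unclassified. Because every prediction traces back to one explicit rule, the reason for each classification can be inspected directly.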

Acknowledgements

We are grateful to Dr. Shakeel Ahmad, Institute of Computing, Gomal University, for providing licensed software and manuals during the execution of this project.

Author information

Authors and Affiliations

Institute of Computing and Information Technology, Gomal University, Dera Ismail Khan, Pakistan
Muhammad Zubair Asghar, Furqan Khan & Fazal Masud Kundi
Department of Computer Science, University of Science and Technology, Bannu, Pakistan
Aurangzeb Khan

Corresponding author

Correspondence to Muhammad Zubair Asghar.

Ethics declarations

Conflict of interest

Muhammad Zubair Asghar, Aurangzeb Khan, Furqan Khan and Fazal Masud Kundi declare that they have no conflict of interest.

Informed Consent

All procedures followed were in accordance with the ethical standards of the responsible committee on human experimentation (institutional and national) and with the Helsinki Declaration of 1975, as revised in 2008 (5). Additional informed consent was obtained from all patients for whom identifying information is included in this article.

Human and Animal Rights

This study did not involve any experimental research on humans or animals; hence, an approval from an ethics committee was not applicable in this regard. The data collected from the online forums are publicly available data, and no personally identifiable information of the forum users was collected or used for this study.

About this article

Cite this article

Asghar, M.Z., Khan, A., Khan, F. et al. RIFT: A Rule Induction Framework for Twitter Sentiment Analysis. Arab J Sci Eng 43, 857–877 (2018). https://doi.org/10.1007/s13369-017-2770-1

Received: 23 November 2016
Accepted: 01 August 2017
Published: 17 August 2017
Issue Date: February 2018
DOI: https://doi.org/10.1007/s13369-017-2770-1
