Abstract
Social media platforms such as Twitter can be used to understand how users behave and engage with one another online. Analysing data about the material users share helps in assessing whether content is likely to go viral. This study proposes a methodology for predicting which tweets are likely to become viral and generate a large volume of online conversation, termed buzz discussions. It treats such discussions as outliers and detects them using bio-inspired algorithms integrated with k-Nearest Neighbors classification. Three bio-inspired optimization algorithms, namely Grey Wolf Optimization, Chicken Swarm Optimization, and Artificial Bee Colony, are compared on the efficacy of the resulting hybrid models for mining outliers from a supervised learning dataset containing 11 primary features and 140,707 instances. Among the three algorithms applied to this outlier detection problem, Chicken Swarm Optimization shows the best overall performance across the evaluation parameters: accuracy, precision, recall, specificity, F1-measure, and convergence.
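The classification measures named above (accuracy, precision, recall, specificity and F1-measure) can all be derived from a binary confusion matrix. The following is a minimal sketch, assuming buzz instances are labelled 1 and non-buzz instances 0. The names `Confusion`, `confusion` and `metrics` are hypothetical helpers introduced here for illustration and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Counts for a binary problem where the positive class is 'buzz'."""
    tp: int  # buzz predicted as buzz
    fp: int  # non-buzz predicted as buzz
    tn: int  # non-buzz predicted as non-buzz
    fn: int  # buzz predicted as non-buzz


def confusion(y_true, y_pred, positive=1):
    """Tally the confusion matrix; `positive` is the assumed buzz label."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return Confusion(tp, fp, tn, fn)


def _ratio(numerator, denominator):
    # Return 0.0 instead of raising when a class is absent.
    return numerator / denominator if denominator else 0.0


def metrics(c):
    """Compute the five classification measures from confusion counts."""
    precision = _ratio(c.tp, c.tp + c.fp)
    recall = _ratio(c.tp, c.tp + c.fn)
    return {
        "accuracy": _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn),
        "precision": precision,
        "recall": recall,
        "specificity": _ratio(c.tn, c.tn + c.fp),
        "f1": _ratio(2 * precision * recall, precision + recall),
    }


if __name__ == "__main__":
    # Toy labels only; these are not results from the study.
    y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
    for name, value in metrics(confusion(y_true, y_pred)).items():
        print(f"{name}: {value:.3f}")
```

Because buzz discussions are treated as outliers, the positive class is rare, so accuracy alone can look high even when few buzz instances are found. Reporting recall and specificity side by side makes that trade-off visible.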
Cite this article
Jain, R., Batra, J., Kar, A.K. et al. A hybrid bio-inspired computing approach for buzz detection in social media. Evol. Intel. 15, 349–367 (2022). https://doi.org/10.1007/s12065-020-00512-7
DOI: https://doi.org/10.1007/s12065-020-00512-7