Abstract
A blog is a form of direct, interactive communication technology that allows users to interact and communicate with each other by posting comments and sharing links. It is a platform on which a writer or a group of writers gives opinions on a specific topic. In some countries, many issues and topics are censored and controlled by the government and kept out of the mass media. Blogs nevertheless provide a wide platform for exchanging ideas and opinions on various issues. Blog features and bloggers’ tendencies are related to the social, political, and cultural patterns of different countries and nations, and this relationship creates trends among the bloggers in these countries. In this paper, we use an existing data set from previous research, which contains 100 records, and apply three machine learning algorithms to it for classification and regression tasks. The algorithms are Decision Tree (C4.5), Linear Regression (LR), and Decision Forest (DF), trained and tested with 10-fold cross-validation. The results show that C4.5 achieves the best overall results, with 81% accuracy, 83% precision, and 91% recall, compared with the other two algorithms.
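The evaluation protocol named in the abstract (10-fold cross-validation, scored by accuracy, precision, and recall) can be illustrated with a minimal sketch. It is not the authors' implementation. The function names `k_fold_indices` and `binary_metrics` are hypothetical, the interleaved fold assignment is an assumption, and the toy labels are invented for illustration and are not taken from the chapter's data set.

```python
from typing import List, Sequence, Tuple


def k_fold_indices(n_records: int, k: int = 10) -> List[Tuple[List[int], List[int]]]:
    """Split record indices 0..n_records-1 into k (train, test) index pairs.

    Assumption: records are assigned to folds in an interleaved fashion;
    the chapter does not say how its folds were formed.
    """
    folds = [list(range(i, n_records, k)) for i in range(k)]
    splits = []
    for i, test_idx in enumerate(folds):
        # Training indices are all records outside the held-out fold.
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        splits.append((train_idx, test_idx))
    return splits


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float, float]:
    """Return (accuracy, precision, recall) for labels in {0, 1}, with 1 as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall


# 100 records, as in the abstract, split into 10 folds of 10 records each.
splits = k_fold_indices(100, k=10)

# Toy labels and predictions, invented for illustration only.
toy_true = [1, 1, 1, 0, 0, 1, 0, 1, 1, 0]
toy_pred = [1, 1, 0, 0, 1, 1, 0, 1, 1, 0]
acc, prec, rec = binary_metrics(toy_true, toy_pred)
```

In a full run, a classifier would be fitted on each `train_idx` and scored on the matching `test_idx`, and the per-fold metrics would then be averaged.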
Acknowledgements
This research is supported by Universiti Tun Hussein Onn Malaysia.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this chapter
Cite this chapter
AbuSalim, S.W.G., Mostafa, S.A., Mustapha, A., Ibrahim, R., Wahab, M.H.A. (2023). Identifying Cyberspace Users’ Tendency in Blog Writing Using Machine Learning Algorithms. In: Gyei-Kark, P., Jana, D.K., Panja, P., Abd Wahab, M.H. (eds) Engineering Mathematics and Computing. Studies in Computational Intelligence, vol 1042. Springer, Singapore. https://doi.org/10.1007/978-981-19-2300-5_6
DOI: https://doi.org/10.1007/978-981-19-2300-5_6
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-2299-2
Online ISBN: 978-981-19-2300-5
eBook Packages: Engineering, Engineering (R0)