Identifying Cyberspace Users’ Tendency in Blog Writing Using Machine Learning Algorithms

Engineering Mathematics and Computing

Abstract

A blog is a form of direct, interactive communication technology that lets users interact with each other by posting comments and sharing links. It is a platform where a writer or group of writers gives an opinion on a specific topic. In some countries, many issues and topics are censored and controlled by the government and kept out of the mass media. Blogs nevertheless provide a wide platform for exchanging ideas and opinions on various issues. There is a specific relationship between blog features and bloggers' tendencies on the one hand and the social, political, and cultural patterns of different countries and nations on the other, and this relationship creates trends among the bloggers in these countries. In this paper, we use an existing data set from previous research, which has 100 records, and process the data by applying three machine learning algorithms to classification and regression tasks. The algorithms are Decision Tree (C4.5), Linear Regression (LR), and Decision Forest (DF), trained and tested with 10-fold cross-validation. The results show that C4.5 achieves the best overall results, with 81% accuracy, 83% precision, and 91% recall, compared with the other two algorithms.
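
As a rough illustration of the evaluation protocol described in the abstract, the following Python sketch shows how 10-fold cross-validation on a 100-record data set can produce accuracy, precision, and recall. It is not the authors' implementation: the synthetic records, the placeholder majority-class learner, and all function names are assumptions made for illustration. A C4.5, Linear Regression, or Decision Forest learner would take the place of `fit_majority`.

```python
import random
from typing import Callable, List, Sequence, Tuple

# A record is (features, label); label 1 is the positive class (assumption).
Record = Tuple[List[float], int]
Model = Callable[[List[float]], int]


def k_fold_indices(n: int, k: int = 10, seed: int = 0) -> List[List[int]]:
    """Shuffle the indices 0..n-1 and deal them into k near-equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def scores(tp: int, fp: int, fn: int, tn: int) -> Tuple[float, float, float]:
    """Return (accuracy, precision, recall); empty denominators give 0.0."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall


def cross_validate(records: Sequence[Record],
                   fit: Callable[[List[Record]], Model],
                   k: int = 10, seed: int = 0) -> Tuple[float, float, float]:
    """Train on k-1 folds, predict the held-out fold, pool all predictions."""
    y_true: List[int] = []
    y_pred: List[int] = []
    for held_out in k_fold_indices(len(records), k, seed):
        held = set(held_out)
        train = [r for i, r in enumerate(records) if i not in held]
        model = fit(train)
        for i in held_out:
            features, label = records[i]
            y_true.append(label)
            y_pred.append(model(features))
    return scores(*confusion_counts(y_true, y_pred))


def fit_majority(train: List[Record]) -> Model:
    """Placeholder learner: always predicts the most common training label."""
    positives = sum(label for _, label in train)
    majority = 1 if 2 * positives >= len(train) else 0
    return lambda features: majority


if __name__ == "__main__":
    # Synthetic stand-in for the 100-record data set (not the real data).
    rng = random.Random(42)
    data = [([rng.random()], 1 if rng.random() < 0.7 else 0) for _ in range(100)]
    acc, prec, rec = cross_validate(data, fit_majority)
    print(f"accuracy={acc:.2f} precision={prec:.2f} recall={rec:.2f}")
```

Pooling the held-out predictions from all ten folds before computing the metrics is one common convention; averaging per-fold scores is another, and the abstract does not say which one the chapter uses.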

Acknowledgements

This research is supported by Universiti Tun Hussein Onn Malaysia.

Corresponding author

Correspondence to Salama A. Mostafa.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this chapter

Cite this chapter

AbuSalim, S.W.G., Mostafa, S.A., Mustapha, A., Ibrahim, R., Wahab, M.H.A. (2023). Identifying Cyberspace Users’ Tendency in Blog Writing Using Machine Learning Algorithms. In: Gyei-Kark, P., Jana, D.K., Panja, P., Abd Wahab, M.H. (eds) Engineering Mathematics and Computing. Studies in Computational Intelligence, vol 1042. Springer, Singapore. https://doi.org/10.1007/978-981-19-2300-5_6

  • DOI: https://doi.org/10.1007/978-981-19-2300-5_6

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-19-2299-2

  • Online ISBN: 978-981-19-2300-5

  • eBook Packages: Engineering, Engineering (R0)
