Comparative Study on Arabic Text Classification: Challenges and Opportunities

Classification Applications with Deep Learning and Machine Learning Technologies

Part of the book series: Studies in Computational Intelligence (SCI, volume 1071)

Abstract

Web technology has improved greatly in recent years, loading the Internet with digital content from many different fields. This growth has made it difficult for researchers to find text classification algorithms that fit a specific language or set of languages. Text classification, or categorization, is the practice of assigning a given text document to one or more predefined labels or categories; it aims to extract valuable information from unstructured text documents. This paper presents a comparative study of selected published papers that focus on improving Arabic text classification. It highlights the models and classifiers used, discusses the challenges faced in this type of research, and proposes expected research opportunities in the field. In the reviewed studies, SVM and Naive Bayes were the most widely used classifiers for Arabic text classification, while more effort is needed to develop and implement flexible Arabic text classification methods and classifiers.
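
As a rough illustration of what assigning a document to a predefined category means in practice, the following minimal sketch implements a multinomial Naive Bayes classifier, one of the two classifier families the abstract names. It is not taken from the chapter or from any of the reviewed papers. The function names, the whitespace tokenization, the add-one smoothing, and the four-sentence toy corpus are all assumptions made for this example; realistic Arabic pipelines would normally add normalization and stemming.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(documents):
    """Fit a multinomial Naive Bayes model from (text, label) pairs."""
    label_counts = Counter()            # number of documents per label
    word_counts = defaultdict(Counter)  # token frequencies per label
    vocabulary = set()
    for text, label in documents:
        tokens = text.split()  # assumption: whitespace tokens, no stemming
        label_counts[label] += 1
        word_counts[label].update(tokens)
        vocabulary.update(tokens)
    return label_counts, word_counts, vocabulary


def classify(text, model):
    """Return the label with the highest log-posterior for the text."""
    label_counts, word_counts, vocabulary = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, doc_count in label_counts.items():
        score = math.log(doc_count / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for token in text.split():
            if token not in vocabulary:
                continue  # skip tokens never seen during training
            # Laplace (add-one) smoothing avoids zero probabilities
            count = word_counts[label][token] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy corpus, invented for illustration only.
TRAINING_DATA = [
    ("فاز الفريق في المباراة", "sports"),
    ("سجل اللاعب هدفا في المباراة", "sports"),
    ("ارتفعت أسعار النفط في السوق", "economy"),
    ("انخفضت أسعار الأسهم في السوق", "economy"),
]

model = train_naive_bayes(TRAINING_DATA)
predicted = classify("أسعار النفط", model)
```

With this toy corpus, `predicted` holds the label assigned to a short query about oil prices. The labels and sentences exist only to show the data flow from labeled text to a predicted category.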



Author information

Corresponding author

Correspondence to Laith Abualigah.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter

Cite this chapter

Melhem, M.K.B., Abualigah, L., Zitar, R.A., Hussien, A.G., Oliva, D. (2023). Comparative Study on Arabic Text Classification: Challenges and Opportunities. In: Abualigah, L. (eds) Classification Applications with Deep Learning and Machine Learning Technologies. Studies in Computational Intelligence, vol 1071. Springer, Cham. https://doi.org/10.1007/978-3-031-17576-3_10
