A Fuzzy Near Neighbors Approach for Arabic Text Categorization Based on Web Mining Technique

  • Conference paper
Digital Technologies and Applications (ICDTA 2021)

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 211)

Abstract

The volume of textual content available online has grown enormously, creating a need for systems that can analyze it. Text Categorization (TC) offers many methods and techniques to analyze, explore, and classify various types of documents. This study consists of two main steps. First, we extract terms from text documents using the Fuzzy Near Neighbors (FNN) algorithm combined with web mining techniques. Second, we group documents according to a similarity measure that combines the Arabic encyclopedic dictionaries, using a clustering algorithm. Fuzzy C-Means (FCM) is the clustering algorithm used to improve the precision of document classification. The proposed Arabic TC approach relies on a multilingual encyclopedic dictionary (Arabic WordNet, OMW, Wikipedia, OmegaWiki, Wiktionary, and Wikidata). To evaluate the efficacy of the TC approach with FNN and FCM, we carry out an experimental study on a real-world dataset. The results indicate that the proposed approach outperforms the traditional one.
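
For readers unfamiliar with Fuzzy C-Means, the following is a minimal sketch of the standard algorithm in plain Python. It is not the implementation used in the paper: the function name `fuzzy_c_means`, its parameters, and the toy two-dimensional "document vectors" are all assumptions made for illustration. Each document receives a degree of membership in every cluster instead of a single hard label.

```python
import random


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Cluster `points` (a list of equal-length float lists) with fuzzy c-means.

    Returns (centers, memberships), where memberships[i][k] is the degree to
    which point i belongs to cluster k. Each membership row sums to 1.
    `m` > 1 is the fuzzifier; larger values give softer assignments.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships, normalised so that each row sums to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(n_iter):
        # Center update: mean of the points weighted by membership ** m.
        centers = []
        for k in range(n_clusters):
            weights = [u[i][k] ** m for i in range(n)]
            total = sum(weights)
            centers.append([
                sum(w * p[d] for w, p in zip(weights, points)) / total
                for d in range(dim)
            ])

        # Membership update from squared distances to the new centers.
        max_change = 0.0
        for i, p in enumerate(points):
            d2 = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers]
            if min(d2) == 0.0:
                # The point coincides with a center: full membership there.
                hits = [1.0 if d == 0.0 else 0.0 for d in d2]
                new_row = [h / sum(hits) for h in hits]
            else:
                # u_ik is proportional to d_ik ** (-2 / (m - 1)); d2 holds
                # squared distances, so the exponent becomes -1 / (m - 1).
                inv = [d ** (-1.0 / (m - 1.0)) for d in d2]
                total = sum(inv)
                new_row = [v / total for v in inv]
            change = max(abs(a - b) for a, b in zip(new_row, u[i]))
            max_change = max(max_change, change)
            u[i] = new_row

        # Stop once the memberships have effectively stopped moving.
        if max_change < tol:
            break

    return centers, u


if __name__ == "__main__":
    # Hypothetical term-weight vectors for four documents on two topics.
    docs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
    centers, memberships = fuzzy_c_means(docs, n_clusters=2)
    for doc, row in zip(docs, memberships):
        print(doc, [round(v, 3) for v in row])
```

In a text categorization setting the input vectors would typically be term weights per document, and a document could be assigned to the cluster with the highest membership or kept as a soft assignment across several categories.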

Author information

Corresponding author

Correspondence to Mounir Gouiouez.

Copyright information

© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Gouiouez, M. (2021). A Fuzzy Near Neighbors Approach for Arabic Text Categorization Based on Web Mining Technique. In: Motahhir, S., Bossoufi, B. (eds) Digital Technologies and Applications. ICDTA 2021. Lecture Notes in Networks and Systems, vol 211. Springer, Cham. https://doi.org/10.1007/978-3-030-73882-2_52
