International Journal of Speech Technology, Volume 19, Issue 2, pp 229–236

Semantic indexing of Arabic texts for information retrieval system

  • Mohammed Alaeddine Abderrahim
  • Mohammed Dib
  • Mohammed El-Amine Abderrahim
  • Mohammed Amine Chikh
Special Issue Article

Abstract

In the context of information retrieval systems (IRS) that use ontologies to index documents and queries, this paper evaluates the contribution of ontology-based indexing when applied to Arabic texts. To do this, we indexed a corpus of Arabic text using Arabic WordNet and disambiguated words with the Lesk algorithm. The results of our experiment allowed us to deduce the contribution of this approach to IRS for Arabic texts.
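To illustrate the disambiguation step named above, the following is a minimal sketch of the simplified Lesk idea: choose the sense whose dictionary gloss shares the most words with the surrounding context. It is not the authors' implementation. The sense inventory `SENSES`, the stopword list, and the function names are hypothetical, and an English toy entry stands in for Arabic WordNet purely for readability.

```python
from typing import Dict, Set

# Hypothetical toy sense inventory: word -> {sense id: gloss}.
# A real system would read senses and glosses from a resource such as
# Arabic WordNet; this dictionary is an illustrative stand-in.
SENSES: Dict[str, Dict[str, str]] = {
    "bank": {
        "bank#finance": "an institution that accepts deposits and lends money",
        "bank#river": "sloping land beside a body of water such as a river",
    }
}

# Small illustrative stopword list (an assumption, not from the paper).
STOPWORDS: Set[str] = {
    "a", "an", "and", "as", "in", "of", "on", "such", "that", "the", "to",
}


def tokenize(text: str) -> Set[str]:
    """Lowercase, split on whitespace, strip punctuation, drop stopwords."""
    tokens = (t.strip(".,;:!?").lower() for t in text.split())
    return {t for t in tokens if t and t not in STOPWORDS}


def simplified_lesk(word: str, context: str) -> str:
    """Return the sense id whose gloss overlaps most with the context.

    Ties are resolved in favour of the sense listed first.
    """
    context_tokens = tokenize(context) - {word.lower()}
    best_sense, best_overlap = "", -1
    for sense_id, gloss in SENSES[word].items():
        overlap = len(tokenize(gloss) & context_tokens)
        if overlap > best_overlap:
            best_sense, best_overlap = sense_id, overlap
    return best_sense


if __name__ == "__main__":
    print(simplified_lesk("bank", "She sat on the bank of the river and watched the water"))
    print(simplified_lesk("bank", "He went to the bank to deposit money"))
```

In a semantic index, the selected sense identifier would then be stored in place of (or alongside) the surface word, so that documents and queries are matched on concepts rather than on ambiguous strings.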

Keywords

Semantic indexation, Disambiguation, Ontologies, Lesk algorithm, Arabic WordNet, Information retrieval

Notes

Compliance with ethical standards

Conflict of interest

The authors declare that they have no conflict of interest.

Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  • Mohammed Alaeddine Abderrahim (1)
  • Mohammed Dib (2)
  • Mohammed El-Amine Abderrahim (3)
  • Mohammed Amine Chikh (3)
  1. Department of Computer Sciences, Abu Bekr Belkaid University, Tlemcen, Algeria
  2. Department of Letters and English Language, Mustapha Stambouli University, Mascara, Algeria
  3. Department of Electrical and Electronic Engineering, Abu Bekr Belkaid University, Tlemcen, Algeria
