Enhancing Hybrid Indexing for Arabic Information Retrieval

  • Souheila Ben Guirat
  • Ibrahim Bounhas
  • Yahya Slimani
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 935)


The existing literature proposes several approaches to enhancing Arabic document retrieval using different indexing units. In previous work [1, 2], we proposed combining multiple indexing units, which improved retrieval performance. This paper develops that approach and suggests enhancing term weighting through result aggregation and pseudo-relevance feedback techniques. We compare these approaches to three baselines in order to improve on our previous results, which demonstrated the effectiveness of hybrid indexing. To test our hypothesis, we run four experimental setups on a larger corpus with several query sets. Finally, we compare all of these methods using standard information retrieval metrics.


Arabic information retrieval; Hybrid indexing; Result aggregation; Pseudo-relevance feedback


  1. Ben Guirat, S., Bounhas, I., Slimani, Y.: Combining indexing units for Arabic information retrieval. Int. J. Softw. Innov. (IJSI) 4(4), 1–14 (2016)
  2. Ben Guirat, S., Bounhas, I., Slimani, Y.: A hybrid model for Arabic document indexing. In: 17th IEEE/ACIS International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing (SNPD), Shanghai, China, 30 May–1 June 2016
  3. Kopliku, A., Pinel-Sauvagnat, K., Boughanem, M.: Aggregated search: a new information retrieval paradigm. ACM Comput. Surv. 46(4) (2014)
  4. Arguello, J., Diaz, F., Callan, J.: Learning to aggregate vertical results into web search results. In: CIKM 2011, Glasgow, Scotland, UK, 24–28 Oct, pp. 24–28 (2011)
  5. Al-Kabi, M., Al-Radaideh, Q., Akawi, K.: Benchmarking and assessing the performance of Arabic stemmers. J. Inf. Sci. 37(2), 1–12 (2011)
  6. Al-Shawakfa, E., Al-Badarneh, A., Shatnawi, S., Al-Rabab’ah, K., Bani-Ismail, B.: A comparison study of some Arabic root finding algorithms. J. Am. Soc. Inform. Sci. Technol. 6(5), 1015–1024 (2010)
  7. Aljlayl, M., Frieder, O.: On Arabic search: improving the retrieval effectiveness via a light stemming approach. In: Proceedings of the Eleventh International Conference on Information and Knowledge Management, McLean, Virginia, USA, 04–09 Nov 2002
  8. Khoja, S., Garside, S.: Stemming Arabic text. Technical report, Computing Department, Lancaster University, UK (1999)
  9. Larkey, L., Connell, M.E.: Arabic information retrieval at UMass in TREC-10. In: Proceedings of Text Retrieval Conference (TREC), Gaithersburg, USA (2001)
  10. Sawilowsky, S.S.: Misconceptions leading to choosing the t-test over the Wilcoxon Mann-Whitney test for shift in location parameter. J. Mod. Appl. Stat. Methods 4(2), 598–600 (2005)
  11. Chen, A., Gey, F.: Building an Arabic stemmer for information retrieval. In: Proceedings of the Text Retrieval Conference TREC-11, pp. 631–639 (2002)
  12. Hadni, M., Lachkar, A., Alaoui Ouatik, S.: A new and efficient stemming technique for Arabic text categorization. In: International Conference on Multimedia Computing and Systems (ICMCS), 10–12 May 2012, Tangier, Morocco, pp. 791–796 (2012)
  13. Liu, T.Y.: Learning to rank for information retrieval. Found. Trends Inf. Retr. 3(3), 225–331 (2009)
  14. Hsu, D.F., Taksa, I.: Comparing rank and score combination methods for data fusion in information retrieval. Inf. Retr. 8(3), 449–480 (2005)
  15. Ghwanmeh, S., Rabab’ah, S., Al-Shalabi, R., Kanaan, G.: Enhanced algorithm for extracting the root of Arabic words. In: Sixth International Conference on Computer Graphics, Imaging and Visualization, Tianjin, China, 11–14 Aug, pp. 388–391 (2009)
  16. Ounis, I., Amati, G., Plachouras, V., He, B., Macdonald, C., Lioma, C.: Terrier: a high performance and scalable information retrieval platform. In: Proceedings of Open Source Information Retrieval (OSIR) Workshop, Seattle, USA, 10 Aug 2006
  17. Aslam, J.A., Pavlu, V., Yilmaz, E.: A sampling technique for efficiently estimating measures of query retrieval performance using incomplete judgments. In: ICML Workshop on Learning with Partially Classified Training Data, Bonn, Germany, 7 Aug 2005
  18. Biau, D.J., Jolles, B.M., Porcher, R.: P-value and the theory of hypothesis testing: an explanation for new researchers. Clin. Orthop. Relat. Res. 468(3), 885–892 (2010)

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Souheila Ben Guirat (1, 2, 5)
  • Ibrahim Bounhas (2, 4, 5; corresponding author)
  • Yahya Slimani (2, 3, 5)

  1. Department of Computer Sciences, Prince Sattam Ibn Abdulaziz University, Riyadh, Saudi Arabia
  2. LISI Laboratory of Computer Science for Industrial Systems, Carthage University, Tunis, Tunisia
  3. Higher Institute of Multimedia Arts of La Manouba, La Manouba University, Manouba, Tunisia
  4. Higher Institute of Documentation, La Manouba University, Manouba, Tunisia
  5. JARIR: Joint Group for Artificial Reasoning and Information Retrieval, Manouba, Tunisia
