Concatenation Technique for Extracted Arabic Characters for Efficient Content-based Indexing and Searching

  • Conference paper
Proceedings of the Second International Conference on Computer and Communication Technologies

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 379)

Abstract

This paper presents the work accomplished in the final phase of an ongoing research project whose objective is to develop a system that extracts moving Arabic text from video for efficient content-based indexing and searching. The novelty of the paper lies in the technique used to concatenate the individual, stand-alone Arabic characters that are extracted and recognized from image frames. The extracted characters are concatenated using their Unicode representation, an approach that, according to the authors, has not been used before. The concatenated characters are written continuously to a text file. These text files are indexed using Lucene, so that a search for a desired string is fast and precise.
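
The concatenation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that the recognizer emits one base Unicode code point per isolated character, in reading order, and it replaces the Lucene index with a plain substring check to stay self-contained. The names `concatenate`, `append_text`, and `contains` are hypothetical.

```python
from pathlib import Path

# Assumed recognizer output: base code points from the Unicode Arabic block,
# listed in reading order (first letter of the word first).
RECOGNIZED = [0x0643, 0x062A, 0x0627, 0x0628]  # kaf, teh, alef, beh


def concatenate(code_points):
    """Join code points into one string stored in logical order.

    Unicode text is stored in logical order; contextual joining of the
    letters is applied by the rendering engine, not by this function.
    """
    return "".join(chr(cp) for cp in code_points)


def append_text(path, text):
    """Append a concatenated word to a UTF-8 text file, followed by a space."""
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(text + " ")


def contains(path, query):
    """Stand-in for an index lookup: plain substring search over the file."""
    return query in Path(path).read_text(encoding="utf-8")
```

Calling `concatenate(RECOGNIZED)` yields a four-character string that can be passed to `append_text`. In a full pipeline, the resulting file would be handed to an indexer such as Lucene instead of being scanned with `contains`.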

Acknowledgments

This research is supported by King Abdulaziz City for Science and Technology (KACST), Saudi Arabia, vide grant no. AT-32-87.

Corresponding author

Correspondence to Abdul Khader Jilani Saudagar.

Copyright information

© 2016 Springer India

About this paper

Cite this paper

Saudagar, A.K.J., Mohammed, H.V. (2016). Concatenation Technique for Extracted Arabic Characters for Efficient Content-based Indexing and Searching. In: Satapathy, S., Raju, K., Mandal, J., Bhateja, V. (eds) Proceedings of the Second International Conference on Computer and Communication Technologies. Advances in Intelligent Systems and Computing, vol 379. Springer, New Delhi. https://doi.org/10.1007/978-81-322-2517-1_55

  • DOI: https://doi.org/10.1007/978-81-322-2517-1_55

  • Publisher Name: Springer, New Delhi

  • Print ISBN: 978-81-322-2516-4

  • Online ISBN: 978-81-322-2517-1

  • eBook Packages: Engineering, Engineering (R0)
