Benchmarking Strategy for Arabic Screen-Rendered Word Recognition

  • Fouad Slimane
  • Slim Kanoun
  • Jean Hennebert
  • Rolf Ingold
  • Adel M. Alimi

Abstract

This chapter presents a new benchmarking strategy for Arabic screen-based word recognition. Firstly, we report on the creation of the new APTI (Arabic Printed Text Image) database. This database is designed for large-scale benchmarking of open-vocabulary, multi-font, multi-size and multi-style word recognition systems in Arabic. Such systems take a text image as input and output the character string corresponding to the text in the image. The challenges addressed by the database lie in the variability of the sizes, fonts and styles used to generate the images. Particular attention is given to low-resolution images, where anti-aliasing introduces noise on the characters to be recognized. The database contains 45,313,600 single-word images totalling more than 250 million characters. Ground truth annotation is provided for each image in an XML file. The annotation includes the number of characters, the number of pieces of Arabic words (PAWs), the sequence of characters, and the size, style and font used to generate each image, among other fields.
Secondly, we describe the Arabic Recognition Competition: Multi-Font Multi-Size Digitally Represented Text, held in the context of the 11th International Conference on Document Analysis and Recognition (ICDAR’2011) in Beijing, China, on September 18–21, 2011. This first edition of the competition used the freely available APTI database. Two groups with three systems participated. The systems were tested on a single test dataset unknown to all participants (set 6 of the APTI database) and compared on the most important characteristic of classification systems, the recognition rate, measured at the character and word levels. We give a short description of the participating groups and their systems, the experimental setup and the observed results.
Thirdly, we present our DIVA-REGIM system (out of competition at ICDAR’2011) along with its results on all of the Arabic recognition competition protocols.
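
The character-level and word-level recognition rates used to compare the systems can be illustrated with a minimal Python sketch. It assumes that the word rate is the fraction of words recognized exactly and that the character rate is one minus the total Levenshtein edit distance divided by the total number of reference characters. The abstract does not state the exact formula used in the competition, so this definition and the function names edit_distance and recognition_rates are illustrative assumptions only.

```python
from typing import Sequence, Tuple


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between a reference and a hypothesis string."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def recognition_rates(references: Sequence[str],
                      hypotheses: Sequence[str]) -> Tuple[float, float]:
    """Return (character_rate, word_rate) under the assumed definitions.

    Assumption: character rate = 1 - total edit distance / total reference
    characters (clipped at 0); word rate = fraction of exact matches.
    """
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references must contain at least one character")
    char_errors = sum(edit_distance(r, h)
                      for r, h in zip(references, hypotheses))
    words_correct = sum(1 for r, h in zip(references, hypotheses) if r == h)
    char_rate = max(0.0, 1.0 - char_errors / total_chars)
    word_rate = words_correct / len(references)
    return char_rate, word_rate


if __name__ == "__main__":
    # Toy ground truth and system output (not taken from APTI).
    refs = ["كتاب", "قلم"]
    hyps = ["كتاب", "قلب"]
    char_rate, word_rate = recognition_rates(refs, hyps)
    print(f"character rate: {char_rate:.3f}, word rate: {word_rate:.3f}")
```

Because a single wrong character makes the whole word count as an error, the word rate is usually lower than the character rate, which is one reason for reporting both.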

Notes

Acknowledgements

The authors would like to thank all participants in the ICDAR’2011 Arabic Recognition Competition: Multi-Font Multi-Size Digitally Represented Text, as well as the anonymous reviewers, for their comments and suggestions, which have greatly improved the presentation of this work.


Copyright information

© Springer-Verlag London 2012

Authors and Affiliations

  • Fouad Slimane (1)
  • Slim Kanoun (2)
  • Jean Hennebert (1)
  • Rolf Ingold (1)
  • Adel M. Alimi (3)

  1. DIVA Group, Department of Informatics, University of Fribourg, Fribourg, Switzerland
  2. National School of Engineers (ENIS), University of Sfax, Sfax, Tunisia
  3. REGIM Group, National School of Engineers (ENIS), University of Sfax, Sfax, Tunisia
