
On writer identification for Arabic historical manuscripts

  • Abedelkadir Asi
  • Alaa Abdalhaleem
  • Daniel Fecker
  • Volker Märgner
  • Jihad El-Sana
Original Paper

Abstract

This paper introduces new methodologies for reliably identifying the writers of Arabic historical manuscripts. We propose an approach that transforms key point-based features, such as SIFT, into a global form that captures high-level characteristics of writing styles. We also suggest a modification of a common local feature, the contour direction feature, and show the benefit of combining local and global features for writer identification. In addition, we present a novel algorithm that determines the number of writers involved in writing a given manuscript. The experimental study confirms that this algorithm significantly improves writer identification when applied to historical manuscripts. Comprehensive experiments using different features and classification schemes demonstrate the viability of the suggested methodologies for reliable writer identification. The presented techniques were evaluated on both historical and modern documents, where the suggested features yielded very promising results compared with state-of-the-art features.
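
The abstract does not spell out how the number of writers is determined, so the following is only an illustrative sketch of one plausible approach, not the authors' algorithm. It assumes each page has already been reduced to a numeric feature vector, groups the pages with average-linkage hierarchical clustering, and picks the number of clusters that maximizes the Caliński-Harabasz index. The function names (`estimate_num_writers` and its helpers) and the toy feature vectors are hypothetical.

```python
from itertools import combinations


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def average_linkage_partitions(points):
    """Yield successive partitions (lists of index lists), from n clusters down to 1."""
    clusters = [[i] for i in range(len(points))]
    yield [c[:] for c in clusters]
    while len(clusters) > 1:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            # Average pairwise Euclidean distance between the two clusters.
            total = sum(sq_dist(points[a], points[b]) ** 0.5
                        for a in clusters[i] for b in clusters[j])
            d = total / (len(clusters[i]) * len(clusters[j]))
            if best is None or d < best[0]:
                best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
        yield [c[:] for c in clusters]


def calinski_harabasz(points, clusters):
    """Ratio of between-cluster to within-cluster dispersion (needs 2 <= k <= n - 1)."""
    n, k = len(points), len(clusters)
    overall = centroid(points)
    between = within = 0.0
    for c in clusters:
        members = [points[i] for i in c]
        mu = centroid(members)
        between += len(c) * sq_dist(mu, overall)
        within += sum(sq_dist(p, mu) for p in members)
    if within == 0.0:
        return float("inf")
    return (between / (k - 1)) / (within / (n - k))


def estimate_num_writers(page_features, max_writers=None):
    """Return the cluster count with the highest Calinski-Harabasz index."""
    n = len(page_features)
    max_k = min(max_writers or n - 1, n - 1)
    best_k, best_score = 1, float("-inf")
    for clusters in average_linkage_partitions(page_features):
        k = len(clusters)
        if 2 <= k <= max_k:
            score = calinski_harabasz(page_features, clusters)
            if score > best_score:
                best_k, best_score = k, score
    return best_k


# Toy example: nine "pages" whose 2-D features form three tight groups.
pages = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
         [5.0, 5.0], [5.1, 5.0], [5.0, 5.1],
         [10.0, 0.0], [10.1, 0.0], [10.0, 0.1]]
print(estimate_num_writers(pages))
```

Note that the Caliński-Harabasz index is undefined for a single cluster, so this sketch can never conclude on its own that a manuscript has exactly one writer; a practical system would need a separate test for that case.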

Keywords

Writer identification, Writer retrieval, Key point-based features, Contour-based features, Supervised learning, Hierarchical clustering, Classification

Notes

Acknowledgements

This research was supported in part by the Frankel Center for Computer Science and the Council of Higher Education of Israel. Special thanks to our colleague Ahmad Droby for his suggestion.

Copyright information

© Springer-Verlag GmbH Germany 2017

Authors and Affiliations

  1. Department of Computer Science, Ben-Gurion University of the Negev, Beersheba, Israel
  2. Institute for Communications Technology, Technische Universität Braunschweig, Brunswick, Germany
