Learning Features for Writer Identification from Handwriting on Papyri

  • Conference paper
Pattern Recognition and Artificial Intelligence (MedPRAI 2020)

Abstract

Computerized analysis of historical documents has been an active research area in the pattern classification community for many decades. Key challenges in analyzing historical manuscripts include automatic transcription, dating, retrieval, classification of writing styles, and identification of scribes. Among these, our study focuses on identifying writers from digitized manuscripts. We use convolutional neural networks (ConvNets) to extract features that characterize the writer. The ConvNets are first trained on contemporary handwriting samples and then fine-tuned on the limited set of historical manuscripts considered in our study. Each manuscript is densely sampled to produce a set of small writing patches. Patch-level decisions are combined by majority vote to determine the authorship of a query document. Preliminary experiments on a set of challenging, degraded manuscripts show promising performance.
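
The patch-and-vote stage described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the patch size, the stride, all function names, and the `classify_patch` callable (a stand-in for the fine-tuned ConvNet) are hypothetical and do not come from the paper.

```python
from collections import Counter

import numpy as np


def dense_patches(image, patch_size=64, stride=32):
    """Slide a square window over a 2-D grayscale image.

    patch_size and stride are assumed values, not taken from the paper.
    Only windows that fit entirely inside the image are returned.
    """
    height, width = image.shape
    return [
        image[top:top + patch_size, left:left + patch_size]
        for top in range(0, height - patch_size + 1, stride)
        for left in range(0, width - patch_size + 1, stride)
    ]


def majority_vote(patch_labels):
    """Return the label predicted most often across the patches."""
    if not patch_labels:
        raise ValueError("no patch predictions to vote on")
    return Counter(patch_labels).most_common(1)[0][0]


def identify_writer(image, classify_patch, patch_size=64, stride=32):
    """Attribute a document image to a writer.

    classify_patch is a hypothetical callable mapping one patch to a
    writer label; in practice it would wrap a trained network.
    """
    patches = dense_patches(image, patch_size, stride)
    return majority_vote([classify_patch(patch) for patch in patches])
```

With these assumed settings, a 128 x 128 image yields a 3 x 3 grid of overlapping patches, and the document is assigned to whichever writer label the patch classifier returns most often.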

Acknowledgement

The authors would like to thank Dr. Isabelle Marthot-Santaniello from the University of Basel, Switzerland, for making the dataset available.

Author information

Correspondence to Sidra Nasir or Imran Siddiqi.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Nasir, S., Siddiqi, I. (2021). Learning Features for Writer Identification from Handwriting on Papyri. In: Djeddi, C., Kessentini, Y., Siddiqi, I., Jmaiel, M. (eds) Pattern Recognition and Artificial Intelligence. MedPRAI 2020. Communications in Computer and Information Science, vol 1322. Springer, Cham. https://doi.org/10.1007/978-3-030-71804-6_17

  • DOI: https://doi.org/10.1007/978-3-030-71804-6_17

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-71803-9

  • Online ISBN: 978-3-030-71804-6

  • eBook Packages: Computer Science (R0)
