PapyRow: A Dataset of Row Images from Ancient Greek Papyri for Writers Identification

  • Conference paper
Pattern Recognition. ICPR International Workshops and Challenges (ICPR 2021)

Abstract

Papyrology is the discipline that studies texts written on ancient papyri. An important problem faced by papyrologists, and by paleographers in general, is identifying the writers, also known as scribes, who contributed to the drawing up of a manuscript. Traditionally, paleographers perform qualitative evaluations to distinguish the writers; in recent years, these techniques have been combined with computer-based tools that automatically measure quantities such as the height and width of letters, distances between characters, inclination angles, and the number and types of abbreviations. Recently emerged approaches in digital paleography combine powerful machine learning algorithms with high-quality digital images. Some of these approaches have been used for feature extraction, others to classify writers with machine learning algorithms or deep learning systems. However, traditional techniques require a preliminary feature engineering step that involves an expert in the field. For this reason, publishing a well-labeled dataset is always a challenge and a stimulus for the academic world, as researchers can test their methods and then compare their results from the same starting point. In this paper, we propose a new dataset of handwriting on papyri for the task of writer identification. The dataset is derived directly from the GRK-Papyri dataset, and the samples are obtained by applying image enhancement operations. The paper presents not only the details of the dataset but also the resizing, rotation, background smoothing, and row segmentation operations used to overcome the difficulties posed by the image degradation of this dataset. The dataset is prepared and made freely available for non-commercial research, along with its confirmed ground-truth information related to the task of writer identification.
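The abstract names row segmentation as one of the preprocessing operations but does not say how it is carried out. As an illustration only, the sketch below uses a horizontal projection profile, a common generic technique for splitting a binarized page into text rows; it is an assumption, not the authors' method. The function name `segment_rows` and the parameters `min_ink` and `min_height` are hypothetical.

```python
import numpy as np


def segment_rows(binary, min_ink=1, min_height=2):
    """Split a binarized page into text rows using a projection profile.

    binary: 2-D array where nonzero pixels are ink (assumed input format).
    Returns a list of (top, bottom) image-row indices, bottom exclusive.
    """
    # Count ink pixels on every image row (the horizontal projection profile).
    profile = (np.asarray(binary) != 0).sum(axis=1)
    is_text = profile >= min_ink

    rows = []
    start = None
    for y, flag in enumerate(is_text):
        if flag and start is None:
            start = y  # a run of inked image rows begins
        elif not flag and start is not None:
            if y - start >= min_height:  # drop runs too thin to be a text row
                rows.append((start, y))
            start = None
    # Close a run that reaches the bottom edge of the image.
    if start is not None and len(is_text) - start >= min_height:
        rows.append((start, len(is_text)))
    return rows


# Toy page: two blocks of "ink" separated by blank image rows.
page = np.zeros((12, 20), dtype=np.uint8)
page[1:4, 2:18] = 1
page[6:10, 3:15] = 1
rows = segment_rows(page)
```

On real papyrus images, a profile-based split of this kind would presumably only work after the rotation and background smoothing steps the abstract mentions, since skewed lines and background texture blur the gaps between rows.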

This work was supported in part by the Italian Ministry of University and Research, grant “Dipartimenti di Eccellenza 2018–2022”, and by the Swiss National Science Foundation as part of project no. PZ00P1-174149, “Reuniting fragments, identifying scribes and characterizing scripts: the Digital paleography of Greek and Coptic papyri”.

Notes

  1. An ostracon (Greek term, whose plural is ostraca) is a piece of pottery on which an inscription is engraved. Usually these pieces are fragments of broken pottery vessels, on which inscriptions were subsequently made.

Author information

Corresponding author

Correspondence to Nicole Dalia Cilia.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Cilia, N.D., De Stefano, C., Fontanella, F., Marthot-Santaniello, I., Scotto di Freca, A. (2021). PapyRow: A Dataset of Row Images from Ancient Greek Papyri for Writers Identification. In: Del Bimbo, A., et al. Pattern Recognition. ICPR International Workshops and Challenges. ICPR 2021. Lecture Notes in Computer Science, vol 12667. Springer, Cham. https://doi.org/10.1007/978-3-030-68787-8_16

  • DOI: https://doi.org/10.1007/978-3-030-68787-8_16

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-68786-1

  • Online ISBN: 978-3-030-68787-8

  • eBook Packages: Computer Science, Computer Science (R0)
