Abstract
Papyrology is the discipline that studies texts written on ancient papyri. An important problem faced by papyrologists and, more generally, by paleographers is identifying the writers, also known as scribes, who contributed to the drawing up of a manuscript. Traditionally, paleographers perform qualitative evaluations to distinguish the writers. In recent years, these techniques have been combined with computer-based tools that automatically measure quantities such as the height and width of letters, distances between characters, inclination angles, and the number and types of abbreviations. Recently emerged approaches in digital paleography combine powerful machine learning algorithms with high-quality digital images. Some of these approaches have been used for feature extraction, others to classify writers with machine learning algorithms or deep learning systems. However, traditional techniques require a preliminary feature engineering step that involves an expert in the field. For this reason, publishing a well-labeled dataset is always both a challenge and a stimulus for the academic world, as researchers can test their methods and compare their results from the same starting point. In this paper, we propose a new dataset of handwriting on papyri for the task of writer identification. The dataset is derived directly from the GRK-Papyri dataset, and its samples are obtained by applying image enhancement operations. The paper presents not only the details of the dataset but also the resizing, rotation, background smoothing, and row segmentation operations used to overcome the difficulties posed by the image degradation of this dataset. The dataset is made freely available for non-commercial research, along with its confirmed ground-truth information for the task of writer identification.
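The abstract names row segmentation as one of the preprocessing operations but does not say how it is carried out. As an illustration only, the sketch below uses a horizontal projection profile, a common text-line segmentation technique; it is not claimed to be the procedure used to build the dataset. The function name `segment_rows`, its parameters, and the toy page are hypothetical, and a real pipeline would first load and binarize an image with an imaging library.

```python
def segment_rows(ink, min_ink=1, min_height=2):
    """Return (top, bottom) pixel ranges of text rows in a binary image.

    ink: list of pixel lines, each a list with nonzero values where ink is.
    min_ink: minimum ink pixels for a pixel line to count as text (assumed).
    min_height: bands shorter than this are treated as noise (assumed).
    """
    # Horizontal projection profile: number of ink pixels per pixel line.
    profile = [sum(1 for p in line if p) for line in ink]
    rows, start = [], None
    for y, count in enumerate(profile):
        if count >= min_ink and start is None:
            start = y                      # a text band begins
        elif count < min_ink and start is not None:
            if y - start >= min_height:    # keep only bands tall enough
                rows.append((start, y))
            start = None
    # Close a band that runs to the bottom edge of the image.
    if start is not None and len(profile) - start >= min_height:
        rows.append((start, len(profile)))
    return rows


# Toy 12 x 20 "page": two text rows and a one-pixel speck at the bottom.
page = [[0] * 20 for _ in range(12)]
for y in range(1, 4):
    for x in range(2, 18):
        page[y][x] = 1
for y in range(6, 10):
    for x in range(3, 15):
        page[y][x] = 1
page[11][5] = 1

rows = segment_rows(page)
print(rows)
```

Each returned pair is a half-open range of pixel lines, so a row image could be cropped as `page[top:bottom]`. Degraded papyri with skewed or touching lines would need more than this simple thresholding, which is presumably why rotation and background smoothing are listed alongside segmentation.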
This work was supported in part by the Italian Ministry of University and Research, grant “Dipartimenti di Eccellenza 2018–2022”, and by the Swiss National Science Foundation as part of project no. PZ00P1-174149, “Reuniting fragments, identifying scribes and characterizing scripts: the Digital paleography of Greek and Coptic papyri”.
Notes
1. An ostracon (Greek term, whose plural is ostraca) is a piece of pottery on which an inscription is engraved. Usually these pieces are fragments of broken pottery vessels, on which inscriptions were subsequently made.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Cilia, N.D., De Stefano, C., Fontanella, F., Marthot-Santaniello, I., Scotto di Freca, A. (2021). PapyRow: A Dataset of Row Images from Ancient Greek Papyri for Writers Identification. In: Del Bimbo, A., et al. Pattern Recognition. ICPR International Workshops and Challenges. ICPR 2021. Lecture Notes in Computer Science, vol 12667. Springer, Cham. https://doi.org/10.1007/978-3-030-68787-8_16
DOI: https://doi.org/10.1007/978-3-030-68787-8_16
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-68786-1
Online ISBN: 978-3-030-68787-8
eBook Packages: Computer Science, Computer Science (R0)