Abstract
This paper deals with the task of practical and open source Handwritten Text Recognition (HTR) on German medieval manuscripts. We report on our efforts to construct mixed recognition models which can be applied out-of-the-box without any further document-specific training but also serve as a starting point for finetuning by training a new model on a few pages of transcribed text (ground truth). To train the mixed models we collected a corpus of 35 manuscripts and ca. 12.5k text lines for two widely used handwriting styles, Gothic and Bastarda cursives. Evaluating the mixed models out-of-the-box on four unseen manuscripts resulted in an average Character Error Rate (CER) of 6.22%. After training on 2, 4 and eventually 32 pages the CER dropped to 3.27%, 2.58%, and 1.65%, respectively. While the in-domain recognition and training of models (Bastarda model to Bastarda material, Gothic to Gothic) unsurprisingly yielded the best results, finetuning out-of-domain models to unseen scripts was still shown to be superior to training from scratch. Our new mixed models have been made openly available to the community.
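The Character Error Rate reported above is commonly computed as the edit distance between recognized text and ground truth, divided by the number of ground truth characters. The following minimal sketch assumes that definition, with unit-cost insertions, deletions, and substitutions; the paper's own evaluation tooling may differ in details such as whitespace handling or Unicode normalization. The function names `levenshtein`, `cer`, and `relative_reduction` are illustrative and not taken from the paper.

```python
from typing import Iterable, Tuple


def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit-cost insertion, deletion and substitution."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def cer(pairs: Iterable[Tuple[str, str]]) -> float:
    """CER over (ground_truth, prediction) line pairs, as a fraction."""
    pairs = list(pairs)
    total_chars = sum(len(gt) for gt, _ in pairs)
    if total_chars == 0:
        raise ValueError("ground truth contains no characters")
    total_errors = sum(levenshtein(gt, pred) for gt, pred in pairs)
    return total_errors / total_chars


def relative_reduction(before: float, after: float) -> float:
    """Fraction of the error removed when going from `before` to `after`."""
    return (before - after) / before
```

Under this definition, `relative_reduction(6.22, 3.27)` expresses how much of the out-of-the-box error is removed by finetuning on two pages, using the CER values stated in the abstract.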
Keywords
- Handwritten text recognition
- Medieval manuscripts
- Mixed models
- Document-specific finetuning
Notes
- 14. In Calamari short notation: conv=40:3×3, pool=2×2, conv=60:3×3, pool=2×2, lstm=200, dropout=0.5.
- 15. In Calamari short notation: conv=40:3×3, pool=2×2, conv=60:3×3, pool=2×2, conv=120:3×3, pool=2×2, lstm=200, lstm=200, lstm=200, dropout=0.5.
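To make the two architectures in notes 14 and 15 easier to compare, the short notation can be read as an ordered list of layers. The sketch below is a hypothetical parser written only for the strings as printed in the notes; it is not Calamari's own parser, and the names `Layer` and `parse_short_notation` are assumptions introduced for illustration. It accepts both "×" and "x" as the separator in kernel and pooling sizes.

```python
from typing import List, NamedTuple, Optional, Tuple, Union


class Layer(NamedTuple):
    kind: str                                # "conv", "pool", "lstm" or "dropout"
    size: Optional[Union[int, float]]        # filters, LSTM units or dropout rate
    window: Optional[Tuple[int, int]]        # kernel or pooling window, if any


def _window(text: str) -> Tuple[int, int]:
    height, width = text.split("x")
    return int(height), int(width)


def parse_short_notation(spec: str) -> List[Layer]:
    """Parse a string such as 'conv=40:3x3, pool=2x2, lstm=200, dropout=0.5'."""
    layers = []
    for token in spec.split(","):
        token = token.strip()
        if not token:
            continue
        kind, _, value = token.partition("=")
        kind = kind.strip()
        value = value.strip().replace("×", "x")
        if kind == "conv":
            filters, _, window = value.partition(":")
            layers.append(Layer("conv", int(filters), _window(window)))
        elif kind == "pool":
            layers.append(Layer("pool", None, _window(value)))
        elif kind == "lstm":
            layers.append(Layer("lstm", int(value), None))
        elif kind == "dropout":
            layers.append(Layer("dropout", float(value), None))
        else:
            raise ValueError(f"unknown layer type: {kind!r}")
    return layers


# The two strings as given in notes 14 and 15.
NOTE_14 = "conv=40:3×3, pool=2×2, conv=60:3×3, pool=2×2, lstm=200, dropout=0.5"
NOTE_15 = ("conv=40:3×3, pool=2×2, conv=60:3×3, pool=2×2, conv=120:3×3, "
           "pool=2×2, lstm=200, lstm=200, lstm=200, dropout=0.5")
```

Parsing both strings shows the difference at a glance: the network in note 15 adds a third convolution and pooling stage and stacks three LSTM layers where the network in note 14 has one.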
Acknowledgement
The authors would like to thank their student research assistants Lisa Gugel, Kiara Hart, Ursula Heß, Annika Müller, and Anne Schmid for their extensive segmentation and transcription work, as well as Maximilian Nöth and Maximilian Wehner for supporting the data preparation.
This work was partially funded by the German Research Foundation (DFG) under project no. 460665940.
Copyright information
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
Reul, C., Tomasek, S., Langhanki, F., Springmann, U. (2022). Open Source Handwritten Text Recognition on Medieval Manuscripts Using Mixed Models and Document-Specific Finetuning. In: Uchida, S., Barney, E., Eglin, V. (eds) Document Analysis Systems. DAS 2022. Lecture Notes in Computer Science, vol 13237. Springer, Cham. https://doi.org/10.1007/978-3-031-06555-2_28
DOI: https://doi.org/10.1007/978-3-031-06555-2_28
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-06554-5
Online ISBN: 978-3-031-06555-2
eBook Packages: Computer Science (R0)
Published in cooperation with
http://www.iapr.org/