Mathematical Models and Neural Networks for the Description and the Correction of Typical Distortions of Historical Manuscripts

Savino, Pasquale; Tonazzini, Anna

doi:10.1007/978-3-031-37117-2_37

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14108)

Included in the following conference series:

International Conference on Computational Science and Its Applications

Abstract

Historical manuscripts are very often degraded by ink seeping through, or showing through, from the opposite side of the page. Suppressing the interfering text can greatly aid philologists and paleographers who aim to interpret the primary text and, nowadays, also supports automatic text analysis. We previously proposed a data model that approximately describes this damage and used it to generate an artificial training set for teaching a shallow neural network (NN) to classify pixels as clean or corrupted. This NN has proved effective in classifying manuscripts even where the degradation varies widely. In this paper, we modify the architecture of the NN to better account for ink saturation in text overlay areas by including a specific class for these pixels. In our experiments, this significantly improves the classification and, in turn, the restoration.
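
As a rough illustration of what a model-generated training set might look like, the sketch below draws synthetic recto/verso pixel pairs and labels them as clean, bleed-through, or text overlap. It is not the authors' data model: the optical-density mixing rule, the seepage fraction `q`, the intensity levels `ink` and `paper`, the 30% text coverage, and the three-label scheme are all assumptions chosen for illustration only.

```python
import numpy as np


def synth_pairs(n, q_range=(0.2, 0.8), ink=0.1, paper=0.9, noise=0.02, seed=0):
    """Return n synthetic (recto, verso) intensity pairs and recto-side labels.

    Hypothetical label scheme: 0 = clean, 1 = bleed-through, 2 = text overlap.
    """
    rng = np.random.default_rng(seed)

    # Ideal text maps: True where each side carries its own ink (assumed 30% coverage).
    fg_recto = rng.random(n) < 0.3
    fg_verso = rng.random(n) < 0.3

    # Optical density of the uncorrupted sides (0 on bare paper).
    full_density = -np.log(ink / paper)
    d_recto = np.where(fg_recto, full_density, 0.0)
    d_verso = np.where(fg_verso, full_density, 0.0)

    # Assumed mixing rule: each side shows its own ink plus a fraction q
    # of the ink on the opposite side; q varies from sample to sample.
    q = rng.uniform(q_range[0], q_range[1], size=n)
    obs_recto = paper * np.exp(-(d_recto + q * d_verso))
    obs_verso = paper * np.exp(-(d_verso + q * d_recto))

    # Ink saturation: overlapping text cannot get darker than full ink.
    obs_recto = np.maximum(obs_recto, ink)
    obs_verso = np.maximum(obs_verso, ink)

    # Feature vector per pixel: observed recto and verso intensities plus noise.
    X = np.stack([obs_recto, obs_verso], axis=1)
    X = np.clip(X + rng.normal(0.0, noise, size=X.shape), 0.0, 1.0)

    # Labels for the recto side.
    y = np.zeros(n, dtype=int)
    y[~fg_recto & fg_verso] = 1  # verso ink visible on recto background
    y[fg_recto & fg_verso] = 2   # both sides carry text at this pixel
    return X, y


X, y = synth_pairs(10_000)
print(np.bincount(y, minlength=3))  # number of samples per class
```

Pairs generated this way could be fed to any small classifier. Giving overlap pixels their own label is one way to keep saturated overlay areas from being confused with plain foreground or plain bleed-through.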

Author information

Authors and Affiliations

Istituto di Scienza e Tecnologie dell’Informazione, Consiglio Nazionale delle Ricerche, Via G. Moruzzi 1, 56124, Pisa, Italy
Pasquale Savino & Anna Tonazzini

Authors

Pasquale Savino
Anna Tonazzini

Corresponding author

Correspondence to Anna Tonazzini.

Editor information

Editors and Affiliations

University of Perugia, Perugia, Italy
Osvaldo Gervasi
University of Basilicata, Potenza, Italy
Beniamino Murgante
University of Minho, Braga, Portugal
Ana Maria A. C. Rocha
University of Cagliari, Cagliari, Italy
Chiara Garau
University of Basilicata, Potenza, Italy
Francesco Scorza
University of Massachusetts Medical School, Worcester, MA, USA
Yeliz Karaca
Polytechnic University of Bari, Bari, Italy
Carmelo M. Torre

About this paper

Cite this paper

Savino, P., Tonazzini, A. (2023). Mathematical Models and Neural Networks for the Description and the Correction of Typical Distortions of Historical Manuscripts. In: Gervasi, O., et al. Computational Science and Its Applications – ICCSA 2023 Workshops. ICCSA 2023. Lecture Notes in Computer Science, vol 14108. Springer, Cham. https://doi.org/10.1007/978-3-031-37117-2_37

DOI: https://doi.org/10.1007/978-3-031-37117-2_37
Published: 29 June 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-37116-5
Online ISBN: 978-3-031-37117-2
eBook Packages: Computer Science, Computer Science (R0)
