Abstract
Historical manuscripts are very often degraded by ink seeping through, or showing through, from the opposite side of the page. Suppressing this interfering text can greatly aid philologists and paleographers who aim to interpret the primary text and, nowadays, also the automatic analysis of the text. We previously proposed a data model that approximately describes this damage and used it to generate an artificial training set for teaching a shallow neural network (NN) to classify pixels as clean or corrupted. This NN has proved effective at classifying manuscripts even where the degradation varies widely. In this paper, we modify the architecture of the NN to better account for ink saturation in areas where the texts overlap, by including a specific class for these pixels. The experiments show a significant improvement in the classification and, consequently, in the restoration.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Savino, P., Tonazzini, A. (2023). Mathematical Models and Neural Networks for the Description and the Correction of Typical Distortions of Historical Manuscripts. In: Gervasi, O., et al. Computational Science and Its Applications – ICCSA 2023 Workshops. ICCSA 2023. Lecture Notes in Computer Science, vol 14108. Springer, Cham. https://doi.org/10.1007/978-3-031-37117-2_37
DOI: https://doi.org/10.1007/978-3-031-37117-2_37
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-37116-5
Online ISBN: 978-3-031-37117-2
eBook Packages: Computer Science (R0)