Abstract
Document binarization is a well-studied task in the document image analysis literature that aims to separate ink from the background. Current solutions rely on deep learning, which requires large amounts of annotated data to train robust models. Data augmentation is known to reduce these annotation requirements and can be applied at two stages: during training and during prediction. The latter, known as Test-Time Augmentation (TTA), has been applied successfully to general classification tasks. In this work, we study the application of TTA to binarization, a more complex and specific task. We focus on scenarios with a severe scarcity of annotated data, evaluated on five existing binarization benchmarks. Although the results show some improvement, the gains are rather limited. This suggests that existing TTA strategies are not sufficient for binarization and points to interesting lines of future work to further boost performance.
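To make the idea of TTA for a pixel-wise task concrete, the following is a minimal sketch, not the method evaluated in this work. The abstract does not state which transformations or fusion rule are used, so the flip-based augmentations, the averaging step, the 0.5 threshold, and the names `tta_binarize` and `toy_predict` are all assumptions chosen for illustration. The key point it shows is that, unlike in classification, each prediction must be mapped back to the original geometry before the predictions are combined.

```python
import numpy as np

# Assumed augmentation set: each entry pairs a transform with its inverse.
# Flips are their own inverse, which keeps the sketch simple.
AUGMENTATIONS = [
    (lambda x: x, lambda y: y),          # identity
    (np.fliplr, np.fliplr),              # horizontal flip
    (np.flipud, np.flipud),              # vertical flip
    (lambda x: np.flipud(np.fliplr(x)),  # both flips
     lambda y: np.fliplr(np.flipud(y))),
]


def tta_binarize(image, predict_fn, threshold=0.5):
    """Average per-pixel ink probabilities over augmented views.

    `predict_fn` is a hypothetical model wrapper: it takes a 2D float
    array and returns a same-shaped array of ink probabilities.
    """
    prob_maps = []
    for forward, inverse in AUGMENTATIONS:
        prob = predict_fn(forward(image))  # predict on the augmented view
        prob_maps.append(inverse(prob))    # undo the transform to realign pixels
    mean_prob = np.mean(prob_maps, axis=0)
    return mean_prob > threshold           # True marks ink


def toy_predict(image):
    # Stand-in for a trained network: darker pixels are more likely ink.
    return 1.0 - image


# Tiny grayscale "page" with values in [0, 1], where 0 is black.
page = np.array([[0.9, 0.1, 0.8],
                 [0.2, 0.95, 0.3]])
ink_mask = tta_binarize(page, toy_predict)
```

Because `toy_predict` treats every pixel independently, all four views agree here and TTA changes nothing; any benefit would only appear with a model whose output varies across views.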
This work was supported by the I+D+i project TED2021-132103A-I00 (DOREMI), funded by MCIN/AEI/10.13039/501100011033.
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Rosello, A., Castellanos, F.J., Martinez-Esteso, J.P., Gallego, A.J., Calvo-Zaragoza, J. (2023). Test-Time Augmentation for Document Image Binarization. In: Pertusa, A., Gallego, A.J., Sánchez, J.A., Domingues, I. (eds) Pattern Recognition and Image Analysis. IbPRIA 2023. Lecture Notes in Computer Science, vol 14062. Springer, Cham. https://doi.org/10.1007/978-3-031-36616-1_13
DOI: https://doi.org/10.1007/978-3-031-36616-1_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-36615-4
Online ISBN: 978-3-031-36616-1
eBook Packages: Computer Science, Computer Science (R0)