Abstract
Processing historical documents often requires first binarizing the image, that is, separating foreground from background. Historical documents are challenging to binarize because of the numerous degradations they suffer, such as bleed-through, illuminations, background degradations and ink drops. In this paper we present a new approach to this task based on a combination of two neural networks. In recent years, the DIBCO binarization competition has seen growing interest in supervised methods for binarizing challenging images. Inspired by the winner of the DIBCO 2017 competition, which uses a fully convolutional neural network (FCN), we propose a combination of two FCNs to obtain better performance. While the two FCNs have the same architecture, they are trained on different representations of the input image. The first one uses a downscaled image to capture the global context and the object locations. The second one works on patches at native resolution and captures the local context, which helps define the character boundaries precisely. The final prediction is obtained by combining the results of the two FCNs. Our experiments show that this strategy provides better results and outperforms the winner of the DIBCO 2017 competition.
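The two-branch idea described in the abstract can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the abstract does not say how the two outputs are combined, so the plain averaging used here is an assumption, as are the downscaling factor, the patch size, the threshold, and every function name. The `toy_model` function is a hypothetical stand-in for a trained FCN.

```python
import numpy as np


def downscale(img, factor):
    """Block-average downscaling by an integer factor (crops any remainder)."""
    h, w = img.shape
    h2, w2 = h // factor, w // factor
    cropped = img[:h2 * factor, :w2 * factor]
    return cropped.reshape(h2, factor, w2, factor).mean(axis=(1, 3))


def upscale_to(prob, shape):
    """Nearest-neighbour upscaling of a probability map to the target shape."""
    rows = np.arange(shape[0]) * prob.shape[0] // shape[0]
    cols = np.arange(shape[1]) * prob.shape[1] // shape[1]
    return prob[np.ix_(rows, cols)]


def predict_patches(img, model, patch):
    """Apply `model` to non-overlapping native-resolution patches and stitch."""
    h, w = img.shape
    out = np.zeros((h, w), dtype=float)
    for y in range(0, h, patch):
        for x in range(0, w, patch):
            tile = img[y:y + patch, x:x + patch]
            out[y:y + patch, x:x + patch] = model(tile)
    return out


def combine_binarize(img, global_model, local_model,
                     factor=4, patch=64, threshold=0.5):
    """Fuse a global (downscaled) and a local (patch-wise) foreground map."""
    # Global branch: the whole page at reduced resolution.
    global_prob = upscale_to(global_model(downscale(img, factor)), img.shape)
    # Local branch: native-resolution patches.
    local_prob = predict_patches(img, local_model, patch)
    # ASSUMPTION: simple averaging; the source does not specify the fusion rule.
    fused = 0.5 * (global_prob + local_prob)
    return fused >= threshold


def toy_model(x):
    """Hypothetical stand-in for a trained FCN: dark pixels score as foreground."""
    return 1.0 - x


# Usage on a synthetic page: white background (1.0) with a dark stroke (0.0).
page = np.ones((32, 32))
page[8:16, 8:24] = 0.0
mask = combine_binarize(page, toy_model, toy_model, factor=4, patch=16)
```

In a real setting the two callables would be separately trained networks, and overlapping patches or a learned fusion step might replace the simple choices made here.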
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Karpinski, R., Belaïd, A. (2019). Combination of Two Fully Convolutional Neural Networks for Robust Binarization. In: Jawahar, C., Li, H., Mori, G., Schindler, K. (eds) Computer Vision – ACCV 2018. ACCV 2018. Lecture Notes in Computer Science, vol 11363. Springer, Cham. https://doi.org/10.1007/978-3-030-20893-6_32
DOI: https://doi.org/10.1007/978-3-030-20893-6_32
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-20892-9
Online ISBN: 978-3-030-20893-6
eBook Packages: Computer Science, Computer Science (R0)