Abstract
Document images are now widely captured by handheld devices such as mobile phones. OCR performance on these images is largely degraded by geometric distortion of the document paper, diverse camera positions, and complex backgrounds. In this paper, we propose a simple yet effective approach that rectifies distorted document images by estimating control points and reference points. We then interpolate between the control points and the reference points to convert the sparse mappings into a backward mapping, and remap the original distorted document image to the rectified image. Furthermore, the control points can be adjusted to facilitate interaction or subsequent correction, and the post-processing method and the number of vertices can be chosen flexibly for different application scenarios. Experiments show that our approach can rectify document images with various distortion types and yields state-of-the-art performance on a real-world dataset. This paper also provides a training dataset based on control points for document dewarping. Both the code and the dataset are released at https://github.com/gwxie/Document-Dewarping-with-Control-Points.
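The interpolate-then-remap step described above can be illustrated with a minimal sketch. It assumes the control points form a regular grid whose reference points are spaced uniformly over the rectified image, and it substitutes plain bilinear interpolation and nearest-neighbour sampling for whatever interpolation the released code uses. The function names `dense_backward_map` and `remap_nearest` are hypothetical and are not taken from the repository.

```python
import numpy as np

def dense_backward_map(control_pts, out_h, out_w):
    """Interpolate a sparse (gh, gw, 2) grid of control points, given as
    (x, y) positions in the distorted image, into an (out_h, out_w, 2)
    backward map. Assumption: reference points lie on a uniform grid
    spanning the rectified image, so they are implicit here."""
    gh, gw, _ = control_pts.shape
    # Position of every output row/column in grid-cell coordinates.
    gy = np.linspace(0, gh - 1, out_h)
    gx = np.linspace(0, gw - 1, out_w)
    y0 = np.clip(np.floor(gy).astype(int), 0, gh - 2)
    x0 = np.clip(np.floor(gx).astype(int), 0, gw - 2)
    wy = (gy - y0)[:, None, None]
    wx = (gx - x0)[None, :, None]
    # Four control points surrounding each output pixel.
    p00 = control_pts[y0][:, x0]
    p01 = control_pts[y0][:, x0 + 1]
    p10 = control_pts[y0 + 1][:, x0]
    p11 = control_pts[y0 + 1][:, x0 + 1]
    top = p00 * (1 - wx) + p01 * wx
    bottom = p10 * (1 - wx) + p11 * wx
    return top * (1 - wy) + bottom * wy

def remap_nearest(image, backward_map):
    """Fill each rectified pixel from the nearest source pixel named by
    the backward map (nearest-neighbour sampling, clamped to the image)."""
    h, w = image.shape[:2]
    xs = np.clip(np.rint(backward_map[..., 0]).astype(int), 0, w - 1)
    ys = np.clip(np.rint(backward_map[..., 1]).astype(int), 0, h - 1)
    return image[ys, xs]

# Toy example: a 6x8 "image" and a 2x2 grid of control points placed at
# its corners, which encodes an identity warp.
image = np.arange(48).reshape(6, 8)
identity_pts = np.array([[[0.0, 0.0], [7.0, 0.0]],
                         [[0.0, 5.0], [7.0, 5.0]]])
rectified = remap_nearest(image, dense_backward_map(identity_pts, 6, 8))
```

Because the map is a backward one (every rectified pixel looks up its source location), the output has no holes. Moving a single control point changes only the grid cells adjacent to it, which is one way to read the abstract's remark that control points allow later adjustment.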
Acknowledgements
This work was supported by the National Key Research and Development Program under Grant 2020AAA0109702 and by the National Natural Science Foundation of China (NSFC) under Grants 61733007 and 61721004.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Xie, GW., Yin, F., Zhang, XY., Liu, CL. (2021). Document Dewarping with Control Points. In: Lladós, J., Lopresti, D., Uchida, S. (eds) Document Analysis and Recognition – ICDAR 2021. ICDAR 2021. Lecture Notes in Computer Science, vol 12821. Springer, Cham. https://doi.org/10.1007/978-3-030-86549-8_30
DOI: https://doi.org/10.1007/978-3-030-86549-8_30
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-86548-1
Online ISBN: 978-3-030-86549-8
eBook Packages: Computer Science (R0)