Abstract
Camera-captured document images are often distorted and warped because of uneven document surfaces or oblique camera angles, and OCR systems have difficulty reading such images. This paper proposes a framework for dewarping these images by estimating how pixel positions change due to the unevenness of the surface. First, the changes in pixel positions are measured using warping factors, which depend on warping position parameters and warping control parameters. The warping control parameters are calculated from the top and bottom text lines of the document. The warping position parameters are estimated using a convolutional neural network (CNN), which needs many images for training. Because capturing such a large number of images is very difficult, we synthetically generated a warped document image dataset. The proposed dewarping technique works for both alphabetic and alphasyllabary scripts. The results on Bangla (alphasyllabary) and English (alphabetic) documents are encouraging.
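As a rough illustration of the idea of generating warped images synthetically and then undoing a known pixel displacement, the following minimal sketch may help. It is not the method of the paper: the abstract does not give the form of the warping factors, so the Gaussian-shaped displacement, the function names `warp_columns` and `dewarp_columns`, and the parameters `position` and `control` are all assumptions chosen only for illustration, and no CNN is involved.

```python
import numpy as np


def warp_columns(image, position, control):
    """Shift every column of a 2-D image vertically by a smooth displacement.

    Assumption (not from the paper): the displacement is a Gaussian bump
    centred at column `position` with a peak of `control` pixels.
    """
    height, width = image.shape
    x = np.arange(width)
    bump = np.exp(-((x - position) / (0.25 * width)) ** 2)
    shift = np.rint(control * bump).astype(int)

    warped = np.zeros_like(image)
    for col in range(width):
        # np.roll is a circular shift, so no pixel is lost and the
        # operation can be inverted exactly.
        warped[:, col] = np.roll(image[:, col], shift[col])
    return warped, shift


def dewarp_columns(warped, shift):
    """Undo warp_columns by applying the opposite shift to each column."""
    restored = np.zeros_like(warped)
    for col in range(warped.shape[1]):
        restored[:, col] = np.roll(warped[:, col], -shift[col])
    return restored


# A toy "page" with one horizontal text line at row 10.
page = np.zeros((40, 60))
page[10, :] = 1

warped, shift = warp_columns(page, position=30, control=5)
restored = dewarp_columns(warped, shift)
print(np.array_equal(restored, page))
```

In this toy setting the displacement is known exactly, so the inverse is trivial. In the setting the abstract describes, the displacement parameters would instead have to be estimated from the image itself.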
About this article
Cite this article
Garai, A., Biswas, S., Mandal, S. et al. Dewarping of document images: A semi-CNN based approach. Multimed Tools Appl 80, 36009–36032 (2021). https://doi.org/10.1007/s11042-021-10507-w
DOI: https://doi.org/10.1007/s11042-021-10507-w