Dewarping of document images: A semi-CNN based approach

  • 1171: Real-time 2D/ 3D Image Processing with Deep Learning
Multimedia Tools and Applications

Abstract

Camera-captured document images are often distorted and warped by uneven document surfaces or the camera angle, and OCR systems have difficulty reading such images. This paper proposes a dewarping framework that estimates how pixel positions change because of the unevenness of the surface. The changes in pixel position are first measured using warping factors, which depend on warping positional parameters and warping control parameters. The warping control parameters are calculated from the top and bottom text lines of the document. The warping positional parameters are estimated using a convolutional neural network (CNN), which needs many images for training. Because capturing such a large number of images is very difficult, we synthetically generated a warped document image dataset. The proposed dewarping technique works for both alphabetic and alphasyllabary scripts. The results on Bangla (alphasyllabary) and English (alphabetic) documents are encouraging.
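
The idea of generating warped images synthetically and undoing the warp once its parameters are known can be illustrated with a toy sketch. The abstract does not give the warping-factor formula, so everything below is an assumption made for illustration only: the Gaussian-bump displacement, the parameter names `position`, `control` and `spread`, and the column-wise circular shift are hypothetical and are not the authors' model. In the paper's framework the positional parameters would come from a CNN and the control parameters from the top and bottom text lines; here both are simply passed in by hand.

```python
import numpy as np

def column_shifts(width, position, control, spread):
    """Hypothetical warping factor: a per-column vertical shift in pixels.

    `position` is the column where the bump peaks, `control` is its height,
    and `spread` is its width. This Gaussian form is an assumption.
    """
    x = np.arange(width)
    bump = control * np.exp(-((x - position) ** 2) / (2.0 * spread ** 2))
    return np.rint(bump).astype(int)

def shift_columns(image, shifts):
    """Circularly shift column x of `image` down by shifts[x] rows."""
    out = np.empty_like(image)
    for x, s in enumerate(shifts):
        out[:, x] = np.roll(image[:, x], s)
    return out

def warp(image, position, control, spread=20.0):
    """Create a synthetic warped image from a flat one."""
    shifts = column_shifts(image.shape[1], position, control, spread)
    return shift_columns(image, shifts)

def dewarp(image, position, control, spread=20.0):
    """Undo the synthetic warp by applying the opposite shifts."""
    shifts = column_shifts(image.shape[1], position, control, spread)
    return shift_columns(image, -shifts)

# Toy "page": a blank image with one straight horizontal text line.
page = np.zeros((60, 100), dtype=np.uint8)
page[20:22, :] = 255

warped = warp(page, position=50, control=8.0)
restored = dewarp(warped, position=50, control=8.0)
```

Because the shifts are integer and circular, `dewarp` inverts `warp` exactly when given the same parameters. A real pipeline would need interpolation and would have to estimate the parameters from the warped image itself.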

Author information

Corresponding author

Correspondence to Arpan Garai.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Garai, A., Biswas, S., Mandal, S. et al. Dewarping of document images: A semi-CNN based approach. Multimed Tools Appl 80, 36009–36032 (2021). https://doi.org/10.1007/s11042-021-10507-w

  • DOI: https://doi.org/10.1007/s11042-021-10507-w
