An intelligent threats solution for object detection and resource perspective rectification of distorted anomaly identification card images in cloud environments

Applied Intelligence

Abstract

Optical character recognition (OCR) in general, and identity card recognition (IDOCR) in particular, embedded in mobile cameras is attracting growing interest from the research community in Vietnam. Because images are captured by a wide variety of camera devices, IDOCR systems face many difficulties: input images are often distorted, rotated, scaled, translated, or sheared, which leads to poor recognition accuracy. In this paper, we focus on the document image distortion problem in Vietnamese IDOCR systems and propose an effective method to solve it. Our solution includes three main stages: (i) region of interest (ROI) detection; (ii) image segmentation and corner point detection; and (iii) rectification of the distorted image area. The accuracy and execution time of the method are verified on a large amount of data collected from real environments with widely varying lighting conditions, shooting distances, camera angles and image sizes. The experimental results show that our method achieves high accuracy, runs in real time and is able to deal with distorted input images.
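
To make stage (iii) concrete, the following is a minimal sketch of how four detected card corners can be mapped onto an upright rectangle with a projective transform (homography). It is not the authors' implementation, which is not visible in this preview; it is the standard four-point formulation, written in plain Python. The function names `solve_linear`, `homography` and `apply_h`, the example corner coordinates, and the 856 x 540 output size (chosen to approximate the ID-1 card aspect ratio) are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations update both sides together.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate corner configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def homography(src: Sequence[Point], dst: Sequence[Point]) -> List[List[float]]:
    """Return the 3x3 matrix H (with H[2][2] = 1) mapping four src points to dst."""
    if len(src) != 4 or len(dst) != 4:
        raise ValueError("exactly four point pairs are required")
    a: List[List[float]] = []
    b: List[float] = []
    for (x, y), (u, v) in zip(src, dst):
        # From u = (h0 x + h1 y + h2) / (h6 x + h7 y + 1), and likewise for v.
        a.append([x, y, 1.0, 0.0, 0.0, 0.0, -u * x, -u * y])
        b.append(u)
        a.append([0.0, 0.0, 0.0, x, y, 1.0, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_h(h: List[List[float]], p: Point) -> Point:
    """Map a point through H, dividing by the projective coordinate."""
    x, y = p
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)


# Hypothetical corners of a card in a photo, ordered
# top-left, top-right, bottom-right, bottom-left.
photo_corners = [(120.0, 80.0), (610.0, 130.0), (580.0, 420.0), (90.0, 360.0)]
# Assumed target rectangle with roughly the ID-1 aspect ratio.
card_rect = [(0.0, 0.0), (856.0, 0.0), (856.0, 540.0), (0.0, 540.0)]
H = homography(photo_corners, card_rect)
```

In a full pipeline, the rectified image would be produced by resampling every output pixel through the inverse of `H`; image libraries typically provide this warping step, so only the geometric core is sketched here.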

Acknowledgments

This research was supported by the Energy Cloud R&D Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT (2019M3F2A1073387), and by an Institute for Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2018-0-01456, AutoMaTa: Autonomous Management framework based on artificial intelligent Technology for adaptive and disposable IoT). This research was also supported by the project “Research on content-based image retrieval using relevance feedback with sparse representation classification”, code CS21.04, of the Institute of Information Technology (IOIT), Vietnam Academy of Science and Technology (VAST). Any correspondence related to this paper should be addressed to Prof. Dohyeun Kim and Prof. Le Anh Ngoc. We are also thankful to the Vietnam National Technology Program (KC 4.0) and the Electric Power University (EPU) for their sponsorship of this research.

Author information

Corresponding authors

Correspondence to Do Hyeun Kim or Le Anh Ngoc.

Cite this article

Tan, N.T.T., Van Huy, H., Kim, D.H. et al. An intelligent threats solution for object detection and resource perspective rectification of distorted anomaly identification card images in cloud environments. Appl Intell 53, 385–404 (2023). https://doi.org/10.1007/s10489-022-03261-5

  • DOI: https://doi.org/10.1007/s10489-022-03261-5
