An intelligent threats solution for object detection and resource perspective rectification of distorted anomaly identification card images in cloud environments

Tan, Nguyen Thi Thanh; Van Huy, Huynh; Kim, Do Hyeun; Ngoc, Le Anh

doi:10.1007/s10489-022-03261-5

Published: 18 April 2022

Volume 53, pages 385–404, (2023)

Applied Intelligence

Nguyen Thi Thanh Tan¹,
Huynh Van Huy²,
Do Hyeun Kim³ &
…
Le Anh Ngoc ORCID: orcid.org/0000-0003-3515-5443⁴

Abstract

Optical character recognition (OCR) in general, and identity card recognition (IDOCR) in particular, embedded in mobile cameras, is attracting growing interest from the research community in Vietnam. Because images are captured by a wide variety of camera devices, IDOCR systems face many difficulties: input images are often distorted, rotated, scaled, translated, or sheared, which leads to poor recognition accuracy. In this paper, we focus on the document image distortion problem in Vietnamese IDOCR systems and propose an effective method to solve it. Our solution includes three main stages: (i) ROI detection; (ii) image segmentation and corner point detection; (iii) rectification of the distorted image area. The accuracy and execution time of the method are verified on a large amount of data collected from real environments with widely varying lighting conditions, shooting distances, camera angles, and image sizes. The experimental results show that our method achieves high accuracy, runs in real time, and is able to deal with distorted input images.
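The abstract does not say how stage (iii) is computed. One common way to rectify a flat card once its four corner points are known is a four-point perspective (homography) transform, and the sketch below illustrates that general technique only; it is not taken from the paper. The function names, the sample corner coordinates, and the 856 × 540 output size are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate corner configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def homography(src: Sequence[Point], dst: Sequence[Point]) -> List[List[float]]:
    """Return the 3x3 matrix H (with H[2][2] = 1) mapping each src corner to dst."""
    a: List[List[float]] = []
    b: List[float] = []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h11 x + h12 y + h13) / (h31 x + h32 y + 1), and likewise for v.
        a.append([x, y, 1.0, 0.0, 0.0, 0.0, -u * x, -u * y])
        b.append(u)
        a.append([0.0, 0.0, 0.0, x, y, 1.0, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def warp_point(h: List[List[float]], p: Point) -> Point:
    """Apply H to a single point using homogeneous coordinates."""
    x, y = p
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)


# Hypothetical corner points (top-left, top-right, bottom-right, bottom-left),
# as a corner detection stage might report them for a tilted card.
corners = [(120.0, 80.0), (520.0, 110.0), (500.0, 380.0), (90.0, 330.0)]
# Assumed output rectangle; the size is an illustrative choice, not from the paper.
target = [(0.0, 0.0), (856.0, 0.0), (856.0, 540.0), (0.0, 540.0)]
H = homography(corners, target)
```

In a full pipeline the inverse of `H` would typically be used to sample a source pixel for every pixel of the output rectangle; image libraries such as OpenCV provide this as a built-in warp, so the hand-written solver here only serves to make the arithmetic visible.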

Acknowledgments

This research was supported by the Energy Cloud R&D Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Science, ICT (2019M3F2A1073387), and by an Institute for Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2018-0-01456, AutoMaTa: Autonomous Management framework based on artificial intelligent Technology for adaptive and disposable IoT). This research was also supported by the project “Research on content-based image retrieval using relevance feedback with sparse representation classification” (code CS21.04) of the Institute of Information Technology (IOIT), Vietnam Academy of Science and Technology (VAST). Any correspondence related to this paper should be addressed to Prof. Dohyeun Kim and Prof. Le Anh Ngoc. We are also thankful to the VietNam National Technology Program (KC 4.0) and the Electric Power University (EPU) for their sponsorship of this research.

Author information

Authors and Affiliations

Electric Power University, Ha Noi, Vietnam
Nguyen Thi Thanh Tan
Ba Ria Vung Tau University, Ba Ria Vung Tau Province, Vietnam
Huynh Van Huy
Department of Computer Engineering (and Research Center of Advance Technology), Jeju National University, Jeju Special Self-Governing Province 63243, Jeju-si, Republic of Korea
Do Hyeun Kim
Swinburne Vietnam, FPT University, Hanoi, Vietnam
Le Anh Ngoc

Corresponding authors

Correspondence to Do Hyeun Kim or Le Anh Ngoc.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Tan, N.T.T., Van Huy, H., Kim, D.H. et al. An intelligent threats solution for object detection and resource perspective rectification of distorted anomaly identification card images in cloud environments. Appl Intell 53, 385–404 (2023). https://doi.org/10.1007/s10489-022-03261-5

Accepted: 18 January 2022
Published: 18 April 2022
Issue Date: January 2023
DOI: https://doi.org/10.1007/s10489-022-03261-5
