IIRM: Intelligent Information Retrieval Model for Structured Documents by One-Shot Training Using Computer Vision

Guha, Abhijit; Samanta, Debabrata; Islam, SK Hafizul

doi:10.1007/s13369-022-06735-3

Research Article - Computer Engineering and Computer Science
Published: 31 March 2022

Volume 48, pages 1285–1301 (2023)
Arabian Journal for Science and Engineering

Abstract

Various information retrieval algorithms have matured in recent years to facilitate data extraction from structured digital document images (those with a predefined template), primarily to manage and automate organizations’ invoice and bill reimbursement processes. These algorithms are either rule-based or machine-learning-based, and each approach has its own advantages and disadvantages. Rule-based algorithms struggle to generalize and need periodic adjustment, whereas supervised machine-learning approaches need extensive training data and substantial time and effort for manual annotation. The proposed system attempts to address both problems with a one-shot training approach that uses image processing, template matching, and optical character recognition. The model extends to any structured document, such as closing disclosures, bills, and tax receipts, in addition to invoices. It is validated against six structured document types obtained from a reputed title insurance (TI) company. Comprehensive analysis of the experimental results confirms entity-wise extraction accuracy between 73.91% and 100% and a straight-through pass rate of 81.81%, which is within business-acceptable precision for a live environment. Of the 32 entities tested, 17 outperformed all state-of-the-art techniques, whose maximum reported accuracy is 93% and which cover only invoices or sales receipts. Based on these experimental results, the system has been put into operation to assist the robotic process automation of the TI company mentioned above.
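
The abstract names template matching as one ingredient of the one-shot approach but gives no implementation detail. The following is a minimal sketch of the general idea only, not the authors' method: a single annotated example supplies an anchor patch and the position of a value box relative to it, and zero-mean normalized cross-correlation locates the anchor on a new page. The function names (`ncc`, `match_template`, `value_region`), the toy pixel grid, and the offset and size values are all assumptions made for illustration.

```python
from math import sqrt

def ncc(patch, template):
    """Zero-mean normalized cross-correlation of two equal-sized 2D lists."""
    a = [v for row in patch for v in row]
    b = [v for row in template for v in row]
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = sqrt(sum((x - mean_a) ** 2 for x in a) *
               sum((y - mean_b) ** 2 for y in b))
    # A constant patch has zero variance; treat it as "no correlation".
    return num / den if den else 0.0

def match_template(image, template):
    """Slide the template over the image; return (best_score, row, col)."""
    th, tw = len(template), len(template[0])
    best = (-2.0, 0, 0)  # NCC lies in [-1, 1], so -2 is below any real score
    for r in range(len(image) - th + 1):
        for c in range(len(image[0]) - tw + 1):
            patch = [row[c:c + tw] for row in image[r:r + th]]
            score = ncc(patch, template)
            if score > best[0]:
                best = (score, r, c)
    return best

def value_region(image, template, offset, size):
    """Find the anchor, then crop the value box at a fixed offset from it."""
    score, r, c = match_template(image, template)
    top, left = r + offset[0], c + offset[1]
    crop = [row[left:left + size[1]] for row in image[top:top + size[0]]]
    return score, (top, left), crop

# Toy "page": the anchor pattern sits at row 1, column 2 (hypothetical data).
image = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 9, 0, 5, 6, 7, 0],
    [0, 0, 0, 9, 1, 2, 3, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]
template = [[9, 0], [0, 9]]  # anchor patch taken from the single example

# Assumed layout from the one annotated example: the value box starts
# 0 rows down and 2 columns right of the anchor and is 2 x 3 pixels.
score, origin, crop = value_region(image, template, offset=(0, 2), size=(2, 3))
print(score, origin, crop)
```

In a pipeline of the kind the abstract describes, the cropped region would presumably be handed to an optical character recognition step; that step is omitted here because the abstract does not say which engine or preprocessing is used.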

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Acknowledgements

The authors extend their appreciation to First American India Private Limited; Christ (Deemed to be) University, Bangalore, Karnataka, India; and the Indian Institute of Information Technology Kalyani, Kalyani, West Bengal, India.

Funding

There is no funding for this research.

Author information

Authors and Affiliations

Department of Computer Science, Christ (Deemed to be) University, Bangalore, Karnataka, 560029, India
Abhijit Guha & Debabrata Samanta
First American India Private Limited, Bangalore, Karnataka, 560038, India
Abhijit Guha
Department of Computer Science and Engineering, Indian Institute of Information Technology Kalyani, Kalyani, West Bengal, 741235, India
SK Hafizul Islam

Corresponding author

Correspondence to SK Hafizul Islam.

Ethics declarations

Conflict of interest

The authors declare no conflict of interest.

About this article

Cite this article

Guha, A., Samanta, D. & Islam, S.H. IIRM: Intelligent Information Retrieval Model for Structured Documents by One-Shot Training Using Computer Vision. Arab J Sci Eng 48, 1285–1301 (2023). https://doi.org/10.1007/s13369-022-06735-3

Received: 24 September 2021
Accepted: 17 February 2022
Published: 31 March 2022
Issue Date: February 2023
DOI: https://doi.org/10.1007/s13369-022-06735-3
