IIRM: Intelligent Information Retrieval Model for Structured Documents by One-Shot Training Using Computer Vision

  • Research Article-Computer Engineering and Computer Science
Arabian Journal for Science and Engineering

Abstract

Various information retrieval algorithms have matured in recent years to facilitate data extraction from structured digital document images (those with a predefined template), primarily to manage and automate organizations' invoice and bill reimbursement processes. These algorithms are either rule-based or machine-learning-based, and each approach has its own advantages and disadvantages. Rule-based algorithms struggle to generalize and need periodic adjustment, whereas supervised machine-learning approaches need extensive training data and substantial time and effort for manual annotation. The proposed system attempts to address both problems with a one-shot training approach that uses image processing, template matching, and optical character recognition. The model is extensible to any structured document, such as closing disclosures, bills, and tax receipts, in addition to invoices. The model is validated against six structured document types obtained from a reputed title insurance (TI) company. A comprehensive analysis of the experimental results confirms entity-wise extraction accuracy between 73.91% and 100% and a straight-through pass rate of 81.81%, which is within business-acceptable precision for a live environment. Of the 32 entities tested, 17 outperformed all state-of-the-art techniques, whose maximum accuracy has been 93%, and only on invoices or sales receipts. Based on the experimental results, the system has been put into operation to assist the robotic process automation of the TI company mentioned above.
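
As a rough illustration of the template-matching step named in the abstract, the sketch below locates a small anchor pattern in a grayscale image using zero-mean normalized cross-correlation, then derives a field region from an offset recorded on a single annotated sample. It is a minimal pure-Python sketch under stated assumptions, not the authors' implementation: the function names (`ncc`, `match_template`, `field_region`), the toy image, and the anchor-plus-offset layout are all hypothetical, and the abstract does not say which similarity measure IIRM uses.

```python
from math import sqrt


def ncc(patch, template):
    """Zero-mean normalized cross-correlation of two equal-size 2D lists."""
    p = [v for row in patch for v in row]
    t = [v for row in template for v in row]
    mean_p = sum(p) / len(p)
    mean_t = sum(t) / len(t)
    num = sum((a - mean_p) * (b - mean_t) for a, b in zip(p, t))
    den = sqrt(sum((a - mean_p) ** 2 for a in p) *
               sum((b - mean_t) ** 2 for b in t))
    # A flat patch or template has zero variance; treat it as "no match".
    return num / den if den else 0.0


def match_template(image, template):
    """Slide the template over the image (exhaustive search).

    Returns (best_score, (row, col)) for the top-left corner of the
    best-scoring window.
    """
    th, tw = len(template), len(template[0])
    ih, iw = len(image), len(image[0])
    best_score, best_pos = -2.0, (0, 0)  # NCC scores lie in [-1, 1]
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            patch = [row[c:c + tw] for row in image[r:r + th]]
            score = ncc(patch, template)
            if score > best_score:
                best_score, best_pos = score, (r, c)
    return best_score, best_pos


def field_region(anchor_pos, offset, size):
    """Place a field box relative to a matched anchor.

    Assumption: offset and size were recorded once, on the single
    annotated sample document (the "one shot").
    """
    (ar, ac), (dr, dc), (h, w) = anchor_pos, offset, size
    return (ar + dr, ac + dc, h, w)


# Toy 6x8 "document": all background except a 2x2 anchor at row 2, col 3.
anchor = [[0, 9],
          [9, 0]]
page = [[0] * 8 for _ in range(6)]
page[2][4] = 9
page[3][3] = 9

score, pos = match_template(page, anchor)
# Hypothetical field: starts 2 columns right of the anchor, 1 row by 3 columns.
region = field_region(pos, offset=(0, 2), size=(1, 3))
print(score, pos, region)
```

In a real pipeline the region returned by `field_region` would be cropped from the page image and passed to an OCR engine, and the exhaustive search would normally be replaced by an optimized library routine such as OpenCV's `matchTemplate`; both steps are left out here to keep the sketch self-contained.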

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Acknowledgements

The authors extend their appreciation to First American India Private Limited; Christ (Deemed to be) University, Bangalore, Karnataka, India; and the Indian Institute of Information Technology Kalyani, Kalyani, West Bengal, India.

Funding

There is no funding for this research.

Author information

Corresponding author

Correspondence to SK Hafizul Islam.

Ethics declarations

Conflict of interest

There is no conflict of interest between authors.

About this article

Cite this article

Guha, A., Samanta, D. & Islam, S.H. IIRM: Intelligent Information Retrieval Model for Structured Documents by One-Shot Training Using Computer Vision. Arab J Sci Eng 48, 1285–1301 (2023). https://doi.org/10.1007/s13369-022-06735-3

  • DOI: https://doi.org/10.1007/s13369-022-06735-3
