Abstract
Various information retrieval algorithms have matured in recent years to extract data from structured digital document images (those with a predefined template), primarily to manage and automate organizations' invoice and bill reimbursement processes. These algorithms are either rule-based or machine-learning-based, and each approach has its own advantages and disadvantages. Rule-based algorithms struggle to generalize and need periodic adjustment, whereas supervised machine-learning approaches need extensive training data and substantial time and effort for manual annotation. The proposed system attempts to address both problems with a one-shot training approach that uses image processing, template matching, and optical character recognition. Beyond invoices, the model extends to any structured document, such as closing disclosures, bills, and tax receipts. The model is validated against six different structured document types obtained from a reputed title insurance (TI) company. A comprehensive analysis of the experimental results confirms entity-wise extraction accuracy between 73.91% and 100% and a straight-through pass rate of 81.81%, which is within business-acceptable precision for a live environment. Of the 32 entities tested, 17 outperformed all state-of-the-art techniques, whose maximum accuracy has been 93% and which were tested only on invoices or sales receipts. Based on these experimental results, the system has been put into operation to assist the robotic process automation of the TI company mentioned above.
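To make the template-matching step named in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes grayscale images represented as small nested lists and uses zero-mean normalized cross-correlation as the similarity score, which is one common choice; the abstract does not say which score the system uses. The function names `ncc` and `match_template` and the toy data are hypothetical.

```python
from math import sqrt


def ncc(patch, template):
    """Zero-mean normalized cross-correlation of two equal-sized 2-D grids."""
    a = [v for row in patch for v in row]
    b = [v for row in template for v in row]
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]
    denom = sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0:
        # A flat patch or template has no variance; treat it as "no match".
        return 0.0
    return sum(x * y for x, y in zip(da, db)) / denom


def match_template(image, template):
    """Slide the template over the image; return (row, col, score) of the best match."""
    th, tw = len(template), len(template[0])
    ih, iw = len(image), len(image[0])
    best = (0, 0, -1.0)
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            patch = [row[c:c + tw] for row in image[r:r + th]]
            score = ncc(patch, template)
            if score > best[2]:
                best = (r, c, score)
    return best


# Toy example (assumed data): the template appears once, with its top-left at (1, 1).
image = [
    [0, 0, 0, 0, 0],
    [0, 9, 1, 0, 0],
    [0, 1, 9, 0, 0],
    [0, 0, 0, 0, 0],
]
template = [[9, 1], [1, 9]]
row, col, score = match_template(image, template)
```

In a pipeline of the kind the abstract describes, the matched location would plausibly anchor the region handed to optical character recognition; that wiring is an assumption here and is not shown.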
Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.
Acknowledgements
The authors extend their appreciation to First American India Private Limited and Christ (Deemed to be) University, Bangalore, Karnataka, India, and Indian Institute of Information Technology Kalyani, Kalyani, West Bengal, India.
Funding
There is no funding for this research.
Ethics declarations
Conflict of interest
The authors declare no conflict of interest.
About this article
Cite this article
Guha, A., Samanta, D. & Islam, S.H. IIRM: Intelligent Information Retrieval Model for Structured Documents by One-Shot Training Using Computer Vision. Arab J Sci Eng 48, 1285–1301 (2023). https://doi.org/10.1007/s13369-022-06735-3
DOI: https://doi.org/10.1007/s13369-022-06735-3