Abstract
While industries are growing stronger through digital transformation, advanced analytics is strengthening them further through data-driven decisions. At the same time, traditional automation is maturing and emerging as cognitive automation. In the era of Industry 4.0, the convergence of business process automation, advanced analytics, and cognitive services has laid a strong foundation for ‘Cognitive Bots’. Enterprises can leverage these powerful technologies and approaches to pursue their business goals through more adaptive, self-learning, and contextual applications. This paper describes one such cognitive bot for the finance department, where invoices are of utmost importance. The bot is intended for amount detection and verification; it can also extract entities such as organization name, location, and date, which contribute substantially to downstream analytics. The application benefits the business by reducing turnaround time and human error. The custom-trained neural model has achieved state-of-the-art accuracy on the current set of learning data. The proposed framework uses Optical Character Recognition and PDFMiner to extract text from scanned invoices. A Quality Classifier rejects handwritten invoices from any further processing. spaCy’s Named Entity Recognition predicts the amount, date, organization name, and location from the extracted unstructured text.
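The amount detection and verification step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: a regular expression stands in for the trained spaCy NER model, and the names `AMOUNT_PATTERN`, `detect_amounts`, `verify_amount`, and the sample invoice text are all hypothetical, chosen only for illustration.

```python
import re
from decimal import Decimal

# Assumption: a regex stands in for the trained NER model. It matches
# amount-like tokens such as "1,250.00" or "980.5" in extracted text.
AMOUNT_PATTERN = re.compile(r"\d{1,3}(?:,\d{3})+(?:\.\d+)?|\d+(?:\.\d+)?")


def detect_amounts(text):
    """Return every amount-like token in the text as a Decimal."""
    return [Decimal(m.replace(",", "")) for m in AMOUNT_PATTERN.findall(text)]


def verify_amount(text, expected):
    """Check whether the expected amount is among the detected amounts."""
    return Decimal(expected) in detect_amounts(text)


# Hypothetical text, as it might come out of the OCR / PDFMiner stage.
sample_text = "Invoice total: 1,250.00 due on receipt"
amounts = detect_amounts(sample_text)
is_verified = verify_amount(sample_text, "1250.00")
```

A plain regex would also pick up digits belonging to dates or invoice numbers, which is one reason a trained entity recognizer that labels amounts, dates, and organizations separately is the better fit for unstructured invoice text.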
Copyright information
© 2021 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Dwivedi, A., Vijayan, P., Gupta, R., Ramdasi, P. (2021). Enhancing Enterprise Business Processes Through AI Based Approach for Entity Extraction – An Overview of an Application. In: Santosh, K.C., Gawali, B. (eds) Recent Trends in Image Processing and Pattern Recognition. RTIP2R 2020. Communications in Computer and Information Science, vol 1380. Springer, Singapore. https://doi.org/10.1007/978-981-16-0507-9_32
DOI: https://doi.org/10.1007/978-981-16-0507-9_32
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-0506-2
Online ISBN: 978-981-16-0507-9
eBook Packages: Computer Science, Computer Science (R0)