Abstract
Automating organizational processes typically requires applying document processing techniques to large document sets. For that purpose, the Intelligent Document Processing (IDP) paradigm has been studied for decades. With the rapid emergence of Robotic Process Automation (RPA) in the process automation landscape, industrial IDP solutions with RPA integration have grown significantly in the last few years. However, there is no up-to-date overview of the available knowledge in this area. Therefore, this chapter studies the current scientific knowledge about IDP and its integration into RPA through a systematic literature review that analyzed 77 primary studies. In addition, an industry review was performed, analyzing and characterizing 37 industrial tools. Although the results confirm growing research interest in IDP along several dimensions, they also reveal a lack of proposals that integrate the IDP and RPA paradigms, in contrast to industrial solutions, which increasingly offer this integration.
This research has been supported by the NICO project (PID2019-105455GB-C31) of the Spanish Ministry of Science, Innovation and Universities and the CODICE project (EXP 00130458/IDI-20210319 - P018-20/E09) of the Center for the Development of Industrial Technology (CDTI) and by the FPU scholarship program, granted by the Spanish Ministry of Education and Vocational Training (FPU20/05984).
Notes
- 1.
- 2.
- 3.
The complete analysis can be found on the sheet titled “Scientific” in the Excel document available at https://doi.org/10.5281/zenodo.6400519.
- 4.
- 5.
- 6.
- 7.
The relationship between these tools and the characteristics described in Table 5.11 is shown on the sheet titled “Industrial” in the Excel document available at https://doi.org/10.5281/zenodo.6400519.
Appendix A: List of Abbreviations
- RPA: Robotic Process Automation
- IDP: Intelligent Document Processing
- AI: Artificial Intelligence
- OCR: Optical Character Recognition
- SLR: Systematic Literature Review
- RQs: Research Questions
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this chapter
Cite this chapter
Martínez-Rojas, A., López-Carnicer, J.M., González-Enríquez, J., Jiménez-Ramírez, A., Sánchez-Oliva, J.M. (2023). Intelligent Document Processing in End-to-End RPA Contexts: A Systematic Literature Review. In: Bhattacharyya, S., Banerjee, J.S., De, D. (eds) Confluence of Artificial Intelligence and Robotic Process Automation. Smart Innovation, Systems and Technologies, vol 335. Springer, Singapore. https://doi.org/10.1007/978-981-19-8296-5_5
DOI: https://doi.org/10.1007/978-981-19-8296-5_5
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-8295-8
Online ISBN: 978-981-19-8296-5
eBook Packages: Intelligent Technologies and Robotics (R0)