Abstract
The large number of document-based processes has created a growing need for automated systems that can reliably digitize the text in forms. For example, the text in scanned administrative forms is not accessible until it has been converted from pixels to editable text. Against this background, many organizations rely on Optical Character Recognition (OCR) to support the digitization of text in documents. However, there is still a lack of integrated OCR approaches that handle both handwritten and machine-printed text, both of which are of major importance when digitizing forms. To address this problem, we propose a new hybrid OCR approach that recognizes handwritten and machine-printed text in an integrated manner, based on neural networks. We demonstrate the practical applicability of our approach by applying it successfully to publicly available forms. Finally, we evaluate our hybrid approach against existing state-of-the-art approaches.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Graef, R., Morsy, M.M.N. (2019). A Novel Hybrid Optical Character Recognition Approach for Digitizing Text in Forms. In: Tulu, B., Djamasbi, S., Leroy, G. (eds) Extending the Boundaries of Design Science Theory and Practice. DESRIST 2019. Lecture Notes in Computer Science, vol 11491. Springer, Cham. https://doi.org/10.1007/978-3-030-19504-5_14
DOI: https://doi.org/10.1007/978-3-030-19504-5_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-19503-8
Online ISBN: 978-3-030-19504-5
eBook Packages: Computer Science, Computer Science (R0)