Multi-Type-TD-TSR – Extracting Tables from Document Images Using a Multi-stage Pipeline for Table Detection and Table Structure Recognition: From OCR to Structured Table Representations

Fischer, Pascal; Smajic, Alen; Abrami, Giuseppe; Mehler, Alexander

doi:10.1007/978-3-030-87626-5_8

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12873)

Included in the following conference series:

German Conference on Artificial Intelligence (Künstliche Intelligenz)

Abstract

As global trends shift towards data-driven industries, the demand for automated algorithms that can convert images of scanned documents into machine-readable information is growing rapidly. Beyond digitization, there is a push toward automating processes that used to require manual inspection of documents. Although optical character recognition (OCR) technologies have mostly solved the task of recognizing human-readable characters in images, table extraction has received less attention. Table recognition consists of two sub-tasks: table detection and table structure recognition. Most prior work focuses on one of the two tasks without offering an end-to-end solution or paying attention to real application conditions such as rotated images or noise artefacts. Recent work shows a clear trend towards deep learning, with transfer learning used for table structure recognition because sufficiently large datasets are lacking. We present a multistage pipeline named Multi-Type-TD-TSR, which offers an end-to-end solution for table recognition. It utilizes state-of-the-art deep learning models and differentiates between three types of tables based on their borders. For table structure recognition we use a deterministic, non-data-driven algorithm that works on all three types. In addition, we present one algorithm for non-bordered tables and one for bordered tables, which together form the basis of our table structure recognition algorithm. We evaluate Multi-Type-TD-TSR on a self-annotated subset of the ICDAR 2019 table structure recognition dataset [5] and achieve a new state of the art. Source code is available at https://github.com/Psarpei/Multi-Type-TD-TSR.
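To make the idea of a deterministic, non-data-driven structure recognition step concrete, the following is a minimal sketch for a non-bordered table. It is not the authors' algorithm: it assumes that an OCR step has already produced word bounding boxes, and it infers rows and columns purely from the gaps between the boxes' horizontal and vertical extents. The names `merge_intervals`, `recognize_structure`, and the `gap` parameter are hypothetical and chosen for illustration only.

```python
from typing import List, Tuple

# Hypothetical input format: one (x0, y0, x1, y1) box per OCR word.
Box = Tuple[int, int, int, int]
Interval = Tuple[int, int]


def merge_intervals(intervals: List[Interval], gap: int = 0) -> List[Interval]:
    """Merge 1-D intervals that overlap or lie within `gap` of each other."""
    merged: List[List[int]] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1] + gap:
            # Extend the current run instead of opening a new one.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(s, e) for s, e in merged]


def recognize_structure(boxes: List[Box], gap: int = 0):
    """Return (rows, cols, grid); grid[r][c] lists the indices of boxes in that cell.

    Columns are the merged horizontal extents of all boxes, rows the merged
    vertical extents. No training data is involved (assumption: the table is
    axis-aligned and cells do not span several rows or columns).
    """
    cols = merge_intervals([(b[0], b[2]) for b in boxes], gap)
    rows = merge_intervals([(b[1], b[3]) for b in boxes], gap)
    grid = [[[] for _ in cols] for _ in rows]
    for i, (x0, y0, x1, y1) in enumerate(boxes):
        # Every box lies inside exactly one merged interval per axis.
        r = next(k for k, (s, e) in enumerate(rows) if s <= y0 and y1 <= e)
        c = next(k for k, (s, e) in enumerate(cols) if s <= x0 and x1 <= e)
        grid[r][c].append(i)
    return rows, cols, grid


if __name__ == "__main__":
    words = [(10, 10, 50, 20), (100, 10, 160, 20),
             (10, 40, 40, 50), (105, 40, 150, 50)]
    rows, cols, grid = recognize_structure(words)
    print(rows, cols, grid)
```

A rotated scan would break the axis-aligned assumption, which is one reason a pipeline of this kind would need a skew-correction step before structure recognition.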

References

1. Badrinarayanan, V., Kendall, A., Cipolla, R.: SegNet: a deep convolutional encoder-decoder architecture for image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 39(12), 2481–2495 (2017)
2. Bühler, B., Paulheim, H.: Web table classification based on visual features. CoRR abs/2103.05110 (2021). https://arxiv.org/abs/2103.05110
3. Cohen, W.W., Hurst, M., Jensen, L.S.: A flexible learning system for wrapping tables and lists in HTML documents. In: Proceedings of the 11th International Conference on World Wide Web, WWW 2002, pp. 232–241. Association for Computing Machinery, New York (2002). https://doi.org/10.1145/511446.511477
4. Cortes, C., Vapnik, V.: Support vector machine. Mach. Learn. 20(3), 273–297 (1995)
5. Gao, L., et al.: ICDAR 2019 competition on table detection and recognition (CTDAR). In: 2019 International Conference on Document Analysis and Recognition (ICDAR), pp. 1510–1515 (2019). https://doi.org/10.1109/ICDAR.2019.00243
6. Gatterbauer, W., Bohunsky, P., Herzog, M., Krüpl, B., Pollak, B.: Towards domain-independent information extraction from web tables. In: Proceedings of the 16th International Conference on World Wide Web, WWW 2007, pp. 71–80. Association for Computing Machinery, New York (2007). https://doi.org/10.1145/1242572.1242583
7. Gilani, A., Qasim, S.R., Malik, I., Shafait, F.: Table detection using deep learning. In: 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), vol. 1, pp. 771–776. IEEE (2017)
8. Goodfellow, I.J., et al.: Generative adversarial networks. arXiv preprint arXiv:1406.2661 (2014)
9. Göbel, M., Hassan, T., Oro, E., Orsi, G.: ICDAR 2013 table competition. In: 2013 12th International Conference on Document Analysis and Recognition, pp. 1449–1453 (2013). https://doi.org/10.1109/ICDAR.2013.292
10. Kasar, T., Barlas, P., Adam, S., Chatelain, C., Paquet, T.: Learning to detect tables in scanned document images using line information. In: 2013 12th International Conference on Document Analysis and Recognition, pp. 1185–1189. IEEE (2013)
11. Kieninger, T., Dengel, A.: The T-Recs table recognition and analysis system. In: Lee, S.-W., Nakano, Y. (eds.) DAS 1998. LNCS, vol. 1655, pp. 255–270. Springer, Heidelberg (1999). https://doi.org/10.1007/3-540-48172-9_21
12. Li, M., Cui, L., Huang, S., Wei, F., Zhou, M., Li, Z.: TableBank: table benchmark for image-based table detection and recognition. In: Proceedings of the 12th Language Resources and Evaluation Conference, pp. 1918–1925. European Language Resources Association, Marseille (2020). https://www.aclweb.org/anthology/2020.lrec-1.236
13. Lu, T., Dooms, A.: Probabilistic homogeneity for document image segmentation. Pattern Recognit. 109, 1–14 (2021). https://doi.org/10.1016/j.patcog.2020.107591
14. Prasad, D., Gadpal, A., Kapadni, K., Visave, M., Sultanpure, K.: CascadeTabNet: an approach for end to end table detection and structure recognition from image-based documents. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pp. 572–573 (2020)
15. Pyreddi, P., Croft, W.B.: A system for retrieval in text tables. In: ACM DL (1997)
16. Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. arXiv preprint arXiv:1506.01497 (2015)
17. Reza, M.M., Bukhari, S.S., Jenckel, M., Dengel, A.: Table localization and segmentation using GAN and CNN. In: 2019 International Conference on Document Analysis and Recognition Workshops (ICDARW), vol. 5, pp. 152–157. IEEE (2019)
18. Riba, P., Dutta, A., Goldmann, L., Fornés, A., Ramos, O., Lladós, J.: Table detection in invoice documents by graph neural networks. In: 2019 International Conference on Document Analysis and Recognition (ICDAR), pp. 122–127 (2019). https://doi.org/10.1109/ICDAR.2019.00028
19. Rosebrock, A.: Text skew correction with OpenCV and Python (2017). pyImageSearch, https://www.pyimagesearch.com/2017/02/20/text-skew-correction-opencv-python/. Accessed 17 Feb 2021
20. Schreiber, S., Agne, S., Wolf, I., Dengel, A., Ahmed, S.: DeepDeSRT: deep learning for detection and structure recognition of tables in document images. In: 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), vol. 1, pp. 1162–1167. IEEE (2017)
21. Seo, W., Koo, H.I., Cho, N.I.: Junction-based table detection in camera-captured document images. Int. J. Doc. Anal. Recognit. (IJDAR) 18, 1–11 (2014). https://doi.org/10.1007/s10032-014-0226-7
22. Subramanyam, V.S.: IoU (intersection over union) (2017). Medium. https://medium.com/analytics-vidhya/iou-intersection-over-union-705a39e7acef. Accessed 09 July 2021
23. Xie, S., Girshick, R., Dollár, P., Tu, Z., He, K.: Aggregated residual transformations for deep neural networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1492–1500 (2017)

Author information

Authors and Affiliations

Text Technology Lab, Goethe University Frankfurt, Frankfurt, Germany
Pascal Fischer, Alen Smajic, Giuseppe Abrami & Alexander Mehler

Corresponding author

Correspondence to Pascal Fischer.

Editor information

Editors and Affiliations

Czech Technical University in Prague, Prague, Czech Republic
Stefan Edelkamp
University of Lübeck, Lübeck, Germany
Ralf Möller
University of Leoben, Leoben, Austria
Elmar Rueckert

About this paper

Cite this paper

Fischer, P., Smajic, A., Abrami, G., Mehler, A. (2021). Multi-Type-TD-TSR – Extracting Tables from Document Images Using a Multi-stage Pipeline for Table Detection and Table Structure Recognition: From OCR to Structured Table Representations. In: Edelkamp, S., Möller, R., Rueckert, E. (eds) KI 2021: Advances in Artificial Intelligence. KI 2021. Lecture Notes in Computer Science, vol 12873. Springer, Cham. https://doi.org/10.1007/978-3-030-87626-5_8

DOI: https://doi.org/10.1007/978-3-030-87626-5_8
Published: 30 September 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-87625-8
Online ISBN: 978-3-030-87626-5
eBook Packages: Computer Science, Computer Science (R0)
