TabAug: Data Driven Augmentation for Enhanced Table Structure Recognition

Khan, Umar; Zahid, Sohaib; Ali, Muhammad Asad; Ul-Hasan, Adnan; Shafait, Faisal

doi:10.1007/978-3-030-86331-9_38

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12822)

Included in the following conference series:

International Conference on Document Analysis and Recognition

Abstract

Table structure recognition is an essential part of end-to-end tabular data extraction from document images. The recent success of deep learning architectures in computer vision has not yet carried over to table structure recognition, largely because extensive datasets for this domain are still unavailable and annotating new data is expensive and time-consuming. Traditionally, in computer vision, these challenges are addressed by standard augmentation techniques based on image transformations such as color jittering and random cropping. As our experiments demonstrate, these techniques are not effective for table structure recognition. In this paper, we propose TabAug, a re-imagined data augmentation technique that produces structural changes in table images through replication and deletion of rows and columns. It also includes a data-driven probabilistic model that allows control over the augmentation process. To demonstrate the efficacy of our approach, we perform experiments on the ICDAR 2013 dataset, where our approach shows consistent improvements in all aspects of the evaluation metrics, with cell-level correct detections improving from 92.16% to 96.11% over the baseline.
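
The abstract states only that the augmentation replicates and deletes rows and columns. As a rough illustration of that idea, and not the authors' implementation, the sketch below applies it to a toy image stored as a list of pixel rows. All names (`replicate_row`, `delete_row`, `transpose`, `augment_rows`) are hypothetical. The half-open `(top, bottom)` row spans, sorted from top to bottom, are an assumed annotation format, and the fixed probabilities `p_replicate` and `p_delete` are placeholders standing in for the data-driven probabilistic model, whose details the abstract does not give.

```python
import random


def replicate_row(image, spans, idx):
    """Duplicate table row `idx` directly below itself.

    `image` is a list of pixel rows; `spans` is a sorted list of
    half-open (top, bottom) pixel ranges, one per table row (assumed format).
    """
    top, bottom = spans[idx]
    height = bottom - top
    copied = [list(r) for r in image[top:bottom]]
    new_image = image[:bottom] + copied + image[bottom:]
    # The copy gets its own span; every later row shifts down by `height`.
    new_spans = (
        spans[: idx + 1]
        + [(bottom, bottom + height)]
        + [(t + height, b + height) for t, b in spans[idx + 1:]]
    )
    return new_image, new_spans


def delete_row(image, spans, idx):
    """Remove table row `idx`; every later row shifts up by its height."""
    top, bottom = spans[idx]
    height = bottom - top
    new_image = image[:top] + image[bottom:]
    new_spans = spans[:idx] + [(t - height, b - height) for t, b in spans[idx + 1:]]
    return new_image, new_spans


def transpose(image):
    """Swap axes so the row operations above can be reused for columns."""
    return [list(col) for col in zip(*image)]


def augment_rows(image, spans, p_replicate, p_delete, rng):
    """Apply at most one structural change, chosen at random.

    The two probabilities are hand-set placeholders (an assumption),
    not the probabilistic model described in the paper.
    """
    r = rng.random()
    if r < p_replicate:
        return replicate_row(image, spans, rng.randrange(len(spans)))
    if r < p_replicate + p_delete and len(spans) > 1:
        return delete_row(image, spans, rng.randrange(len(spans)))
    return image, spans


# Toy 4x2 "image" with three table rows of heights 1, 2 and 1.
image = [[0, 0], [1, 1], [1, 1], [2, 2]]
spans = [(0, 1), (1, 3), (3, 4)]
aug_image, aug_spans = augment_rows(image, spans, 0.4, 0.4, random.Random(0))
```

To change columns instead, one could transpose the image, apply the same functions to the column spans, and transpose back. Because the spans are updated together with the pixels, the structure annotation stays consistent with the augmented image.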

U. Khan and S. Zahid—These authors have contributed equally.

Acknowledgement

This work has been partially funded by the Higher Education Commission of Pakistan’s grant for National Center of Artificial Intelligence (NCAI).

Author information

Authors and Affiliations

Deep Learning Laboratory, National Center of Artificial Intelligence, Islamabad, Pakistan
Umar Khan, Sohaib Zahid, Muhammad Asad Ali, Adnan Ul-Hasan & Faisal Shafait
School of Electrical Engineering and Computer Science, National University of Sciences and Technology (NUST), Islamabad, Pakistan
Faisal Shafait

Corresponding author

Correspondence to Umar Khan.

Editor information

Editors and Affiliations

Universitat Autònoma de Barcelona, Barcelona, Spain
Josep Lladós
Lehigh University, Bethlehem, PA, USA
Daniel Lopresti
Kyushu University, Fukuoka-shi, Japan
Seiichi Uchida

Cite this paper

Khan, U., Zahid, S., Ali, M.A., Ul-Hasan, A., Shafait, F. (2021). TabAug: Data Driven Augmentation for Enhanced Table Structure Recognition. In: Lladós, J., Lopresti, D., Uchida, S. (eds) Document Analysis and Recognition – ICDAR 2021. ICDAR 2021. Lecture Notes in Computer Science, vol 12822. Springer, Cham. https://doi.org/10.1007/978-3-030-86331-9_38

DOI: https://doi.org/10.1007/978-3-030-86331-9_38
Published: 02 September 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-86330-2
Online ISBN: 978-3-030-86331-9
eBook Packages: Computer Science, Computer Science (R0)
