An End-to-End Table Structure Analysis Method Using Graph Attention Networks

  • Conference paper
Leveraging Generative Intelligence in Digital Libraries: Towards Human-Machine Collaboration (ICADL 2023)

Abstract

This paper proposes an end-to-end table structure analysis method that includes table detection and uses graph attention networks (GATs). The method proceeds in four steps: it first detects tables within documents; it then uses GATs to estimate whether horizontally adjacent tokens within a table belong to the same cell; next, it estimates implicit ruled lines, which are needed to separate cells but are not actually drawn; finally, it merges the remaining tokens into cells, again using GATs. We also collected 800 new tables and annotated them with structural information to augment the training data. In evaluation experiments, the proposed method achieved an F-measure of 0.984 for table structure analysis with table detection, outperforming other methods, including the commercial ABBYY FineReader PDF. The experiments also showed that the 800 newly annotated tables improved the method’s accuracy.
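To make the role of graph attention concrete, the following is a minimal sketch of one generic single-head graph attention layer as defined by Veličković et al. (2018), written in plain Python. It is not the authors' model: the abstract does not specify the architecture, node features, or training setup, so the token graph, the two-dimensional features, the weight matrix `W`, and the attention vector `a` below are all hypothetical values chosen for illustration. Each node stands for a token, and edges connect horizontally adjacent tokens (plus a self-loop).

```python
import math


def leaky_relu(x, slope=0.2):
    """LeakyReLU with the negative slope commonly used for GAT scores."""
    return x if x > 0 else slope * x


def matvec(W, h):
    """Multiply matrix W (list of rows) by vector h."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]


def gat_layer(features, neighbors, W, a):
    """Single-head graph attention layer (no bias, no output nonlinearity).

    features:  list of node feature vectors
    neighbors: neighbors[i] lists the nodes that node i attends to
               (include i itself for a self-loop)
    W:         shared linear transform, shape out_dim x in_dim
    a:         attention vector of length 2 * out_dim
    Returns (new_features, attention), where attention[i] maps each
    neighbor j of node i to its attention weight alpha_ij.
    """
    z = [matvec(W, h) for h in features]  # z_i = W h_i
    new_features, attention = [], []
    for i, nbrs in enumerate(neighbors):
        # e_ij = LeakyReLU(a^T [z_i || z_j]) for every neighbor j
        scores = [
            leaky_relu(sum(w * x for w, x in zip(a, z[i] + z[j])))
            for j in nbrs
        ]
        # Softmax over the neighborhood (shifted by the max for stability)
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        alphas = [e / total for e in exps]
        attention.append(dict(zip(nbrs, alphas)))
        # h'_i = sum_j alpha_ij * z_j
        new_features.append([
            sum(al * z[j][k] for al, j in zip(alphas, nbrs))
            for k in range(len(z[i]))
        ])
    return new_features, attention


# Hypothetical toy row of three tokens: "Method", "A", "0.98".
# Features are made-up 2-D vectors, not features from the paper.
token_features = [[0.1, 0.9], [0.2, 0.8], [0.9, 0.1]]
token_neighbors = [[0, 1], [0, 1, 2], [1, 2]]  # horizontal adjacency + self
W_demo = [[1.0, 0.0], [0.0, 1.0]]              # identity transform
a_demo = [0.5, -0.5, 0.5, -0.5]                # arbitrary attention vector

updated, weights = gat_layer(token_features, token_neighbors, W_demo, a_demo)
for i, row in enumerate(weights):
    print(i, {j: round(w, 3) for j, w in row.items()})
```

In a pipeline like the one described, the updated node representations of two adjacent tokens could then be fed to a classifier that decides whether the tokens share a cell; that final step is an assumption here and is not shown.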

Notes

  1. https://dblp.org

  2. https://pdf.abbyy.com

  3. https://github.com/kermitt2/pdfalto

  4. https://github.com/pdfminer/pdfminer.six

  5. https://opencv.org

  6. https://github.com/DevashishPrasad/CascadeTabNet

  7. https://github.com/pyg-team/pytorch_geometric

  8. https://nlp.stanford.edu/projects/glove

  9. https://www.nltk.org

Acknowledgements

This work was supported by JSPS KAKENHI Grant Number JP22H03904 and ROIS NII Open Collaborative Research 2023 (23FC02).

Author information

Correspondence to Manabu Ohta.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Ohta, M., Aoyagi, H., Uwano, F., Kanazawa, T., Takasu, A. (2023). An End-to-End Table Structure Analysis Method Using Graph Attention Networks. In: Goh, D.H., Chen, SJ., Tuarob, S. (eds) Leveraging Generative Intelligence in Digital Libraries: Towards Human-Machine Collaboration. ICADL 2023. Lecture Notes in Computer Science, vol 14458. Springer, Singapore. https://doi.org/10.1007/978-981-99-8088-8_20

  • DOI: https://doi.org/10.1007/978-981-99-8088-8_20

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-8087-1

  • Online ISBN: 978-981-99-8088-8

  • eBook Packages: Computer Science, Computer Science (R0)