ICDAR 2021 Competition on Scientific Table Image Recognition to LaTeX

  • Conference paper
Document Analysis and Recognition – ICDAR 2021 (ICDAR 2021)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12824)

Abstract

Tables present important information concisely in many scientific documents. Visual features such as mathematical symbols, equations, and spanning cells make it difficult to extract structure and content from tables embedded in research documents. This paper discusses the dataset, tasks, participants’ methods, and results of the ICDAR 2021 Competition on Scientific Table Image Recognition to LaTeX. Specifically, the task of the competition is to convert a tabular image to its corresponding LaTeX source code. We proposed two subtasks. In Subtask 1, we ask the participants to reconstruct the LaTeX structure code from an image. In Subtask 2, we ask the participants to reconstruct the LaTeX content code from an image. This report describes the datasets and ground truth specification, details the performance evaluation metrics used, presents the final results, and summarizes the participating methods. The submission by team VCGroup achieved the highest Exact Match accuracy, 74% for Subtask 1 and 55% for Subtask 2, beating previous baselines by 5% and 12%, respectively. Although the recognition capabilities of models can still be improved, this competition contributes to the development of fully automated table recognition systems by challenging practitioners to solve problems under specific constraints and share their approaches. The platform will remain available for post-challenge submissions at https://competitions.codalab.org/competitions/26979.
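
To make the Exact Match accuracy figures above concrete, the following is a minimal sketch of how such a score could be computed. It assumes a prediction counts as correct only when its whitespace-separated token sequence is identical to the ground truth. The function name `exact_match_accuracy`, the tokenization, and the sample strings are illustrative assumptions, not the competition's official evaluation code.

```python
from typing import Sequence


def exact_match_accuracy(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Return the fraction of predictions whose tokens equal the reference's tokens.

    Assumption: sequences are tokenized by splitting on whitespace, so
    differences in spacing alone do not count as a mismatch.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    matches = sum(p.split() == r.split() for p, r in zip(predictions, references))
    return matches / len(references)


# Hypothetical structure-code strings, for illustration only.
refs = [
    r"\begin{tabular}{ll} CELL & CELL \\ \end{tabular}",
    r"\begin{tabular}{l} CELL \\ \end{tabular}",
]
preds = [
    r"\begin{tabular}{ll}  CELL & CELL \\ \end{tabular}",  # extra space only: still a match
    r"\begin{tabular}{c} CELL \\ \end{tabular}",           # wrong column spec: no match
]
print(exact_match_accuracy(preds, refs))
```

Under this definition a single wrong token makes the whole table count as incorrect, which is why Exact Match is a strict measure of recognition quality.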

Acknowledgments

This work was supported by The Science and Engineering Research Board (SERB), under sanction number ECR/2018/000087.

Author information

Corresponding author

Correspondence to Pratik Kayal.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Kayal, P., Anand, M., Desai, H., Singh, M. (2021). ICDAR 2021 Competition on Scientific Table Image Recognition to LaTeX. In: Lladós, J., Lopresti, D., Uchida, S. (eds) Document Analysis and Recognition – ICDAR 2021. ICDAR 2021. Lecture Notes in Computer Science, vol 12824. Springer, Cham. https://doi.org/10.1007/978-3-030-86337-1_50

  • DOI: https://doi.org/10.1007/978-3-030-86337-1_50

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-86336-4

  • Online ISBN: 978-3-030-86337-1

  • eBook Packages: Computer Science (R0)
