ICDAR 2021 Competition on Scientific Table Image Recognition to LaTeX

Kayal, Pratik; Anand, Mrinal; Desai, Harsh; Singh, Mayank

doi:10.1007/978-3-030-86337-1_50

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12824)

Included in the following conference series:

International Conference on Document Analysis and Recognition

Abstract

Tables present important information concisely in many scientific documents. Visual features such as mathematical symbols, equations, and spanning cells make it difficult to extract structure and content from tables embedded in research documents. This paper discusses the dataset, tasks, participants’ methods, and results of the ICDAR 2021 Competition on Scientific Table Image Recognition to LaTeX. Specifically, the task of the competition is to convert a tabular image to its corresponding LaTeX source code. We proposed two subtasks. In Subtask 1, we ask the participants to reconstruct the LaTeX structure code from an image. In Subtask 2, we ask the participants to reconstruct the LaTeX content code from an image. This report describes the datasets and ground truth specification, details the performance evaluation metrics used, presents the final results, and summarizes the participating methods. The submission by team VCGroup achieved the highest Exact Match accuracy: 74% for Subtask 1 and 55% for Subtask 2, beating previous baselines by 5% and 12%, respectively. Although the recognition capabilities of models can still be improved, this competition contributes to the development of fully automated table recognition systems by challenging practitioners to solve problems under specific constraints and to share their approaches. The platform will remain available for post-challenge submissions at https://competitions.codalab.org/competitions/26979.
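As a rough illustration of the Exact Match accuracy reported above, the sketch below counts a prediction as correct only when its token sequence is identical to the reference. The whitespace tokenization, the function names, and the sample strings (including the `CELL` placeholder) are assumptions made for this example; the competition's actual tokenization and scoring script are not described in this abstract.

```python
from typing import List


def tokenize(latex: str) -> List[str]:
    """Split a LaTeX string on whitespace (an assumed, simplified tokenization)."""
    return latex.split()


def exact_match_accuracy(predictions: List[str], references: List[str]) -> float:
    """Fraction of predictions whose token sequence equals the reference exactly."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    matches = sum(
        tokenize(pred) == tokenize(ref)
        for pred, ref in zip(predictions, references)
    )
    return matches / len(references)


# Hypothetical structure-code strings, made up for illustration only.
refs = [
    r"\begin{tabular}{ l c } \hline CELL & CELL \\ \hline \end{tabular}",
    r"\begin{tabular}{ c c c } CELL & CELL & CELL \\ \end{tabular}",
]
preds = [
    # Same tokens as the reference; only the spacing differs.
    r"\begin{tabular}{ l c }  \hline CELL & CELL \\ \hline \end{tabular}",
    # One column is missing, so the whole sequence counts as a miss.
    r"\begin{tabular}{ c c } CELL & CELL \\ \end{tabular}",
]
score = exact_match_accuracy(preds, refs)
```

Under this definition a single wrong token makes the entire table count as incorrect, which is one plausible reason why exact-match scores on long content sequences tend to be lower than on shorter structure sequences.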

Acknowledgments

This work was supported by The Science and Engineering Research Board (SERB), under sanction number ECR/2018/000087.

Author information

Authors and Affiliations

Indian Institute of Technology, Gandhinagar, India
Pratik Kayal, Mrinal Anand, Harsh Desai & Mayank Singh

Corresponding author

Correspondence to Pratik Kayal.

Editor information

Editors and Affiliations

Universitat Autònoma de Barcelona, Barcelona, Spain
Josep Lladós
Lehigh University, Bethlehem, PA, USA
Daniel Lopresti
Kyushu University, Fukuoka-shi, Japan
Seiichi Uchida

About this paper

Cite this paper

Kayal, P., Anand, M., Desai, H., Singh, M. (2021). ICDAR 2021 Competition on Scientific Table Image Recognition to LaTeX. In: Lladós, J., Lopresti, D., Uchida, S. (eds) Document Analysis and Recognition – ICDAR 2021. ICDAR 2021. Lecture Notes in Computer Science, vol 12824. Springer, Cham. https://doi.org/10.1007/978-3-030-86337-1_50

DOI: https://doi.org/10.1007/978-3-030-86337-1_50
Published: 02 September 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-86336-4
Online ISBN: 978-3-030-86337-1
eBook Packages: Computer Science; Computer Science (R0)
