Visual and Textual Information Fusion Method for Chart Recognition

Wang, Chen; Cui, Kaixu; Zhang, Suya; Xu, Changliang

doi:10.1007/978-3-030-68793-9_28

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12668)

Included in the following conference series:

International Conference on Pattern Recognition

Abstract

In this report, we present our method for the ICPR 2020 Competition on Harvesting Raw Tables from Infographics, which is composed of Chart Classification, Text Detection/Recognition, Text Role Classification, Axis Analysis, Legend Analysis, Plot Element Detection/Classification and CSV Extraction. ResNet image classification models are adopted for Chart Classification. For Text Detection/Recognition, we adopt a two-stage pipeline for end-to-end recognition that treats detection and recognition as two separate modules. An ensemble of LayoutLM and an object detection model is adopted for Text Role Classification. A two-stage pipeline with two detection models is adopted for Legend Analysis. The final results are discussed.
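The two-stage structure described for Text Detection/Recognition can be illustrated with a minimal sketch. This is not the authors' implementation: the names `TextRegion`, `crop`, `recognize_chart_text`, `toy_detect` and `toy_recognize` are hypothetical, the image is a toy grid of character codes, and the two stub functions merely stand in for trained detection and recognition models.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1) in pixel coordinates
Image = Sequence[Sequence[int]]  # toy grayscale image: rows of pixel values


@dataclass
class TextRegion:
    box: Box
    text: str


def crop(image: Image, box: Box) -> List[List[int]]:
    """Cut the rectangle described by box out of the image."""
    x0, y0, x1, y1 = box
    return [list(row[x0:x1]) for row in image[y0:y1]]


def recognize_chart_text(
    image: Image,
    detect: Callable[[Image], List[Box]],
    recognize: Callable[[Image], str],
) -> List[TextRegion]:
    """Stage 1 proposes text boxes; stage 2 reads each cropped box."""
    return [TextRegion(box, recognize(crop(image, box))) for box in detect(image)]


# Hypothetical stand-ins for trained models, used only to exercise the pipeline.
def toy_detect(image: Image) -> List[Box]:
    return [(0, 0, 2, 1), (2, 1, 4, 2)]


def toy_recognize(patch: Image) -> str:
    # Assumption for the toy example: each pixel value is a character code.
    return "".join(chr(value) for row in patch for value in row)


toy_image = [[ord(c) for c in "Year"], [ord(c) for c in "2020"]]
regions = recognize_chart_text(toy_image, toy_detect, toy_recognize)
```

Keeping the detector and recognizer as separate callables mirrors the modular view in the abstract: either stage can be swapped or retrained without touching the other.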

S. Zhang: Intern at XinHua ZhiYun Inc.

Author information

Authors and Affiliations

XinHua ZhiYun Inc., Hangzhou, China
Chen Wang, Kaixu Cui & Changliang Xu
State Key Laboratory of Media Convergence Production Technology and Systems, Beijing, China
Chen Wang, Kaixu Cui & Changliang Xu
State Key Laboratory of Media Convergence and Communication, Communication University of China, Beijing, China
Suya Zhang

Corresponding author

Correspondence to Chen Wang.

Editor information

Editors and Affiliations

Dipartimento di Ingegneria dell’Informazione, University of Firenze, Firenze, Italy
Alberto Del Bimbo
Dipartimento di Ingegneria “Enzo Ferrari”, Università di Modena e Reggio Emilia, Modena, Italy
Rita Cucchiara
Department of Computer Science, Boston University, Boston, MA, USA
Stan Sclaroff
Dipartimento di Matematica e Informatica, University of Catania, Catania, Italy
Giovanni Maria Farinella
Cloud & AI, JD.COM, Beijing, China
Tao Mei
Dipartimento di Ingegneria dell’Informazione, University of Firenze, Firenze, Italy
Marco Bertini
Computational Sciences Department, National Institute of Astrophysics, Optics and Electronics (INAOE), Tonantzintla, Puebla, Mexico
Hugo Jair Escalante
Dipartimento di Ingegneria “Enzo Ferrari”, Università di Modena e Reggio Emilia, Modena, Italy
Roberto Vezzani

Cite this paper

Wang, C., Cui, K., Zhang, S., Xu, C. (2021). Visual and Textual Information Fusion Method for Chart Recognition. In: Del Bimbo, A., et al. Pattern Recognition. ICPR International Workshops and Challenges. ICPR 2021. Lecture Notes in Computer Science, vol 12668. Springer, Cham. https://doi.org/10.1007/978-3-030-68793-9_28

DOI: https://doi.org/10.1007/978-3-030-68793-9_28
Published: 21 February 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-68792-2
Online ISBN: 978-3-030-68793-9
eBook Packages: Computer Science, Computer Science (R0)