
Review of chart image detection and classification

  • Survey
International Journal on Document Analysis and Recognition (IJDAR)


This paper presents a comprehensive review of the approaches published to date across all components of chart image detection and classification. A set of 89 scientific papers is collected, analyzed, and sorted into four categories: chart-type classification, chart text processing, chart data extraction, and chart description generation. For each category, the problem formulation and the research field are described in detail, together with an overview of the methods used. Each paper's contribution is noted, including the information essential for authors in this research field. Finally, the reported results are compared. The state-of-the-art methods in each category are described, and research directions are given. We also analyze the open challenges that still exist and require authors' attention.





Author information



Corresponding author

Correspondence to Filip Bajić.


About this article


Cite this article

Bajić, F., Job, J. Review of chart image detection and classification. IJDAR 26, 453–474 (2023).
