
Classifying Chart Based on Structural Dissimilarities using Improved Regularized Loss Function

Published in Neural Processing Letters

Abstract

Chart classification is a major challenge because each chart class varies in style, appearance, and structure, with additional noise introduced by changing data values. These variations differ across chart types and sub-types. It is therefore difficult for a model to learn the diversity of chart classes under changing structures, owing to the lack of association between features from similar and dissimilar regions and their varied structure. In this paper, we present a novel dissimilarity-based learning model for classifying similarly structured but diverse charts by improving the loss function. Our approach jointly learns the features of both dissimilar and similar regions using the notion of homogeneity. The improved loss function fuses a variation-aware dissimilarity index with regularization parameters, making the model more attentive to dissimilar regions and similarly structured charts. Extensive comparative evaluations on publicly available datasets demonstrate that our approach significantly outperforms benchmark methods, including both traditional and deep learning models.
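To make the idea concrete, the fused loss described above can be pictured schematically. The sketch below is an illustrative assumption about its general shape (a cross-entropy term scaled by a dissimilarity index, plus elastic-net regularization), not the paper's exact formulation; the function name and coefficient values are hypothetical.

```python
import math

# Schematic sketch (an assumption, not the paper's exact formulation):
# cross-entropy scaled by a dissimilarity index (DI), plus an
# elastic-net (L1/L2) penalty over the model weights.
def regularized_di_loss(probs, label, di, weights, l1=0.001, l2=0.002):
    """Return (1 + DI) * cross-entropy + L1/L2 regularization."""
    ce = -math.log(probs[label])            # cross-entropy for the true class
    reg = (l1 * sum(abs(w) for w in weights)
           + l2 * sum(w * w for w in weights))  # elastic-net penalty
    return (1.0 + di) * ce + reg
```

Under this reading, a higher dissimilarity index amplifies the classification loss, pushing the model to attend to dissimilar regions, while the regularization terms keep the weights from overfitting to similarly structured charts.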


References

  1. Karthikeyani V, Nagarajan S (2012) Machine learning classification algorithms to recognize chart types in portable document format (pdf) files. Int J Comput Appl 39(2):1–5


  2. Chen M, Golan A (2015) What may visualization processes optimize? IEEE Trans Visual Comput Graph 22(12):2619–2632


  3. Khan M, Khan SS (2011) Data and information visualization methods, and interactive mechanisms: a survey. Int J Comput Appl 34(1):1–14


  4. Huang W, Tan CL (2007) A system for understanding imaged infographics and its applications. In: Proceedings of the 2007 ACM symposium on Document engineering, pp 9–18

  5. Shukla S, Samal A (2008) Recognition and quality assessment of data charts in mixed-mode documents. Int J Doc Anal Recognit (IJDAR) 11(3):111


  6. Mishchenko A, Vassilieva N (2011) Model-based chart image classification. In: International symposium on visual computing. Springer, pp 476–485

  7. Siegel N, Horvitz Z, Levin R, Divvala S, Farhadi A (2016) Figureseer: parsing result-figures in research papers. In: European conference on computer vision. Springer, pp 664–680

  8. Zhou Y, Tan CL (2001) Learning-based scientific chart recognition. In: 4th IAPR international workshop on graphics recognition. GREC, Citeseer, pp 482–492

  9. Cheng B, Stanley RJ, Antani S, Thoma GR (2013) Graphical figure classification using data fusion for integrating text and image features. In: 2013 12th international conference on document analysis and recognition. IEEE, pp 693–697

  10. Kim D, Ramesh BP, Yu H (2011) Automatic figure classification in bioscience literature. J Biomed Inform 44(5):848–858


  11. Shao M, Futrelle R (2005) Graphics recognition in pdf documents. In: Proceddings of GREC

  12. Huang W, Tan CL, Leow WK (2004) Elliptic arc vectorization for 3d pie chart recognition. In: 2004 international conference on image processing (ICIP’04), Vol 5. IEEE, pp 2889–2892

  13. Savva M, Kong N, Chhajta A, Fei-Fei L, Agrawala M, Heer J (2011) Revision: automated classification, analysis and redesign of chart images. In: Proceedings of the 24th annual ACM symposium on User interface software and technology, pp 393–402

  14. Bar chart. https://study.com/academy/lesson/what-is-a-stacked-bar-chart.html

  15. Oracle docs. https://docs.oracle.com/javase/8/javafx/user-interface-tutorial/scatter-chart.htm

  16. Lucid charts. https://www.lucidchart.com/blog/how-to-make-a-bubble-chart-in-excel

  17. Pie chart and donut chart. https://code.tutsplus.com/tutorials/how-to-draw-a-pie-chart-and-doughnut-chart-using-javascript-and-html5-canvas--cms-27197

  18. Data visualization. https://visage.co/data-visualization-101-area-charts/

  19. Huang W, Zong S, Tan CL (2007) Chart image classification using multiple-instance learning. In: IEEE workshop on applications of computer vision (WACV’07). IEEE, pp 27–27

  20. Chagas P, Akiyama R, Meiguins A, Santos C, Saraiva F, Meiguins B, Morais J (2018) Evaluation of convolutional neural network architectures for chart image classification. In: 2018 international joint conference on neural networks (IJCNN), IEEE, pp 1–8

  21. Jung D, Kim W, Song H, Hwang J-I, Lee B, Kim B, Seo J (2017) Chartsense: interactive data extraction from chart images. In: Proceedings of the 2017 chi conference on human factors in computing systems, pp 6706–6717

  22. Tang B, Liu X, Lei J, Song M, Tao D, Sun S, Dong F (2016) Deepchart: combining deep convolutional networks and deep belief networks in chart classification. Signal Process 124:156–161


  23. Pandey RK, Ramakrishnan A, Karmakar S (2019) Effects of modifying the input features and the loss function on improving emotion classification. In: TENCON 2019-2019 IEEE Region 10 conference (TENCON), IEEE, pp 1159–1162

  24. Demirkaya A, Chen J, Oymak S (2020) Exploring the role of loss functions in multiclass classification. In: 2020 54th annual conference on information sciences and systems (CISS). IEEE, pp 1–5

  25. Cao J, Qiu Y, Chang D, Li X, Ma Z (2019) Dynamic attention loss for small-sample image classification. In: Asia-Pacific signal and information processing association annual summit and conference (APSIPA ASC). IEEE, pp 75–79

  26. Song K, Li F, Long F, Wang J, Ling Q (2018) Discriminative deep feature learning for semantic-based image retrieval. IEEE Access 6:44268–44280


  27. Wang L, Wang C, Sun Z, Cheng S, Guo L (2020) Class balanced loss for image classification. IEEE Access 8:81142–81153


  28. Abouelenien M, Yuan X (2013) Boosting for learning from multiclass data sets via a regularized loss function. In: 2013 IEEE international conference on granular computing (GrC). IEEE, pp 4–9

  29. Zheng Q, Tian X, Yang M, Wu Y, Su H (2019) Pac-bayesian framework based drop-path method for 2d discriminative convolutional network pruning. Multidimens Syst Signal Process 1–35

  30. Liu W, Ma X, Zhou Y, Tao D, Cheng J (2018) p-Laplacian regularization for scene recognition. IEEE Trans Cybern 49(8):2927–2940


  31. Yuan Y, Mou L, Lu X (2015) Scene recognition by manifold regularized deep learning architecture. IEEE Trans Neural Netw Learn Syst 26(10):2222–2233


  32. Xu Y, Zhong Z, Yang J, You J, Zhang D (2016) A new discriminative sparse representation method for robust face recognition via \(l_2\) regularization. IEEE Trans Neural Netw Learn Syst 28(10):2233–2242


  33. Hong C, Yu J, You J, Chen X, Tao D (2015) Multi-view ensemble manifold regularization for 3d object recognition. Inf Sci 320:395–405


  34. Zhang S, Lei M, Ma B, Xie L (2019) Robust audio-visual speech recognition using bimodal dfsmn with multi-condition training and dropout regularization. In: ICASSP 2019-2019 IEEE international conference on acoustics, speech and signal processing (ICASSP). IEEE, pp 6570–6574

  35. Zheng Q, Zhao P, Li Y, Wang H, Yang Y (2020) Spectrum interference-based two-level data augmentation method in deep learning for automatic modulation classification. Neural Comput Appl 1–23

  36. Amara J, Kaur P, Owonibi M, Bouaziz B. Convolutional neural network based chart image classification, pp 83–88

  37. Liu Y, Lu X, Qin Y, Tang Z, Xu J (2013) Review of chart recognition in document images. In: Visualization and data analysis 2013, Vol 8654. International Society for Optics and Photonics, p 865410

  38. Jurio A, Bustince H, Pagola M, Couto P, Pedrycz W (2014) New measures of homogeneity for image processing: an application to fingerprint segmentation. Soft Comput 18(6):1055–1066


  39. Sun P, Chen G, Luke G, Shang Y. Salience biased loss for object detection in aerial images. arXiv:1810.08103

  40. Chang D, Ding Y, Xie J, Bhunia AK, Li X, Ma Z, Wu M, Guo J, Song Y-Z (2020) The devil is in the channels: mutual-channel loss for fine-grained image classification. IEEE Trans Image Process 29:4683–4695


  41. Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition. arXiv:1409.1556

  42. Hinton G (2018) Neural networks for machine learning online course. https://www.coursera.org/learn/neural-networks/home/welcome

  43. Haghighat M, Abdel-Mottaleb M, Alhalabi W (2016) Discriminant correlation analysis: real-time feature level fusion for multimodal biometric recognition. IEEE Trans Inf Forensics Secur 11(9):1984–1996


  44. Kotu V, Deshpande B (2018) Data science: concepts and practice. Morgan Kaufmann, Burlington


  45. Kahou SE, Michalski V, Atkinson A, Kádár Á, Trischler A, Bengio Y. Figureqa: an annotated figure dataset for visual reasoning. arXiv:1710.07300

  46. Poco J, Heer J (2017) Reverse-engineering visualizations: recovering visual encodings from chart images. In: Computer graphics forum, Vol 36. Wiley Online Library, pp 353–363

  47. Choi J, Jung S, Park DG, Choo J, Elmqvist N (2019) Visualizing for the non-visual: enabling the visually impaired to use visualization. In: Computer graphics forum, Vol 38. Wiley Online Library, pp 249–260

  48. Li X, Shen H, Li H, Zhang L (2016) Patch matching-based multitemporal group sparse representation for the missing information reconstruction of remote-sensing images. IEEE J Sel Topics Appl Earth Observ Remote Sens 9(8):3629–3641


  49. Dalal N, Triggs B (2005) Histograms of oriented gradients for human detection. In: IEEE computer society conference on computer vision and pattern recognition (CVPR’05), Vol 1. IEEE, pp 886–893

  50. Ojala T, Pietikainen M, Maenpaa T (2002) Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Trans Pattern Anal Mach Intell 24(7):971–987


  51. Oliva A, Torralba A (2001) Modeling the shape of the scene: a holistic representation of the spatial envelope. Int J Comput Vis 42(3):145–175


  52. Lowe DG (1999) Object recognition from local scale-invariant features. In: Proceedings of the seventh IEEE international conference on computer vision, Vol 2. IEEE, pp 1150–1157

  53. Setty S, Husain M, Beham P, Gudavalli J, Kandasamy M, Vaddi R, Hemadri V, Karure J, Raju R, Rajan B et al (2013) Indian movie face database: a benchmark for face recognition under wide variations. In: Fourth national conference on computer vision, pattern recognition, image processing and graphics (NCVPRIPG). IEEE, pp 1–5

  54. Nilsback M-E, Zisserman A (2006) A visual vocabulary for flower classification. In: IEEE computer society conference on computer vision and pattern recognition (CVPR’06), Vol 2. IEEE, pp 1447–1454


Author information


Corresponding author

Correspondence to Prerna Mishra.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Funding

The authors declare that they have no relevant financial or non-financial interests to disclose.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A. Performance on Other Public Databases

To analyze the generalizability and robustness of our approach, we tested it on two datasets beyond chart images, both known for their highly distinctive and feature-rich images.

Appendix A.1. Datasets

To further analyze the performance of the proposed approach on other public databases, and to test its generalizability, we chose two publicly available datasets: the Indian movie face database (IMFDB) [53] (2380 training images and 107 testing images) and the 17-category flower dataset [54] (1260 training images and 100 testing images). Only 10 randomly selected categories from each database were considered. For the flower dataset, the 10 classes were Buttercup, Colts’ Foot, Daffodil, Daisy, Dandelion, Fritillary, Iris, Sunflower, Bluebell, and Tulip. Similarly, for IMFDB, we randomly selected 5 male and 5 female actors. Small and blurred images from IMFDB were eliminated. The IMFDB dataset contains facial images of actors acquired from their different movie videos; the images cover different ages, with and without makeup, and different hairstyles. This dataset has visually similar images whose appearances and structural features differ. The same property holds for the flower dataset.

Appendix A.2. Results

The results shown in the following sub-sections were obtained with the same hyperparameter settings for both datasets: learning rate = 0.001, optimization algorithm = RMSProp, \(\beta \) = 0.9, batch size = 25, activation function = ReLU, number of epochs = 150. The regularization coefficients were \(L_1\) = 0.0003 and \(L_2\) = 0.0005 for IMFDB, and \(L_1\) = 0.0002 and \(L_2\) = 0.0003 for the 17-flower dataset. Dropout with a rate of 0.5 was employed on the fully connected layers.
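As a small illustration of how the \(L_1\) and \(L_2\) coefficients enter the objective, an elastic-net penalty can be sketched as follows. This is a schematic stand-in, not the authors' implementation; the default coefficients are the IMFDB values quoted above.

```python
# Illustrative elastic-net penalty (a sketch, not the authors' code).
# Defaults use the IMFDB coefficients quoted in the text.
def elastic_net_penalty(weights, l1=0.0003, l2=0.0005):
    """Return l1 * sum(|w|) + l2 * sum(w^2) over all weights."""
    return (l1 * sum(abs(w) for w in weights)
            + l2 * sum(w * w for w in weights))
```

The L1 term encourages sparsity in the weights while the L2 term discourages large weights; tuning the two coefficients per dataset, as done here, trades off the two effects.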

For our analysis, we compared the results obtained on these public datasets using our approach; the results are shown in Table 8. We evaluated the classification performance of the models with and without the influence of the dissimilarity index (DI).

We observed that, with DI, classification performance improved by almost 2%, mostly ranging from 96 to 99%. InceptionV3 and ResNet50 achieved better accuracy than the other pretrained models. Despite the variation among actors’ facial images, the accuracy of the learning model increased by 1–2% with DI. Thus, we can state that our approach generalizes beyond chart images, classifying images based on both their similar and dissimilar features.

Table 8 Average accuracy (%) of learning models as backbone architecture on public datasets
Table 9 Accuracy (%) of the proposed approach for the other public datasets

Table 10a, b show the overall and category-wise performance of the proposed approach for the IMFDB and flower datasets using VGG16 as the backbone architecture. For the flower dataset, elastic net + DI gave better results than the elastic net without DI, with a 2% increase to the 97–98% range. For the face dataset, accuracy ranged from 95 to 97%; with DI, the accuracy improved by 4–6%. The model learned the flower database better than IMFDB, possibly because the IMFDB images were extracted from video sources with differing resolutions and sizes: images with good resolution were classified correctly, while only those with poor resolution were misclassified. Figure 8 shows the performance rate for the IMFDB and flower databases. Even with high variability, the model classified the images satisfactorily.

Table 10 Performance of proposed approach over datasets with respect to regularization function
Fig. 8

Performance rate of the model (training vs. validation) for the public datasets

Appendix B. Analysis of Dissimilarity Index by using Other Distance Norm

The dissimilarity index (Eq. 11) is computed using the L1 distance. To further analyze the impact of other distance norms, we replaced the L1 distance with the L2, Canberra, and cosine distance [48] indicators. The results are shown in Table 11. We followed the same hyperparameter settings as mentioned in Sect. 4.2. However, for the cosine distance, the values of \(L_1\) and \(L_2\) were slightly increased to avoid overfitting: \(L_1\) = 0.001 and \(L_2\) = 0.003 for our web chart dataset, \(L_1\) = 0.002 and \(L_2\) = 0.003 for FigureQA, and \(L_1\) = 0.001 and \(L_2\) = 0.002 for the ReVision corpus.
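For reference, the four distance indicators compared here can be written out as below; these are the standard textbook definitions, not code extracted from the paper.

```python
import math

# Standard definitions of the four distance indicators compared here
# (textbook formulas, not the authors' implementation).
def l1_distance(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def l2_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def canberra_distance(a, b):
    # Terms with |x| + |y| == 0 contribute zero by convention.
    return sum(abs(x - y) / (abs(x) + abs(y))
               for x, y in zip(a, b) if abs(x) + abs(y) > 0)

def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)
```

Note that the cosine distance is bounded and scale-invariant, while the L1, L2, and Canberra distances grow with the magnitude of the feature differences, which is consistent with the differing regularization needs reported below.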

Table 11 Accuracy rate (%) for chart datasets obtained with different distance indicators

Using the L2 and Canberra distances, the classification accuracy obtained was more than 96%, which is fairly good. For the cosine distance, larger values are generally considered better; as a result, a large value fused with the weights inflated the model weights, causing the model to overfit for similar chart types. The model performed well for FigureQA and ReVision, but performance suffered on web chart images. To avoid this, we increased the regularization parameters, after which the classification accuracy obtained was more than 93%. Overall, the L1 and L2 distances perform very well with the proposed model, the Canberra distance gives good results, and the cosine distance with slightly higher regularization parameters yields better results. We conclude that the model learns competently from features of dissimilar regions, in terms of chart style, appearance, and structural patterns, even with other distance indicators.

About this article


Cite this article

Mishra, P., Kumar, S. & Chaube, M.K. Classifying Chart Based on Structural Dissimilarities using Improved Regularized Loss Function. Neural Process Lett 54, 2385–2411 (2022). https://doi.org/10.1007/s11063-021-10735-z

