Abstract
Chart classification is a major challenge because each chart class varies in style, appearance, and structure, with additional noise introduced by changing data values. These variations differ across chart types and sub-types. Consequently, it is difficult for any model to learn the diversity of chart classes and their changing structures without an association between the features of similar and dissimilar regions. In this paper, we present a novel dissimilarity-based learning model for classifying similarly structured but diverse charts by improving the loss function. Our approach jointly learns the features of both dissimilar and similar regions using the notion of homogeneity. The improved loss function fuses a variation-aware dissimilarity index with regularization parameters, making the model more sensitive to dissimilar regions and similarly structured charts. Extensive comparative evaluations on publicly available datasets demonstrate that our approach significantly outperforms benchmark methods, including both traditional and deep learning models.
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11063-021-10735-z/MediaObjects/11063_2021_10735_Fig1_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11063-021-10735-z/MediaObjects/11063_2021_10735_Fig2_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11063-021-10735-z/MediaObjects/11063_2021_10735_Fig3_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11063-021-10735-z/MediaObjects/11063_2021_10735_Fig4_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11063-021-10735-z/MediaObjects/11063_2021_10735_Fig5_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11063-021-10735-z/MediaObjects/11063_2021_10735_Fig6_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11063-021-10735-z/MediaObjects/11063_2021_10735_Fig7_HTML.png)
References
Karthikeyani V, Nagarajan S (2012) Machine learning classification algorithms to recognize chart types in portable document format (pdf) files. Int J Comput Appl 39(2):1–5
Chen M, Golan A (2015) What may visualization processes optimize? IEEE Trans Visual Comput Graph 22(12):2619–2632
Khan M, Khan SS (2011) Data and information visualization methods, and interactive mechanisms: a survey. Int J Comput Appl 34(1):1–14
Huang W, Tan CL (2007) A system for understanding imaged infographics and its applications. In: Proceedings of the 2007 ACM symposium on Document engineering, pp 9–18
Shukla S, Samal A (2008) Recognition and quality assessment of data charts in mixed-mode documents. Int J Doc Anal Recognit (IJDAR) 11(3):111
Mishchenko A, Vassilieva N (2011) Model-based chart image classification. In: International symposium on visual computing. Springer, pp 476–485
Siegel N, Horvitz Z, Levin R, Divvala S, Farhadi A (2016) Figureseer: parsing result-figures in research papers. In: European conference on computer vision. Springer, pp 664–680
Zhou Y, Tan CL (2001) Learning-based scientific chart recognition. In: 4th IAPR international workshop on graphics recognition. GREC, Citeseer, pp 482–492
Cheng B, Stanley RJ, Antani S, Thoma GR (2013) Graphical figure classification using data fusion for integrating text and image features. In: 2013 12th international conference on document analysis and recognition. IEEE, pp 693–697
Kim D, Ramesh BP, Yu H (2011) Automatic figure classification in bioscience literature. J Biomed Inform 44(5):848–858
Shao M, Futrelle R (2005) Graphics recognition in pdf documents. In: Proceddings of GREC
Huang W, Tan CL, Leow WK (2004) Elliptic arc vectorization for 3d pie chart recognition. In: 2004 international conference on image processing. ICIP’04., Vol 5, IEEE, 2004, pp 2889–2892
Savva M, Kong N, Chhajta A, Fei-Fei L, Agrawala M, Heer J (2011) Revision: automated classification, analysis and redesign of chart images. In: Proceedings of the 24th annual ACM symposium on User interface software and technology, pp 393–402
Bar chart. https://study.com/academy/lesson/what-is-a-stacked-bar-chart.html
Oracle docs. https://docs.oracle.com/javase/8/javafx/user-interface-tutorial/scatter-chart.htm
Lucid charts https://www.lucidchart.com/blog/how-to-make-a-bubble-chart-in-excel
Pie chart and donut chart https://code.tutsplus.com/tutorials/how-to-draw-a-pie-chart-and-doughnut-chart-using-javascript-and-html5-canvas--cms-27197
Data visualization https://visage.co/data-visualization-101-area-charts/
Huang W, Zong S, Tan CL (2007) Chart image classification using multiple-instance learning. In: IEEE workshop on applications of computer vision (WACV'07). IEEE, pp 27–27
Chagas P, Akiyama R, Meiguins A, Santos C, Saraiva F, Meiguins B, Morais J (2018) Evaluation of convolutional neural network architectures for chart image classification. In: 2018 international joint conference on neural networks (IJCNN), IEEE, pp 1–8
Jung D, Kim W, Song H, Hwang J-I, Lee B, Kim B, Seo J (2017) Chartsense: interactive data extraction from chart images. In: Proceedings of the 2017 chi conference on human factors in computing systems, pp 6706–6717
Tang B, Liu X, Lei J, Song M, Tao D, Sun S, Dong F (2016) Deepchart: combining deep convolutional networks and deep belief networks in chart classification. Signal Process 124:156–161
Pandey RK, Ramakrishnan A, Karmakar S (2019) Effects of modifying the input features and the loss function on improving emotion classification. In: TENCON 2019-2019 IEEE Region 10 conference (TENCON), IEEE, pp 1159–1162
Demirkaya A, Chen J, Oymak S (2020) Exploring the role of loss functions in multiclass classification. In: 2020 54th annual conference on information sciences and systems (CISS). IEEE, pp 1–5
Cao J, Qiu Y, Chang D, Li X, Ma Z (2019) Dynamic attention loss for small-sample image classification. In: Asia-Pacific signal and information processing association annual summit and conference (APSIPA ASC). IEEE, pp 75–79
Song K, Li F, Long F, Wang J, Ling Q (2018) Discriminative deep feature learning for semantic-based image retrieval. IEEE Access 6:44268–44280
Wang L, Wang C, Sun Z, Cheng S, Guo L (2020) Class balanced loss for image classification. IEEE Access 8:81142–81153
Abouelenien M, Yuan X (2013) Boosting for learning from multiclass data sets via a regularized loss function. In: 2013 IEEE international conference on granular computing (GrC). IEEE, pp 4–9
Zheng Q, Tian X, Yang M, Wu Y, Su H (2019) Pac-bayesian framework based drop-path method for 2d discriminative convolutional network pruning. Multidimens Syst Signal Process 1–35
Liu W, Ma X, Zhou Y, Tao D, Cheng J (2018) p-Laplacian regularization for scene recognition. IEEE Trans Cybern 49(8):2927–2940
Yuan Y, Mou L, Lu X (2015) Scene recognition by manifold regularized deep learning architecture. IEEE Trans Neural Netw Learn Syst 26(10):2222–2233
Xu Y, Zhong Z, Yang J, You J, Zhang D (2016) A new discriminative sparse representation method for robust face recognition via l2 regularization. IEEE Trans Neural Netw Learn Syst 28(10):2233–2242
Hong C, Yu J, You J, Chen X, Tao D (2015) Multi-view ensemble manifold regularization for 3d object recognition. Inf Sci 320:395–405
Zhang S, Lei M, Ma B, Xie L (2019) Robust audio-visual speech recognition using bimodal dfsmn with multi-condition training and dropout regularization. In: ICASSP 2019-2019 IEEE international conference on acoustics, speech and signal processing (ICASSP). IEEE, pp 6570–6574
Zheng Q, Zhao P, Li Y, Wang H, Yang Y (2020) Spectrum interference-based two-level data augmentation method in deep learning for automatic modulation classification. Neural Comput Appl 1–23
Amara J, Kaur P, Owonibi M, Bouaziz B. Convolutional neural network based chart image classification, pp 83–88
Liu Y, Lu X, Qin Y, Tang Z, Xu J (2013) Review of chart recognition in document images. In: Visualization and data analysis 2013, Vol 8654. International Society for Optics and Photonics, p 865410
Jurio A, Bustince H, Pagola M, Couto P, Pedrycz W (2014) New measures of homogeneity for image processing: an application to fingerprint segmentation. Soft Comput 18(6):1055–1066
Sun P, Chen G, Luke G, Shang Y. Salience biased loss for object detection in aerial images. arXiv:1810.08103
Chang D, Ding Y, Xie J, Bhunia AK, Li X, Ma Z, Wu M, Guo J, Song Y-Z (2020) The devil is in the channels: mutual-channel loss for fine-grained image classification. IEEE Trans Image Process 29:4683–4695
Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition. arXiv:1409.1556
Hinton G (2018) Neural networks for machine learning online course. https://www.coursera.org/learn/neural-networks/home/welcome
Haghighat M, Abdel-Mottaleb M, Alhalabi W (2016) Discriminant correlation analysis: real-time feature level fusion for multimodal biometric recognition. IEEE Trans Inf Forensics Secur 11(9):1984–1996
Kotu V, Deshpande B (2018) Data science: concepts and practice. Morgan Kaufmann, Burlington
Kahou SE, Michalski V, Atkinson A, Kádár Á, Trischler A, Bengio Y. Figureqa: an annotated figure dataset for visual reasoning. arXiv:1710.07300
Poco J, Heer J (2017) Reverse-engineering visualizations: recovering visual encodings from chart images. In: Computer graphics forum, Vol 36. Wiley Online Library, pp 353–363
Choi J, Jung S, Park DG, Choo J, Elmqvist N (2019) Visualizing for the non-visual: enabling the visually impaired to use visualization. In: Computer graphics forum, Vol 38. Wiley Online Library, pp 249–260
Li X, Shen H, Li H, Zhang L (2016) Patch matching-based multitemporal group sparse representation for the missing information reconstruction of remote-sensing images. IEEE J Sel Topics Appl Earth Observ Remote Sens 9(8):3629–3641
Dalal N, Triggs B (2005) Histograms of oriented gradients for human detection. In: IEEE computer society conference on computer vision and pattern recognition (CVPR'05), Vol 1. IEEE, pp 886–893
Ojala T, Pietikainen M, Maenpaa T (2002) Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Trans Pattern Anal Mach Intell 24(7):971–987
Oliva A, Torralba A (2001) Modeling the shape of the scene: a holistic representation of the spatial envelope. Int J Comput Vis 42(3):145–175
Lowe DG (1999) Object recognition from local scale-invariant features. In: Proceedings of the seventh IEEE international conference on computer vision, Vol 2. IEEE, pp 1150–1157
Setty S, Husain M, Beham P, Gudavalli J, Kandasamy M, Vaddi R, Hemadri V, Karure J, Raju R, Rajan B (2013) Indian movie face database: a benchmark for face recognition under wide variations. In: Fourth national conference on computer vision, pattern recognition, image processing and graphics (NCVPRIPG). IEEE, pp 1–5
Nilsback M-E, Zisserman A (2006) A visual vocabulary for flower classification. In: IEEE computer society conference on computer vision and pattern recognition (CVPR’06), Vol 2. IEEE, pp 1447–1454
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Funding
The authors declare that they have no relevant financial or non-financial interests to disclose.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendices
Appendix A. Performance on Other Public Databases
To analyze the generalizability and robustness of our approach, we tested it on datasets beyond chart images. We chose two datasets known for their highly distinctive and feature-rich images.
A.1 Datasets
To further analyze the performance of the proposed approach on other public databases, and to test its generalizability, we chose two publicly available datasets: the Indian movie face database (IMFDB) [53] (2380 training images and 107 testing images) and the 17-category flower dataset [54] (1260 training images and 100 testing images). Only 10 randomly selected categories from each database were considered. For the flower dataset, the ten classes were Buttercup, Colts' Foot, Daffodil, Daisy, Dandelion, Fritillary, Iris, Sunflower, Bluebell, and Tulip. Similarly, for IMFDB, we randomly selected five male and five female actors. Small and blurred images were removed from IMFDB. The IMFDB dataset contains facial images of actors acquired from their different movie videos; each actor's images span different ages, appearances with and without makeup, and different hairstyles. This dataset has visually similar images whose appearances and structural features nonetheless differ. The same feature property holds for the flower dataset.
A.2 Results
The results shown in the following sub-sections were obtained with the same hyperparameter settings for both datasets: learning rate = 0.001, optimization algorithm = RMSProp, \(\beta \) = 0.9, batch size = 25, activation function = ReLU, number of epochs = 150. The regularization parameters were \(L_1\) = 0.0003, \(L_2\) = 0.0005 for IMFDB, and \(L_1\) = 0.0002, \(L_2\) = 0.0003 for the 17-flower dataset. Dropout with a rate of 0.5 was employed on the fully connected layers.
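As a minimal sketch (not the authors' exact implementation), the elastic-net term that these \(L_1\)/\(L_2\) values control adds a combined weight penalty to the classification loss; the weight vector below is a toy example:

```python
def elastic_net_penalty(weights, l1=0.0003, l2=0.0005):
    """Elastic-net penalty added to the loss: l1 * sum|w| + l2 * sum(w^2).

    Defaults use the IMFDB settings quoted above; the weights here are
    illustrative, not taken from the trained model.
    """
    l1_term = l1 * sum(abs(w) for w in weights)
    l2_term = l2 * sum(w * w for w in weights)
    return l1_term + l2_term

# Toy weight vector: penalty = 0.0003*3.5 + 0.0005*5.25 = 0.003675
penalty = elastic_net_penalty([0.5, -1.0, 2.0])
```

In training, this penalty would be added to the cross-entropy (or the proposed DI-fused) loss at each step; larger \(L_1\)/\(L_2\) values shrink the weights more aggressively, which is why the cosine-distance variant in Appendix B needed them increased.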
For our analysis, we compared the results obtained on the public datasets using our approach; the results are shown in Table 10. We evaluated the classification performance of these models with and without the influence of the dissimilarity index (DI).
It was noticed that, with DI, the classification performance improved by almost 2%, mostly ranging from 96 to 99%. InceptionV3 and ResNet50 gave higher accuracy rates than the other pretrained models. Despite the variation among actors' facial images, with DI the accuracy rate of the learning model increased by 1–2%. Thus, our approach has the generalized ability to classify images beyond charts based on both their similar and dissimilar features.
Table 10a, b show the overall and category-wise performance of the proposed approach for the IMFDB and flower datasets using VGG16 as the backbone architecture. For the flower dataset, elastic net + DI gave better results, with an increase of 2% to the 97–98% range, compared to the elastic net without DI. For the face dataset, the accuracy range was 95–97%; with DI, the accuracy rate improved by 4–6%. The model learned the flower database better than IMFDB, possibly because IMFDB contains images extracted from video sources with varying resolutions and sizes: images with good resolution were classified correctly, while only the poorly resolved ones were misclassified. Figure 8 shows the performance rates for the IMFDB and flower databases. Even with high variability, the model classified the images satisfactorily.
Appendix B. Analysis of Dissimilarity Index by using Other Distance Norm
The dissimilarity index (Eqn. 11) is computed using the L1 distance. To further analyze the impact of other distance norms, we replaced the L1 distance with the L2, Canberra, and cosine distance [48] indicators. The results are shown in Table 11. We followed the same hyperparameter settings as in Sect. 4.2. However, for the cosine distance the values of \(L_1\) and \(L_2\) were slightly increased to prevent the model from overfitting: \(L_1\) = 0.001 and \(L_2\) = 0.003 for our web chart dataset, \(L_1\) = 0.002 and \(L_2\) = 0.003 for FigureQA, and \(L_1\) = 0.001 and \(L_2\) = 0.002 for the ReVision corpus.
Using the L2 and Canberra distances, the classification accuracy obtained was above 96%, which is fairly good. For cosine similarity, larger values indicate better matches; consequently, when fused with the weights, these large values inflated the model weights, causing the model to overfit on similar chart types. The model performed well for FigureQA and ReVision, but the performance rate degraded on web chart images. To counter this, we increased the regularization parameters, after which the classification accuracy exceeded 93%. Overall, the L1 and L2 distances work very well with the proposed model; the Canberra distance gave good results, while the cosine distance yielded better results with slightly higher regularization parameters. We conclude that the model learns competently from features of dissimilar regions, in terms of chart style, appearance, and structural patterns, even when using other distance indicators.
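The four distance indicators compared above can be sketched as follows for a pair of feature vectors. This is a minimal illustration of the distance computations only; the actual dissimilarity index of Eqn. 11 operates on the learned region features, which are not reproduced here, and the toy vectors are hypothetical:

```python
import math

def l1_distance(a, b):
    """L1 (Manhattan) distance: sum of absolute differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

def l2_distance(a, b):
    """L2 (Euclidean) distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def canberra_distance(a, b):
    """Canberra distance; terms with a zero denominator contribute 0."""
    return sum(abs(x - y) / (abs(x) + abs(y))
               for x, y in zip(a, b) if abs(x) + abs(y) > 0)

def cosine_distance(a, b):
    """1 - cosine similarity (assumes non-zero vectors)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

# Toy feature vectors from two chart regions
f1, f2 = [0.2, 0.8, 0.5], [0.1, 0.9, 0.4]
distances = {
    "L1": l1_distance(f1, f2),
    "L2": l2_distance(f1, f2),
    "Canberra": canberra_distance(f1, f2),
    "cosine": cosine_distance(f1, f2),
}
```

Note the scale difference that drove the observations above: cosine distance is bounded in [0, 2] while Canberra grows with dimensionality, so whichever indicator replaces L1 in the index changes the magnitude of the term fused into the loss and, with it, the regularization needed.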
About this article
Cite this article
Mishra, P., Kumar, S. & Chaube, M.K. Classifying Chart Based on Structural Dissimilarities using Improved Regularized Loss Function. Neural Process Lett 54, 2385–2411 (2022). https://doi.org/10.1007/s11063-021-10735-z