Using Cluster Analysis to Assess the Impact of Dataset Heterogeneity on Deep Convolutional Network Accuracy: A First Glance

  • Mauro Mendez
  • Saul Calderon
  • Pascal N. Tyrrell
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 1087)


In this paper we perform cluster analysis using Fuzzy K-means over the image-based features of two models to assess how dataset heterogeneity impacts model accuracy. A highly heterogeneous dataset is associated with sparse data samples, which usually degrades model generalization and accuracy on test samples. We propose measuring the Coefficient of Variation (CV) of the resulting clusters as an estimate of data heterogeneity, and using it as a metric for predicting model generalization and test accuracy. We show that highly heterogeneous datasets, and hence high CV values, are common when the number of samples is insufficient. In our experiments with two different models and datasets, higher CV values considerably decreased model test accuracy. We tested ResNet-18 on binary classification of teeth x-ray scans and VGG16 on age regression from hand x-ray scans. The results suggest that cluster analysis can be used to identify the influence of heterogeneity on CNN test accuracy. Based on our experiments, we recommend a CV \(< 5\%\) to obtain satisfactory model test accuracy.
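The pipeline described above (fuzzy clustering of feature vectors, then a CV per cluster) can be sketched as follows. This is a minimal illustration and not the authors' implementation. The CV is the standard ratio \(\mathrm{CV} = 100 \cdot \sigma / \mu\), but the abstract does not say which quantity it is computed over. Here we assume, purely for illustration, that it is taken over the distances from each sample to its cluster center. The function names `fuzzy_c_means` and `cluster_cv`, the fuzzifier `m = 2`, and the toy 2-D points standing in for CNN features are all hypothetical.

```python
import math
import random
from statistics import mean, pstdev


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, seed=0):
    """Plain fuzzy c-means. Returns (centers, memberships)."""
    rng = random.Random(seed)
    dim = len(points[0])
    # Random initial memberships; each row is normalized to sum to 1.
    u = []
    for _ in points:
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    for _ in range(n_iter):
        # Center update: mean of the points weighted by membership**m.
        centers = []
        for k in range(n_clusters):
            weights = [row[k] ** m for row in u]
            wsum = sum(weights)
            centers.append([
                sum(w * p[d] for w, p in zip(weights, points)) / wsum
                for d in range(dim)
            ])
        # Membership update: proportional to distance**(-2 / (m - 1)).
        for i, p in enumerate(points):
            dists = [max(math.dist(p, c), 1e-12) for c in centers]
            inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
            total = sum(inv)
            u[i] = [v / total for v in inv]
    return centers, u


def cluster_cv(points, centers, memberships):
    """CV (in percent) of sample-to-center distances, per cluster.

    Assumption: each sample is assigned to its highest-membership
    cluster, and the CV is taken over distances to that center.
    """
    groups = [[] for _ in centers]
    for p, row in zip(points, memberships):
        k = max(range(len(row)), key=row.__getitem__)
        groups[k].append(math.dist(p, centers[k]))
    cvs = []
    for dists in groups:
        mu = mean(dists) if dists else 0.0
        # An empty or zero-spread cluster has no defined CV.
        cvs.append(100.0 * pstdev(dists) / mu if mu > 0 else float("nan"))
    return cvs


if __name__ == "__main__":
    # Toy 2-D points standing in for CNN feature vectors (hypothetical data).
    features = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
                (10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0)]
    centers, memberships = fuzzy_c_means(features, n_clusters=2)
    for k, cv in enumerate(cluster_cv(features, centers, memberships)):
        print(f"cluster {k}: CV = {cv:.2f}%")
```

In practice the inputs would be high-dimensional features extracted from the trained network, and the resulting per-cluster values would be compared against the \(< 5\%\) guideline stated in the abstract.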


Cluster analysis · Heterogeneity · Transfer learning · Small dataset · Convolutional Neural Network



Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  • Mauro Mendez (1, corresponding author)
  • Saul Calderon (1)
  • Pascal N. Tyrrell (2)

  1. School of Computing, Costa Rica Institute of Technology, Cartago, Costa Rica
  2. Departments of Medical Imaging and Statistical Sciences, University of Toronto, Toronto, Canada
