Online Variational Learning for Medical Image Data Clustering

  • Meeta Kalra
  • Michael Osadebey
  • Nizar Bouguila
  • Marius Pedersen
  • Wentao Fan
Part of the Unsupervised and Semi-Supervised Learning book series (UNSESUL)


Data mining is a broad research area concerned with pattern discovery and feature extraction, and it is applied in many critical domains. In clinical practice, data mining has emerged as a way to assist clinicians in the early detection, diagnosis, and prevention of disease. Advances in computational methods have led to the use of machine learning in multi-modal clinical image analysis. One recent approach is online learning, in which data become available sequentially and the best predictor for future data is updated at each step, as opposed to batch learning techniques, which generate the best predictor by learning from the entire data set at once.
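
The contrast between online and batch learning can be illustrated with a deliberately simple estimator. The sketch below is not the chapter's algorithm: it estimates a mean, and the function names and the 1/t step size are assumptions chosen only for illustration.

```python
from typing import Iterable, List


def batch_mean(data: List[float]) -> float:
    """Batch learning: the estimate is computed from the entire data set at once."""
    return sum(data) / len(data)


def online_mean(stream: Iterable[float]) -> List[float]:
    """Online learning: the estimate is refreshed each time a new point arrives."""
    estimate = 0.0
    history = []
    for t, x in enumerate(stream, start=1):
        step = 1.0 / t  # decreasing step size (illustrative assumption)
        estimate += step * (x - estimate)  # move the estimate toward the new point
        history.append(estimate)
    return history


data = [2.0, 4.0, 6.0, 8.0]
estimates = online_mean(data)  # one estimate per observation seen so far
```

With this particular step size the final online estimate coincides with the batch estimate, but the online version never needs to hold the whole data set in memory.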

In this chapter, we examine and analyse multi-modal medical images by developing an unsupervised machine learning algorithm based on online variational inference for finite inverted Dirichlet mixture models. Our primary focus is to validate the developed approach on medical images. We do so by applying the algorithm to both synthetic and real data sets, and we test its ability to detect challenging real-world diseases, namely brain tumour, lung tuberculosis, and melanomic skin lesions.
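
For readers unfamiliar with the inverted Dirichlet distribution, the following minimal sketch evaluates its log-density and the log-density of a finite mixture of such components. It uses the standard form of the density for a positive vector x of dimension D with D + 1 positive parameters; the function names are hypothetical, and the code does not reproduce the chapter's variational updates.

```python
import math
from typing import Sequence


def inverted_dirichlet_logpdf(x: Sequence[float], alpha: Sequence[float]) -> float:
    """Log-density of the inverted Dirichlet distribution.

    x holds D positive values; alpha holds D + 1 positive parameters.
    """
    if len(alpha) != len(x) + 1:
        raise ValueError("alpha must have exactly one more entry than x")
    if any(v <= 0 for v in x) or any(a <= 0 for a in alpha):
        raise ValueError("x and alpha must be strictly positive")
    total = sum(alpha)
    # Normalising constant: Gamma(|alpha|) / prod_d Gamma(alpha_d)
    log_norm = math.lgamma(total) - sum(math.lgamma(a) for a in alpha)
    # zip stops after D terms, so only the first D parameters pair with x
    log_kernel = sum((a - 1.0) * math.log(v) for a, v in zip(alpha, x))
    return log_norm + log_kernel - total * math.log1p(sum(x))


def mixture_logpdf(
    x: Sequence[float],
    weights: Sequence[float],
    alphas: Sequence[Sequence[float]],
) -> float:
    """Log-density of a finite mixture, computed with the log-sum-exp trick."""
    terms = [
        math.log(w) + inverted_dirichlet_logpdf(x, a)
        for w, a in zip(weights, alphas)
    ]
    m = max(terms)
    return m + math.log(sum(math.exp(t - m) for t in terms))
```

In one dimension with both parameters equal to 1, the density reduces to 1 / (1 + x)^2, which gives a quick sanity check of the implementation.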


Keywords: Unsupervised learning; Online variational inference; Inverted Dirichlet distribution; Mixture models; Healthcare; Brain tumour detection; Lung tuberculosis; Skin lesion diagnosis



Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  • Meeta Kalra (1, corresponding author)
  • Michael Osadebey (2)
  • Nizar Bouguila (1)
  • Marius Pedersen (2)
  • Wentao Fan (3)
  1. Concordia Institute for Information Systems Engineering, Concordia University, Montreal, Canada
  2. Department of Computer Science, Norwegian University of Science and Technology, Trondheim, Norway
  3. Department of Computer Science and Technology, Huaqiao University, Xiamen, China
