Ontology-Guided Principal Component Analysis: Reaching the Limits of the Doctor-in-the-Loop

  • Sandra Wartner
  • Dominic Girardi
  • Manuela Wiesinger-Widi
  • Johannes Trenkler
  • Raimund Kleiser
  • Andreas Holzinger
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9832)


Biomedical research requires deep domain expertise to analyze complex data sets, supported by the mathematical expertise of data scientists who design and develop sophisticated methods and tools. Such methods and tools require not only preprocessing of the data but, above all, a meaningful selection of input. Data scientists usually lack sufficient background knowledge about the origin of the data and the biomedical problems to be solved; consequently, a doctor-in-the-loop can be of great help. In this paper we review the viability of integrating an analysis-guided visualization component into an ontology-guided data infrastructure, exemplified by principal component analysis. We evaluated this approach by examining its potential for intelligent support of medical experts, using cerebral aneurysm research as a case study.


Principal component analysis · Ontology · Data mining · PCA · Data warehousing · Doctor-in-the-loop



Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Sandra Wartner (1)
  • Dominic Girardi (1)
  • Manuela Wiesinger-Widi (1)
  • Johannes Trenkler (2)
  • Raimund Kleiser (2)
  • Andreas Holzinger (3)
  1. Research Unit Medical Informatics at RISC Software GmbH, Johannes Kepler University, Hagenberg and Linz, Austria
  2. Institute of Neuroradiology at Neuromed Campus of the Kepler University Klinikum, Linz, Austria
  3. Research Unit, HCI-KDD, Institute for Medical Informatics, Statistics and Documentation, Medical University Graz, Graz, Austria
