Volume 39, Issue 4, pp 290–300

The Clinical Data Intelligence Project

A smart data initiative
  • Daniel Sonntag
  • Volker Tresp
  • Sonja Zillner
  • Alexander Cavallaro
  • Matthias Hammon
  • André Reis
  • Peter A. Fasching
  • Martin Sedlmayr
  • Thomas Ganslandt
  • Hans-Ulrich Prokosch
  • Klemens Budde
  • Danilo Schmidt
  • Carl Hinrichs
  • Thomas Wittenberg
  • Philipp Daumke
  • Patricia G. Oppelt


This article introduces a new project that combines clinical data intelligence and smart data: the “Klinische Datenintelligenz” (KDI) project, which is funded by the Federal Ministry for Economic Affairs and Energy (BMWi). The project transfers research and development (R&D) results on the analysis of data generated in clinical routine into specific medical domains. We present the project structure and goals, how patient care should be improved, and the joint efforts of data and knowledge engineering, information extraction (from textual and other unstructured data), statistical machine learning, and decision support, together with their integration into special use cases moving towards individualised medicine. In particular, we describe some details of our medical use cases and our cooperation with two major German university hospitals.


Clinical Decision Support, Semantic Annotation, Protected Health Information, Unstructured Data, Data Intelligence
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2015

Authors and Affiliations

  • Daniel Sonntag (1, email author)
  • Volker Tresp (2)
  • Sonja Zillner (2)
  • Alexander Cavallaro (3)
  • Matthias Hammon (3)
  • André Reis (3)
  • Peter A. Fasching (3)
  • Martin Sedlmayr (3)
  • Thomas Ganslandt (3)
  • Hans-Ulrich Prokosch (3)
  • Klemens Budde (4)
  • Danilo Schmidt (4)
  • Carl Hinrichs (4)
  • Thomas Wittenberg (5)
  • Philipp Daumke (6)
  • Patricia G. Oppelt (7)
  1. German Research Center for Artificial Intelligence, Saarbrücken, Germany
  2. Siemens, München, Germany
  3. Uker, Erlangen, Germany
  4. Charité, Berlin, Germany
  5. Fraunhofer IIS, Erlangen, Germany
  6. Averbis, Freiburg, Germany
  7. IFG, Erlangen, Germany
