A data- and model-driven approach for cancer treatment


All people are unique and so are their diseases. Our genomes, disease histories, behavior, and lifestyles are all different; therefore it is not too surprising that people often respond differently when administered the same drugs. Cancer, in particular, is a complex and heterogeneous disease, originating in patients with different genomes, in cells with different epigenomes, formed and evolving on the basis of random processes, with the response to therapy depending not only on the individual cancer cell but also on many features of the patient. Selection of an optimal therapy will therefore require a deep molecular analysis comprising both the patient and their tumor (e.g., comprehensive molecular tumor analysis [CMTA]), and much better personalized prediction of response to possible therapies. Currently, we are at an inflection point at which advances in technology, decreases in the costs of sequencing and other molecular analyses, and increases in computing power are converging, forming the foundation on which to build a data-driven approach to personalized oncology. In this article we discuss the deep molecular characterization of individual tumors and patients as the basis of not only current precision oncology but also of computational models (‘digital twins’), the foundation for a truly personalized therapy selection of the future.

We are all very different, with different genomes, different disease histories, different behavior and molecularly different diseases. It is therefore not surprising that we often react differently to drugs we receive. To overcome this in oncology, we require much deeper data on individual tumors and patients (e.g., comprehensive molecular tumor analysis, CMTA), and much better personalized prediction of the effects of possible therapies, initially through precision medicine, but increasingly through digital models of individual tumors and patients, our “digital twins”.

Cancer management

Cancer continues to exert a major socioeconomic burden worldwide. Each year in Europe, there are approximately 3.7 million new cases and 1.9 million deaths due to cancer [1]. In Germany alone, there are over 500,000 new cases and 200,000 deaths due to cancer each year, with less than 50% of German cancer patients surviving for more than 10 years. The economic burden is enormous, with cancer costing Europe more than 126 billion euros every year [2]. A significant portion of this amount is spent on drugs that often help only a fraction of the patients.

At the heart of these statistics is, in part, the fact that in many cases the cancer is only diagnosed in late stages, when it has already spread to other organs and acquired higher molecular heterogeneity, complicating its clinical management. Another issue accounting for cancer mortality is the limited number of available, approved drugs and access to clinical trials. Furthermore, every patient and every tumor is different, and reacts differently to treatment. How effective a treatment will be depends on a combination of individual attributes of the patient, such as age, general condition, genetic profile, the molecular landscape and biological properties of the tumor and interactions with the host (e.g., pharmacogenetics, intestinal microbiota, status of the immune system). Detailed and accurate knowledge about the individual tumor and patient is essential for the implementation of efficient precision medicine approaches.

While the uniqueness of each individual can today be considered in some areas of medicine (e.g., surgery), it is not yet adequately addressed in drug-based therapy, although major progress has undoubtedly been made for patient stratification. In particular, developments in molecular profiling tools, mainly genomics, but also transcriptomics and proteomics, are not only enabling insight into the mechanisms of cancer and the identification of patient groups responsive to a particular therapy, but they are also demonstrating how heterogeneous cancer cells are across individual tumors, people, and populations. Recognizing the inherent individuality of each patient and their tumor, precision medicine approaches based on the characterization of an individual’s cancer at the molecular level are starting to enter the mainstream of clinical practice for some cancer types and are providing new hope for cancer patients.

State of the art: precision oncology

Patient stratification using biomarkers as companion diagnostics for targeted therapies has improved the success rate of cancer treatments, representing a real paradigm shift in clinical practice. In the clinic, single gene and multigene panel sequencing are most often used to detect somatic alterations in a tumor sample. For example, the human epidermal growth factor receptor 2 gene (HER2, alias ERBB2), a proto-oncogene encoding the HER2 receptor tyrosine kinase, is used as a biomarker for selecting treatment options. The use of HER2-directed agents, such as the monoclonal antibodies trastuzumab and pertuzumab, has dramatically improved breast cancer patient outcomes in all stages of the disease [3, 4]. However, although these methods have enriched the treatment toolkit by enabling stratification of patients into subgroups of potential responders, they only test for alterations in a limited number of genes (those present on the panel) and still only provide a treatment option for those patients who happen to carry the selected markers, typically only a small fraction of the cases [e.g., 25% of breast cancers overexpress HER2; 15% of lung cancer patients carry alterations in epidermal growth factor receptor (EGFR) or receptor tyrosine kinase proto-oncogene 1 (ROS1)], limiting the overall impact of single biomarker strategies [5]. Moreover, additional individual genetic alterations characteristic of each tumor may be associated with resistance to the biomarker-selected therapy. For instance, only 20–50% of the patients carrying the ERBB2 amplification marker actually respond to trastuzumab. Tumor heterogeneity represents a further challenge, because the outcome of drug response is determined by cell populations that carry different biological features [6].

“Future precision oncology based on multidimensional molecular data from both the patient and tumor”

Large-scale cancer sequencing initiatives such as the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA) are characterizing the molecular complexity of a large number of cancer types in great detail and revealing that for some cancer types, identification of single key driver mutations may not be possible. These efforts have catalyzed a shift in focus from individual driver genes towards a more expansive and heterogeneous cancer mutational landscape (e.g., [7, 8]). By restricting molecular analysis to just a few regions of the tumor genome, most precision oncology programs (which typically scan cancer gene panels) run the risk of missing very important information that may impact treatment choice. Additional omics, such as transcriptome sequencing (RNAseq), offer valuable insights for orienting treatment recommendations, such as the MammaPrint signature that predicts treatment response in breast cancer [9]. RNAseq enables access to a deeper level of knowledge on the tumor biology (oncogenic changes in gene expression, epigenetic effects, splice variants, information on unexpected gene fusions, etc.) [10, 11], and the tumor microenvironment, which is becoming an increasingly attractive target for clinical intervention, with drugs targeting hypoxia, for instance. The breakthrough of immune checkpoint inhibitors (ICIs) has changed the precision medicine landscape [12] and shows that the focus of targeted therapies extends well beyond mutated forms of cancer genes to the immune environment. One predictor of response to ICIs is tumor mutation burden (TMB), which cannot be estimated with small gene panels. More than 50% of cutaneous melanomas, typically associated with a large number of somatic mutations including BRAF V600E, show durable response to ICIs with a relatively high response rate to anti-PD1 therapy (PD1: programmed cell death protein 1) [13].
In this context, identifying biomarkers that are predictive for response to immune checkpoint blockade will become increasingly important for selecting personalized treatment options. For example, in advanced colorectal cancer, overall response to anti-PD1 therapy is low; however, a subpopulation of patients with high TMB and genomic instability is responsive. Thus, a deep molecular analysis of the tumor and its microenvironment will facilitate selection of immune and targeted therapy options and provide new opportunities for the identification of relevant biomarkers for therapeutic response and side effects. Furthermore, recent data show that combination therapies are generally more effective [14].
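The point that TMB cannot be estimated reliably from small gene panels can be illustrated with simple arithmetic. The following is a minimal sketch, assuming that TMB is expressed as somatic mutations per megabase of sequenced territory and that mutation counts follow Poisson statistics; the function names, the panel size of 0.1 Mb, and the round figure of 30 Mb for an exome are illustrative assumptions, not values taken from the article.

```python
import math


def tumor_mutation_burden(n_somatic_mutations, covered_megabases):
    """TMB as somatic mutations per megabase of sequenced territory."""
    return n_somatic_mutations / covered_megabases


def relative_sampling_error(true_tmb, covered_megabases):
    """Relative standard error of a TMB estimate under a Poisson assumption.

    A Poisson count with mean N has standard deviation sqrt(N), so the
    relative error of the estimate is 1 / sqrt(N).
    """
    expected_count = true_tmb * covered_megabases
    return 1.0 / math.sqrt(expected_count)


# Hypothetical tumor with 10 mutations/Mb, assayed on two territories.
for label, megabases in (("small panel (assumed 0.1 Mb)", 0.1),
                         ("exome (assumed 30 Mb)", 30.0)):
    error = relative_sampling_error(10.0, megabases)
    print(f"{label}: relative sampling error about {error:.0%}")
```

Under these assumptions the small panel is expected to contain only about one mutation, so the estimate is dominated by sampling noise, whereas the exome-scale territory yields a few hundred mutations and a much tighter estimate.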

This shift towards an increasingly detailed characterization of tumor (and patient) from single gene mutations and gene panels to whole exome sequencing (WES) analyses is already being undertaken in several clinical setups. The integration of RNAseq data, however, is still rarely implemented as of now. We have initiated the Treat20plus pilot program, a German Federal Ministry of Education and Research (BMBF)-funded project focused on deep molecular characterization and modeling of tumors from metastatic melanoma patients, in collaboration with the Charité Comprehensive Center, Berlin. We have developed the Comprehensive Molecular Tumor Analysis (CMTA) integrating low coverage genome, deep exome and deep bulk tumor transcriptome (Fig. 1), in which combined data are interpreted in a comprehensive yet concise report for the clinical tumor board. These data are also interpreted together with the routine tumor pathology, and innovative technologies such as CyTOF-based Imaging Mass Cytometry. We are using the Hyperion-based highly multiplexed (40 antigens in parallel) system, which enables spatially resolved proteome analysis to gain more information on the heterogeneity of the tumor, a major determinant in disease recurrence, and the tumor microenvironment; in particular, the presence, location and activity of different types of immune cells. We are also considering the routine characterization of the status of the immune system by emulsion-based techniques [15] or single cell transcriptome sequencing of circulating immune cells to better understand the factors determining the likely response and side effects of different immunotherapies in specific patients.

Fig. 1

Comparison of different diagnostic approaches. CMTA Comprehensive Molecular Tumor Analysis (adapted from [11], courtesy of S. Karger AG, Basel, Switzerland)

Prediction of drug response in precision medicine

Following deep molecular analysis of the tumor, therapies can be proposed (or rejected) on the basis of causal arguments (e.g., availability of a drug against a fusion protein driving the tumor) or on the basis of correlations of specific biomarkers or signatures with response or nonresponse to different drugs. While some of these actionable variants (mutations, specific fusion genes or transcripts) are common in specific tumors (a good basis for panel sequencing), they often occur at lower frequency in other tumors, where these therapy-relevant alterations are often only detected by analyses at the depth and breadth provided by approaches such as the CMTA and similarly comprehensive methods.

Predicting drug response from complex data

In view of the enormous complexity of the human body and the disease tissue, and the many components able to affect the response to drugs, it is likely that mechanistic or hybrid models, combining mechanistic components with artificial intelligence (AI)-based tools, will be needed to achieve truly personalized therapy choice.

As the knowledge base on cancer, cellular transduction and molecular interactions widens, so does our ability to generate computational models with the capacity to accurately represent the complex networks and cross-talk determining cancer progression and drug response [16,17,18,19,20,21]. In particular, mechanistic models based on ordinary differential equations (ODEs) are among the most promising approaches for quantitatively capturing the dynamic behavior of the complex cellular processes associated with cancer and facilitating individualized predictions of drug response [18,19,20,21]. To simulate the effect of drugs on a specific tumor, a mechanistic model that integrates knowledge regarding relevant cancer signaling pathways together with detailed mechanistic drug data has to be personalized using the molecular information from tumor and patient (functionally relevant sequence variants or gene fusions, expression changes, etc.). Relevant changes are then used to modify the abundance or the functional features of the corresponding objects in the model [18, 19].
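The personalization step described above can be sketched with a deliberately tiny ODE model. This is an illustration under stated assumptions, not the authors' model: a single receptor activates a single kinase, a drug competitively reduces the free receptor, and a patient is represented by scaling receptor abundance (expression change) or the activation rate (activating variant). All names and parameter values (`PathwayParams`, `personalize`, `drug_ki`, and so on) are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class PathwayParams:
    receptor_total: float = 1.0  # relative receptor abundance
    kinase_total: float = 1.0    # relative kinase abundance
    k_act: float = 1.0           # activation rate constant
    k_deact: float = 1.0         # deactivation rate constant
    drug_ki: float = 0.1         # inhibition constant of the drug


def personalize(base, receptor_fold_change=1.0, activation_factor=1.0):
    """Map molecular findings onto model objects (abundance, kinetics)."""
    return replace(
        base,
        receptor_total=base.receptor_total * receptor_fold_change,
        k_act=base.k_act * activation_factor,
    )


def d_active_kinase(active, p, drug_conc):
    """Right-hand side of the ODE for the active kinase pool."""
    receptor_free = p.receptor_total / (1.0 + drug_conc / p.drug_ki)
    return p.k_act * receptor_free * (p.kinase_total - active) - p.k_deact * active


def simulate_active_kinase(p, drug_conc=0.0, t_end=50.0, dt=0.01):
    """Integrate with classical fourth-order Runge-Kutta; return the final value."""
    active = 0.0
    for _ in range(int(round(t_end / dt))):
        k1 = d_active_kinase(active, p, drug_conc)
        k2 = d_active_kinase(active + 0.5 * dt * k1, p, drug_conc)
        k3 = d_active_kinase(active + 0.5 * dt * k2, p, drug_conc)
        k4 = d_active_kinase(active + dt * k3, p, drug_conc)
        active += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return active


reference = PathwayParams()
patient = personalize(reference, receptor_fold_change=4.0)  # hypothetical overexpression
print("reference, untreated:", round(simulate_active_kinase(reference), 3))
print("patient, untreated:  ", round(simulate_active_kinase(patient), 3))
print("patient, treated:    ", round(simulate_active_kinase(patient, drug_conc=0.1), 3))
```

Realistic models of this kind contain many interacting pathways and species; the sketch only shows how a measured expression change or variant can be translated into a changed abundance or rate constant before simulation.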

The main advantages of mechanistic models are the integration of data from diverse sources and experimental protocols, as well as the possibility to generate hypotheses for causal mechanisms through the design of in silico experiments to answer as yet unsolved questions [22, 23]. Perturbing individual components within the mechanistic network (e.g., through simulation of mutations or changes in gene expression) facilitates the study of functional effects on different pathways and the identification of highly sensitive components that represent promising treatment targets. Furthermore, virtual drug screens can be performed that incorporate the patient’s background, enabling discrimination between effective and ineffective drugs, and facilitating selection of optimal dosing and prediction of off-target effects that might lead to severe side effects (Fig. 2; [22, 24]). As such, mechanistic modeling represents a powerful and promising approach for virtual clinical trials, drug target identification, and personalized medicine (Fig. 2; see also references [18,19,20, 24]).
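A virtual drug screen of the kind described here can be sketched as a loop over a drug panel applied to a personalized model. The example below is a toy under stated assumptions: the pathway is reduced to an algebraic steady-state expression, each drug is assumed to block 90% of its target, and the two patients and the drug names are hypothetical. It is meant only to show how a downstream activating alteration can make an upstream inhibitor ineffective in silico.

```python
def pathway_output(receptor_expr, kinase_constitutive,
                   receptor_inhibition=0.0, kinase_inhibition=0.0):
    """Steady-state output of a toy receptor -> kinase cascade."""
    receptor_activity = receptor_expr * (1.0 - receptor_inhibition)
    # Saturating activation of the kinase by the receptor.
    upstream_drive = receptor_activity / (1.0 + receptor_activity)
    # A constitutively active kinase ignores its upstream input.
    kinase_activity = 1.0 if kinase_constitutive else upstream_drive
    return kinase_activity * (1.0 - kinase_inhibition)


# Hypothetical drug panel: each entry perturbs one model component.
DRUG_PANEL = {
    "receptor_inhibitor": {"receptor_inhibition": 0.9},
    "kinase_inhibitor": {"kinase_inhibition": 0.9},
}

# Hypothetical patients: A overexpresses the receptor, B carries an
# activating alteration in the downstream kinase.
PATIENT_A = {"receptor_expr": 4.0, "kinase_constitutive": False}
PATIENT_B = {"receptor_expr": 1.0, "kinase_constitutive": True}


def virtual_screen(patient):
    """Rank drugs by the fractional reduction of pathway output."""
    baseline = pathway_output(**patient)
    reductions = {}
    for drug, effect in DRUG_PANEL.items():
        treated = pathway_output(**patient, **effect)
        reductions[drug] = 1.0 - treated / baseline
    return dict(sorted(reductions.items(), key=lambda kv: kv[1], reverse=True))


for name, patient in (("A", PATIENT_A), ("B", PATIENT_B)):
    print(name, {drug: round(r, 2) for drug, r in virtual_screen(patient).items()})
```

In this toy setting the receptor inhibitor lowers the output for patient A but has no effect for patient B, whose kinase is active regardless of the receptor. The same loop structure applies when the algebraic expression is replaced by a full dynamic simulation.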

The model parametrization challenge

While the structure of these models is well defined, based on the results of basic research over many decades, little is known about the kinetic constants and other parameters in the complex environment of the cells and organisms; these values are, however, absolutely essential for making quantitative predictions. To estimate these types of parameters, we have to iteratively ‘reverse engineer’ the parameters of the system by minimizing the differences between model predictions and experimental data. In our view, this constitutes the major remaining bottleneck on the way to a data- and model-driven personalized medicine of the future. Solving this challenge will represent a major step forward towards accurate predictions of an individual’s response to targeted cancer drugs, allowing direct use of personalized computational models to help determine the optimal therapy choice.
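The idea of reverse engineering parameters by minimizing the difference between model predictions and data can be shown for a single parameter. The sketch below is an assumption-laden toy: the "model" is a one-parameter exponential decay, the "experimental data" are synthetic and noise-free, and the minimizer is a plain golden-section search chosen because it needs no external library. Models with many parameters generally require more scalable optimization methods than this.

```python
import math


def model_signal(rate_constant, t):
    """Toy model prediction: first-order decay of a signal."""
    return math.exp(-rate_constant * t)


def sum_squared_error(rate_constant, observations):
    """Mismatch between model predictions and (time, value) observations."""
    return sum((model_signal(rate_constant, t) - y) ** 2 for t, y in observations)


def golden_section_minimize(objective, lower, upper, tol=1e-6):
    """Minimize a unimodal one-dimensional objective on [lower, upper]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lower, upper
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = objective(c), objective(d)
    while b - a > tol:
        if fc < fd:
            # Minimum lies in [a, d]; reuse c as the new upper probe.
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = objective(c)
        else:
            # Minimum lies in [c, b]; reuse d as the new lower probe.
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = objective(d)
    return 0.5 * (a + b)


# Synthetic, noise-free "measurements" generated from a known rate constant.
TRUE_RATE = 0.7
SYNTHETIC_DATA = [(t, math.exp(-TRUE_RATE * t)) for t in (0.5, 1.0, 2.0, 4.0)]

fitted_rate = golden_section_minimize(
    lambda k: sum_squared_error(k, SYNTHETIC_DATA), 0.01, 5.0
)
print("fitted rate constant:", round(fitted_rate, 4))
```

With real, noisy measurements the fitted value would carry uncertainty, and several parameter sets may explain the data equally well, which is part of why parametrization is described as a bottleneck.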

From purely mechanistic to hybrid models

In many other areas (e.g., weather forecasts, virtual crash tests), mechanistic computational models have been shown to be excellent tools, with the ability to integrate a wide range of data. For processes where no information on the exact molecular mechanisms is available, ‘hybrid’ models that combine mechanistic model components with classical AI techniques (e.g., neural nets) might be of use, combining the strengths (and avoiding the weaknesses) of both strategies.
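One simple way to picture a hybrid model is a mechanistic prediction plus a data-driven correction for effects the mechanism does not describe. The sketch below makes strong simplifying assumptions: the mechanistic part is a one-site inhibition curve, and the learned part is a single least-squares weight on one covariate, used here as a stand-in for the neural nets mentioned in the text. The function names, the covariate, and the synthetic training data are hypothetical.

```python
def mechanistic_viability(dose, ic50):
    """Mechanistic component: simple one-site inhibition curve."""
    return 1.0 / (1.0 + dose / ic50)


def fit_residual_weight(samples, ic50):
    """Data-driven component: least-squares weight for one covariate.

    Each sample is (dose, covariate, observed_viability). The weight is
    fitted to the residual left over after the mechanistic prediction.
    """
    numerator = 0.0
    denominator = 0.0
    for dose, covariate, observed in samples:
        residual = observed - mechanistic_viability(dose, ic50)
        numerator += covariate * residual
        denominator += covariate * covariate
    return numerator / denominator


def hybrid_viability(dose, covariate, ic50, weight):
    """Hybrid prediction: mechanistic curve plus learned correction."""
    return mechanistic_viability(dose, ic50) + weight * covariate


# Synthetic training data in which an unmodeled covariate shifts the response.
IC50 = 1.0
TRUE_WEIGHT = 0.2
TRAINING = [
    (dose, cov, mechanistic_viability(dose, IC50) + TRUE_WEIGHT * cov)
    for dose, cov in ((0.5, 0.1), (1.0, 0.5), (2.0, 1.0), (4.0, 0.3))
]

weight = fit_residual_weight(TRAINING, IC50)
print("learned weight:", round(weight, 3))
print("hybrid prediction:", round(hybrid_viability(1.0, 0.5, IC50, weight), 3))
```

The design choice illustrated is the division of labor: known mechanism is encoded explicitly, and only the unexplained remainder is learned from data.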

From modeling the tumor to modeling the patient: the DigiTwins concept

The effects and side effects of an orally administered cancer drug depend on a range of factors; these include the enormous biological complexity of tumor cells and the heterogeneity of the tumor, the metabolism of the drug by intestinal microbiota, the pharmacogenomics determined by the patient genome, possible side effects on key cell types and, especially for immunotherapy, the patient’s immune system. It will therefore be essential to move from just modeling the tumor to also modeling the relevant tissues and cell types of the patient, communicating by exchange of signals (in this case the concentration of the active drug forms over time), as well as the complex interactions of the tumor with the immune system. This concept can, however, be generalized to any other disease area, with virtual patients constructed by modeling relevant components and processes in the individual patient, as well as their interactions, to develop truly personalized therapy choice, prevention and selection of well-being measures, not only for an increasing number of patients but also for healthy individuals (see www.digitwins.org).

Fig. 2

Workflow and applications of mechanistic modeling


Drugs affect the complex and highly variable biological networks in our bodies. To predict the response of a particular person/tumor to a drug, we therefore have to characterize their relevant biological networks in great detail. Such an approach was unthinkable or simply too expensive even a short time ago, costing much more than the classical pathology-based approach to diagnostics and empirical treatment selection. However, it is now increasingly feasible due to progress in a range of fields from next generation sequencing (NGS) and other -omics, imaging and sensor-based techniques to computing.

“Mechanistic models quantitatively capture the complex cellular processes associated with cancer”

The ability to predict the effects of a drug or drug combinations on individual patients in silico opens up numerous new opportunities for the future of precision oncology. By increasing the likelihood that the treatment given to a patient will work, the positive impacts will be felt not only by patients but also by health care systems in general. Through improvement of outcomes and quality of life, a potential reduction in therapeutic, disease-recurrence and end-stage clinical costs is anticipated that would counterbalance a potential increase in diagnostic (omics-based medicine) costs (NGS costs are already rapidly decreasing), and promote the reduction of inequalities across health systems by guiding the use of resources more effectively.

The data- and model-driven approach to treating cancer patients we describe is within grasp and not part of our distant future, with ongoing proof-of-principle clinical pilot studies, such as Treat20plus, and research scale efforts underway (e.g., iPC, a Horizon 2020 research and innovation project focused on predicting treatment outcomes for pediatric cancer, www.ipc-project.eu). The Treat20plus project, in particular, highlights the seamless integration of this approach within the current care framework, with results forming part of the toolkit used to inform molecular tumor board discussions and subsequent treatment decisions. In the broader context, initiatives such as DigiTwins (www.digitwins.org) are looking to a future sustainable vision of health care rooted in a data- and model-driven approach. Technology developments will be key to this vision. The routine inclusion of technologies such as single cell NGS [25] and in situ sequencing [26], as well as sensor-based and imaging techniques, will become critical in enabling the characterization of patients and their tumors in sufficient detail and will serve as the basic data input for the models of the future.

Clinical relevance

  • Cancer treatment boards can be informed by a data- and model-driven approach as part of the current patient diagnostic and therapy selection process.

  • The data- and model-driven approach to personalized oncology is already being tested in pilot clinical studies.

  • Further model development and optimization are required to ensure specificity, accuracy, and sensitivity of model predictions.





  1. World Health Organisation, Data and Statistics: http://www.euro.who.int/en/health-topics/noncommunicable-diseases/cancer/data-and-statistics

  2. Luengo-Fernandez R, Leal J, Gray A et al (2013) Economic burden of cancer across the European Union: a population-based cost analysis. Lancet Oncol 14(12):1165–1174

  3. Slamon D, Eiermann W, Robert N et al (2011) Adjuvant trastuzumab in HER2-positive breast cancer. N Engl J Med 365(14):1273–1283

  4. Paplomata E, Nahta R, O’Regan RM (2015) Systemic therapy for early-stage HER2-positive breast cancers: time for a less-is-more approach? Cancer 121(4):517–526

  5. de Gramont A, Watson S, Ellis LM (2014) Pragmatic issues in biomarker evaluation for targeted therapies in cancer. Nat Rev Clin Oncol 12(4):197–212

  6. Dagogo-Jack I, Shaw AT (2018) Tumour heterogeneity and resistance to cancer therapies. Nat Rev Clin Oncol 15:81–94

  7. Hovestadt V, Jones DTW, Picelli S et al (2014) Decoding the regulatory landscape of medulloblastoma using DNA methylation sequencing. Nature 510(7506):537–541

  8. Weischenfeldt J, Simon R, Feuerbach L et al (2013) Integrative genomic analyses reveal an androgen-driven somatic alteration landscape in early-onset prostate cancer. Cancer Cell 23(2):159–170

  9. van’t Veer L, Yau C, Yu NY et al (2017) Tamoxifen therapy benefit for patients with 70-gene signature high and low risk. Breast Cancer Res Treat 166(2):593–601

  10. Sultan M, Schulz MH, Richard H et al (2008) A global view of gene activity and alternative splicing by deep sequencing of the human transcriptome. Science 321:956–960

  11. Schütte M, Ogilvie LA, Rieke DT et al (2017) Cancer Precision Medicine: Why More Is More and DNA Is Not Enough. Public Health Genomics 20(2):70–80

  12. Snyder A, Makarov V, Merghoub T et al (2014) Genetic basis for clinical response to CTLA-4 blockade in melanoma. N Engl J Med 371(23):2189–2199

  13. Topalian SL, Taube JM, Anders RA, Pardoll DM (2016) Mechanism-driven biomarkers to guide immune checkpoint blockade in cancer therapy. Nat Rev Cancer 16(5):275–287

  14. Sicklick JK, Kato S, Okamura R et al (2019) Molecular profiling of cancer patients enables personalized combination therapy: the I‑PREDICT study. Nat Med 25(5):744–750

  15. Devulapally PR, Bürger J, Mielke T et al (2018) Simple paired heavy- and light-chain antibody repertoire sequencing using endoplasmic reticulum microsomes. Genome Med 10(1):34

  16. Kolch W, Halasz M, Granovskaya M et al (2015) The dynamic control of signal transduction networks in cancer cells. Nat Rev Cancer 15(9):515–527

  17. Tyson JJ, Baumann WT, Chen C et al (2011) Dynamic modelling of oestrogen signalling and cell fate in breast cancer cells. Nat Rev Cancer 11(7):523–532

  18. Wierling C, Kessler T, Ogilvie LA et al (2015) Network and systems biology: essential steps in virtualising drug discovery and development. Drug Discov Today Technol 15:33–40

  19. Wierling C, Kühn A, Hache H et al (2012) Prediction in the face of uncertainty: a Monte Carlo-based approach for systems biology of cancer treatment. Mutat Res 746(2):163–170

  20. Fröhlich F, Kessler T, Weindl D et al (2018) Efficient Parameter Estimation Enables the Prediction of Drug Response Using a Mechanistic Pan-Cancer Pathway Model. Cell Syst 7(6):567–579.e566

  21. Röhr C, Kerick M, Fischer A et al (2013) High-throughput miRNA and mRNA sequencing of paired colorectal normal, tumor and metastasis tissues and bioinformatic modeling of miRNA-1 therapeutic applications. PLoS ONE 8(7):e67461

  22. Clegg L, Gabhann MF (2015) Molecular mechanism matters: Benefits of mechanistic computational models for drug development. Pharmacol Res 99:149–154

  23. Baker R, Peña J, Jayamohan J et al (2018) Mechanistic models versus machine learning, a fight worth fighting for the biological community? Biol Lett 14(5):20170660

  24. Lehrach H (2015) Virtual Clinical Trials, an Essential Step in Increasing the Effectiveness of the Drug Development Process. Public Health Genomics 18(6):366–371

  25. Haque A, Engel J, Teichmann SA et al (2017) A practical guide to single-cell RNA sequencing for biomedical research and clinical applications. Genome Med 9:75

  26. Lee JH, Daugharthy ER, Scheiman J et al (2014) Highly multiplexed subcellular RNA sequencing in situ. Science 343(6177):1360–1363


The authors would like to thank Romy Kümpfel as well as the wider Alacris Theranostics team for help in preparing the manuscript. This work was supported by the German Federal Ministry of Education and Research (BMBF) under grant number 031A512C (Treat20plus).


Open access funding provided by Max Planck Society.

Author information



Corresponding author

Correspondence to Marie-Laure Yaspo.

Ethics declarations

Conflict of interest

S. Schade, L.A. Ogilvie, T. Kessler, M. Schütte, C. Wierling, B.M. Lange and M.-L. Yaspo are employees of Alacris Theranostics GmbH, as noted in the affiliations. H. Lehrach is a board member of Alacris Theranostics GmbH.

For this article no studies with human participants or animals were performed by any of the authors. All studies performed were in accordance with the ethical standards indicated in each case.

Additional information

The German language version of the article is published in Der Onkologe 10/19, https://doi.org/10.1007/s00761-019-00652-1.

Rights and permissions

Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.


About this article


Cite this article

Schade, S., Ogilvie, L.A., Kessler, T. et al. A data- and model-driven approach for cancer treatment. Onkologe 25, 132–137 (2019). https://doi.org/10.1007/s00761-019-0624-z



  • Precision medicine
  • Biomarkers, Tumor
  • Gene expression profiling
  • Translational medical research
  • Molecular targeted therapy