INTRODUCTION

Differential response to pharmacological treatment constitutes a major source of patient morbidity and mortality. Between 5 and 13% of in- and outpatients experience adverse drug-related events, mostly adverse drug reactions (ADRs) and sub-therapeutic effects of drug therapy [1, 2]. Various patient-specific factors, including age, polypharmacy, concomitant diseases, and diet, as well as heritable factors contribute to these inter-individual differences, with genetic polymorphisms explaining around 20–30% of the inter-individual variability in drug response [3, 4].

The liver as the central organ of drug metabolism is involved in the clearance of around 70% of drugs [5]. Enzymes encoded by the cytochrome P450 (CYP) superfamily of genes are responsible for > 75% of phase 1 drug metabolism and thus constitute major modulators of drug response [6]. Importantly, CYP genes are highly polymorphic between individuals and across populations, which can have important implications for the bioactivation and/or detoxification of medications [7].

Pharmacogenomic biomarkers that can predict drug response have been attributed great promise for the improvement of molecular diagnostics in routine clinical care. It is helpful to distinguish between (i) germline biomarkers, which can influence systemic drug pharmacokinetics and pharmacodynamics and (ii) biomarkers in the somatic cancer genome, which modulate how cancer cells respond to drugs. Besides genetic factors, epigenetic modifications of DNA or histones have been linked to differences in drug response. In oncology, epigenetic alterations in cancer cells have been linked to increased expression of drug efflux transporters, mediating resistance to chemotherapy. Detection of epigenetically modified DNA in the blood stream can be used for tumor stratification and presents an emerging tool for monitoring treatment efficacy as well as development of drug resistance [8, 9]. Moreover, pharmacological modulators of the epigenetic machinery have been successfully used in oncological treatment, mostly as adjuvants to sensitize tumors to standard-of-care chemotherapy. For a comprehensive update of this field, we refer to recent reviews [10, 11].

In this review, we provide an overview of genomic biomarkers that predict drug response and guide choice and dosage of drug treatment. Furthermore, we review recent technological advances that facilitate biomarker discovery and utilization.

DEVELOPMENTAL BACKGROUND FOR THE GENETIC POLYMORPHISM

The establishment of genetic polymorphisms in a specific population is the result of selection and genetic drift. In the case of genes encoding drug absorption, distribution, metabolism, and excretion (ADME), a major selection occurred about 450 million years ago when animals became terrestrial and had to adapt to a novel dietary environment that included plant toxins [12]. Among insects, several examples are evident in which genetic alterations of CYP genes have enabled the animals to adapt to a new environment caused by the presence of insecticides or a shift of host from one plant to another (Table 1). In humans, a similar adaptation occurred 10,000 to 5000 years ago when strong selection favored the survival of individuals carrying multiple copies of the CYP2D6 gene in Northeast Africa. The establishment of these duplication alleles is believed to result from the increased capability for detoxification of plant alkaloids, which allowed a more diversified plant diet during periods of starvation in these areas. The subsequent migration of the respective populations has resulted in an infiltration of these alleles into the Mediterranean area but not into Asia, South Africa, or West Africa [19]. Positive selection for gain-of-function alleles is, however, very rare among ADME genes. By contrast, loss-of-function mutations in ADME genes are often not selected against because these genes play a relatively minor role in endogenous processes, thus constituting the basis for inter-individual differences in drug pharmacokinetics.

Table 1 Examples of Environmentally Induced Genetic Selection of CYP Gene Amplifications in Animals and Humans

PHARMACOGENOMIC GERMLINE BIOMARKERS

Differences in the response to exogenous substances were described more than two millennia ago by the Greek philosopher Pythagoras, who noticed in the sixth century before Christ that individuals responded very differently to the ingestion of fava beans, with some experiencing severe hemolytic anemia [20]. Only in the last decades have technological advances shed light on the molecular bases underlying these differences and revealed the responsible genetic polymorphisms (in the case of hemolytic responses to fava beans, genetic variants in G6PD were found to be responsible for the inter-individual differences in toxicity). By now, a whole arsenal of genetic variants has been identified that mechanistically link alterations in structure or functionality of the gene product to differences in drug response or toxicity. Pharmacogenomic biomarkers are mostly located in genes encoding drug-metabolizing enzymes, transporters, drug targets, or HLA alleles and predict drug efficacy or inform about the risk of developing ADRs (Tables 2 and 3). Furthermore, genetic biomarkers have revolutionized the therapy of cystic fibrosis (CF), and we refer interested readers to recent excellent reviews that cover the field of genetically guided, targeted CF therapy [47, 48].

Table 2 Genetic Germline Variants Associated with Adverse Drug Reactions
Table 3 Genetic Germline Variants that Modulate Drug Efficacy

Genetic variation in genes encoding proteins of importance for drug response and ADRs, herein called pharmacogenes, can cause (i) too high or too low exposure of the drug, (ii) increased formation of toxic metabolites, (iii) increased or decreased interactions with the drug target, or (iv) activation of the immune system which in turn can lead to idiosyncratic drug toxicity (Fig. 1). In the following, we highlight selected pharmacogenomic examples that impact clinical practice and refer to the primary references provided in Tables 2 and 3 as well as to comprehensive recent reviews for further information on the topic [49,50,51].

Fig. 1

Possible effects of genetic variations in pharmacogenes. Mutations in ADME genes can impact drug exposure. a In case of a pharmacologically active parent compound (red hexagons), increased functionality alleles can result in decreased drug exposure and reduced efficacy due to increased inactivation to inactive metabolites (black hexagons). One clinically relevant example is the increased metabolic inactivation of omeprazole in patients with the increased functionality allele CYP2C19*17 during Helicobacter pylori eradication therapy. In contrast, alleles with reduced functionality result in lower clearance and higher exposure, as exemplified by the increased number of bleeding complications due to standard warfarin doses in patients with deficient CYP2C9 enzyme. b Conversely, when the administered compound is a prodrug (black circles), alleles with increased functionality result in elevated formation of active metabolites (red circles), higher exposure, and potentially toxicity. One prominent case of this class is morphine intoxication upon codeine treatment in patients with multiplicated active alleles of CYP2D6. Reduced metabolic activation of a prodrug generally results in lack of efficacy. Examples are the lack of analgesia due to codeine treatment in CYP2D6 poor metabolizers or treatment failure of clopidogrel in patients with reduced CYP2C19 activity. c Multiple drugs only bind to specific variant forms of the respective target protein, particularly in oncology (see Table 5). d During oncological treatment, drug targets frequently mutate, entailing a lack of response to formerly very potent drugs. e Some drugs, such as abacavir and flucloxacillin, elicit immune-mediated toxicity reactions by binding to specific variants of the major histocompatibility complex either directly or conjugated to a protein carrier as a hapten, which in turn facilitates activation of T cells and triggers an immune response

CLINICALLY IMPORTANT EXAMPLES OF ASSOCIATIONS BETWEEN GENETIC VARIANTS AND DRUG RESPONSE OR TOXICITY

The Effect of CYP2D6 Genotype on Codeine Efficacy and Toxicity

Codeine, an analgesic and antitussive opium alkaloid, is O-demethylated by CYP2D6 to its active metabolite morphine, and CYP2D6 activity constitutes the determining factor for codeine pharmacokinetics. Patients homozygous for loss-of-function haplotypes in CYP2D6, including the *4, *5, and *6 alleles, experience drastically reduced morphine formation and lack of analgesia. Consequently, in such poor metabolizers (PM), alternative medications that are not metabolized by CYP2D6 should be considered, such as buprenorphine, morphine, fentanyl, methadone, or non-opioid analgesics. In contrast, in ultrarapid metabolizers (UM), in whom the active CYP2D6 gene is duplicated, morphine formation is increased and standard codeine doses can result in serum morphine levels that substantially exceed the therapeutic range, resulting in severe toxicity. The risk is highest in pediatric patients who receive codeine following adenotonsillectomy, and multiple cases of life-threatening respiratory depression or death due to codeine therapy have been reported [52]. These severe ADRs prompted the FDA to require boxed warnings on all codeine-containing medications to highlight the risks for pediatric patients. Furthermore, these cases resulted in a change in routine clinical practice for pain control after tonsillectomy, away from codeine towards other analgesic agents that do not carry a comparable risk of catastrophic events (e.g., acetaminophen or rofecoxib and hydrocodone) [53].
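The genotype-to-phenotype logic described above can be sketched in a few lines of Python. This is a deliberately simplified illustration, not a clinical algorithm: it only knows the three loss-of-function alleles named in the text, it ignores reduced-function alleles, and the "xN" suffix used to mark a duplicated allele (e.g., "*1x2") as well as the function names are assumptions made for this example.

```python
# Loss-of-function alleles named in the text; real allele tables are much longer.
LOSS_OF_FUNCTION = {"*4", "*5", "*6"}


def functional_copies(allele: str) -> int:
    """Count active gene copies contributed by one allele.

    Assumed notation: "*1" is a single copy, "*1x2" a duplicated allele.
    """
    base, _, multiplier = allele.partition("x")
    copies = int(multiplier) if multiplier else 1
    return 0 if base in LOSS_OF_FUNCTION else copies


def cyp2d6_phenotype(allele_a: str, allele_b: str) -> str:
    """Classify a diplotype using only the two rules stated in the text."""
    active = functional_copies(allele_a) + functional_copies(allele_b)
    if active == 0:
        return "poor metabolizer"        # homozygous for loss-of-function alleles
    if active > 2:
        return "ultrarapid metabolizer"  # duplicated active gene
    return "other"                       # not resolved by this simplified sketch


print(cyp2d6_phenotype("*4", "*5"))    # poor metabolizer
print(cyp2d6_phenotype("*1x2", "*1"))  # ultrarapid metabolizer
```

Counting active copies rather than checking for a duplication flag alone avoids labeling a carrier of one duplicated active allele and one loss-of-function allele as an ultrarapid metabolizer.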

Warfarin Pharmacogenetics

Warfarin is the most commonly used oral anticoagulant for the treatment and prevention of thromboembolic events. However, a narrow therapeutic window combined with substantial inter-individual variation in warfarin pharmacokinetics and pharmacodynamics poses severe clinical challenges. Warfarin inhibits the VKORC1 subunit of the vitamin K epoxide reductase complex, thereby disrupting the formation of the vitamin K-dependent clotting factors. Warfarin is a racemic mixture of R- and S-enantiomers, with the latter being around 5 times more potent. S-warfarin is inactivated by CYP2C9 and eliminated predominantly via the urine.

Genetic variants in VKORC1 and CYP2C9 have been reproducibly linked to differences in warfarin dose requirements. Reduced functionality polymorphisms in VKORC1 (mostly VKORC1*2) and CYP2C9 (particularly CYP2C9*2 and *3) have been associated with lower (1–2 mg/day reduction per allele) warfarin dose requirements [54,55,56,57]. In addition, the reduced functionality variant rs2108622 in the CYP4F2 gene whose gene product metabolizes vitamin K shows an additional minor contribution [43].

While the molecular mechanisms behind warfarin pharmacogenetics have been extensively analyzed, the advantages of preemptive genetic testing remain unclear and randomized, multi-center, controlled trials reported discrepant results. While the CoumaGen-II and EU-PACT trials indicated significant improvements in the percentage of time within the therapeutic international normalized ratio (INR) range and time to reach therapeutic INR with genotype-guided dosing [58, 59], the COAG trial did not show significant differences in the time within the therapeutic range or the incidence of bleeding complications [60]. Likely reasons for the mixed outcomes include differences in reference (usual care in EU-PACT vs. dosing algorithm guided by clinical variables in COAG), the use of loading dose (loading dose given in EU-PACT vs. no loading dose in COAG) or the diversity of the trial population (homogeneous European population in EU-PACT vs. 27% Africans and 6% Hispanics in COAG). Thus, while all trials suggest numerical, but not necessarily significant, benefits of genotype-guided dosing, the clinical utility of preemptive warfarin genotyping appears limited.

The Role of HLA-Alleles in Hypersensitivity to Abacavir

The antiretroviral guanosine analogue abacavir is commonly used for the treatment of HIV infections in adults and pediatric patients older than 3 months. While the drug is generally well tolerated, around 4% of patients experience hypersensitivity syndrome (HSS) that manifests as fever and gastrointestinal and respiratory problems as well as dermatological symptoms that range from rashes to Stevens-Johnson syndrome or toxic epidermal necrolysis [61]. Prospective, randomized clinical trials demonstrated that HSS is strongly associated with the presence of the HLA-B*57:01 allele with a negative predictive value of 100% and a positive predictive value of 47.9% [62]. Mechanistically, HSS is caused by activated abacavir-specific CD8+ T cells that are triggered by the abacavir parent molecule bound to HLA-B*57:01 [63]. This non-covalent binding is highly specific and can be abrogated by a single point mutation of the S116 residue, which results in a lack of T cell activation [64]. Following identification and clinical implementation of this pharmacogenomic biomarker, abacavir prescriptions drastically increased, and indeed this biomarker might be one of the best examples where genotyping for one mutation can completely prevent the occurrence of compound toxicity. Prior to initiating abacavir therapy, screening for the HLA-B*57:01 allele is recommended by the FDA, the Clinical Pharmacogenetics Implementation Consortium (CPIC), and the Dutch Pharmacogenetics Working Group (DPWG), and in case the allele is detected, alternative therapy is mandated [65, 66].
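To make the two predictive values quoted above concrete, the following sketch computes them from a 2x2 table of test result versus outcome. The counts are hypothetical and were chosen only to give values of similar magnitude; they are not data from the cited trial, and the function name is an assumption of this example.

```python
def predictive_values(tp: int, fp: int, fn: int, tn: int) -> tuple[float, float]:
    """Return (PPV, NPV) from a 2x2 table.

    tp: carriers who develop hypersensitivity
    fp: carriers who tolerate the drug
    fn: non-carriers who develop hypersensitivity
    tn: non-carriers who tolerate the drug
    """
    ppv = tp / (tp + fp)  # probability of the reaction given a positive test
    npv = tn / (tn + fn)  # probability of no reaction given a negative test
    return ppv, npv


# Hypothetical counts for illustration only (not trial data).
ppv, npv = predictive_values(tp=48, fp=52, fn=0, tn=900)
print(f"PPV = {ppv:.1%}, NPV = {npv:.1%}")
```

A negative predictive value of 100% corresponds to `fn == 0`: no non-carrier develops the reaction, which is why excluding carriers from treatment can prevent it entirely, even though roughly half of all carriers would have tolerated the drug.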

Associations Between TPMT Genotype and Thiopurine Toxicity

The thiopurine mercaptopurine (6-MP) and its prodrug azathioprine (AZA) are used for the treatment of acute lymphoblastic leukemia (ALL) and are also widely prescribed off-label for their immunosuppressive effects in the treatment of Crohn’s disease and ulcerative colitis. AZA is rapidly metabolized into 6-MP in the liver, which is further either bioactivated by hypoxanthine-guanine-phosphoribosyltransferase (HPRT) to form thioguanine nucleotides or inactivated by either thiopurine-S-methyltransferase (TPMT) or xanthine oxidase (XO) to 6-methylmercaptopurine or thiouric acid, respectively.

Myelosuppression is the most common adverse reaction to thiopurine therapy, and patients with reduced TPMT activity are at substantially increased risk. Importantly, TPMT genotype is a strong predictor for TPMT activity, and even patients heterozygous for the TPMT loss-of-function alleles *2A or *3 showed significantly higher incidences of dose reductions due to toxicity [67, 68]. Due to the substantial body of evidence that links TPMT genotype to thiopurine treatment outcomes and adverse events, TPMT genotyping is already widely applied in clinical practice [69, 70]. However, the evidence on the cost-effectiveness of preemptive TPMT genotyping remains inconclusive [71, 72], and data from randomized controlled trials are currently lacking.

The Role of SLCO1B1 Variants in Simvastatin-Induced Myopathy

Severe toxicity has been observed in patients receiving the blockbuster drug simvastatin for the treatment of dyslipidemia. Simvastatin inhibits 3-hydroxy-3-methylglutaryl coenzyme A (HMG-CoA) reductase, the rate-limiting enzyme of cholesterol biosynthesis. Importantly, the SNP rs4149056 (SLCO1B1*5) in SLCO1B1, the gene encoding the hepatic simvastatin transporter OATP1B1, impairs hepatic import of the drug, which prevents the interaction with its hepatic target HMG-CoA reductase and results in increased plasma concentrations due to impaired hepatic clearance. Patients are at 2.6- or 4.5-fold increased risk per variant allele of developing myopathy when taking normal (40 mg daily) or high doses (80 mg daily) of simvastatin, respectively [31]. Due to these risks, particularly in the high dose group, the FDA issued a warning that high dose regimens of simvastatin should only be used in patients who have already received high doses for more than 12 months without muscle-related adverse effects [73].

DPYD Variants and Fluoropyrimidine Toxicity

Combinatorial therapies that include fluorouracil, such as the chemotherapeutic regimens FOLFOX and FOLFIRI represent the first-line treatment for various solid tumors. Fluorouracil (5FU) and other fluoropyrimidines, such as capecitabine, tegafur, and floxuridine, inhibit thymidylate synthase, which catalyzes the rate-limiting step in deoxythymidine triphosphate (dTTP) biosynthesis, thereby inhibiting DNA replication [74]. Fluoropyrimidines have a narrow therapeutic window and dose-adjustments based on therapeutic drug monitoring (TDM) resulted in increased response-rate and decreased toxicity [75]. Dihydropyrimidine dehydrogenase (DPD), the enzyme encoded by DPYD, inactivates around 80% of 5FU and genetic polymorphisms in DPYD have been consistently linked to inter-individual differences in fluoropyrimidine response and toxicity [25, 76]. Importantly, a recent study in 2038 patients demonstrated that 5FU dosing guided by prospective genotyping for the reduced functionality allele DPYD*2A resulted in significantly lower incidences of severe toxicities (73% in historic controls vs. 28% in genotype-guided cohort) and appeared to reduce costs for the health care system [77]. Thus, implementation of DPYD genotyping for 5FU therapy in routine clinical care might be a promising next step in reducing patient morbidity while at the same time allocating health care resources more efficiently.
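The two incidences of severe toxicity reported above (73% in historic controls vs. 28% in the genotype-guided cohort) can be converted into an absolute and a relative risk reduction with simple arithmetic. The sketch below only restates the numbers from the text; because the comparison is against historic controls rather than a randomized arm, the results should be read as illustrative, and the function name is an assumption of this example.

```python
def risk_reduction(control_risk: float, treated_risk: float) -> tuple[float, float]:
    """Return (absolute risk reduction, relative risk reduction)."""
    arr = control_risk - treated_risk
    rrr = arr / control_risk
    return arr, rrr


# Incidences of severe toxicity as quoted in the text.
arr, rrr = risk_reduction(control_risk=0.73, treated_risk=0.28)
print(f"ARR = {arr:.2f}, RRR = {rrr:.1%}")
```

With these inputs, the absolute reduction is 45 percentage points and the relative reduction is about 62%.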

RARE GENETIC VARIANTS AND POPULATION-SPECIFICITY

Genetic variants can be classified as common (> 1% allele frequency, also called genetic polymorphisms) or rare (< 1% frequency) depending on their prevalence in the overall population. Recent twin studies indicated that the contribution of genetic factors to drug response differs drastically between medications. While genetic factors contributed only to a minor extent to differences in talinolol pharmacokinetics, heritable factors were responsible for 80–90% of the differences in the pharmacokinetics of metoprolol and torsemide; importantly, however, the analyzed common genetic polymorphisms only explained around 40% of this variability [78, 79]. These results indicate that additional factors, such as rare genetic variants, can be important modulators of drug pharmacokinetics. Indeed, recent population-scale sequencing projects revealed that ADME genes harbor vast numbers of rare genetic variants that are not assessed by conventional genotyping arrays [80, 81]. Rare variants are more likely to have deleterious effects, with an estimated odds ratio of 4.2 compared to variants with a minor allele frequency (MAF) > 0.5% [82, 83], and combined are estimated to account for 30–40% of the functional variability in ADME genes [81].

Importantly, variant and haplotype frequencies differ substantially between populations (Table 4). Thus, while a variant may be rare globally, the frequency of a minor allele can be substantial in specific populations. One such example is the reduced functionality allele CYP2C8*2, which is not found in individuals of European or East Asian ancestry but is common in Africans (MAF = 15.9%) [7]. Similarly, the loss-of-function CYP3A4*20 allele, which causes an increased risk of adverse reactions to, e.g., paclitaxel, was not found in Asian, African, South American, and most European populations but reached frequencies of 3.8% in specific regions of Spain [86]. Combined, these findings suggest that ethnic origin is an important parameter in pharmacogenomic research and that understanding the geographical distribution of genetic variability forms the foundation for precision public health approaches.

Table 4 Overview of Important Pharmacogenetic Variant and Allele Frequencies Across Major Human Populations

The treatment with antiretrovirals in Zimbabwe provides an impressive case for the benefit of such approaches: When the national ministry of health implemented a WHO recommendation to change the first-line treatment of HIV to efavirenz, unexpectedly many Zimbabweans experienced ADRs associated with efavirenz overdose. Importantly, in Zimbabwe, 20% of the population are homozygous for the reduced functionality allele CYP2B6*6 which entails that efavirenz plasma concentrations exceed the recommended therapeutic levels, resulting in the local failure of a globally established dosing regimen [87]. Thus, in order to prevent such public health crises, selection of first-line treatment should be evaluated for each population separately, considering the specific genetic landscapes in the geographic region of interest [88, 89].
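The homozygote frequency quoted above can be related to an underlying allele frequency with a short calculation. The sketch assumes Hardy-Weinberg equilibrium, which the text does not state and which may not hold in a real population; the function name is likewise an assumption of this example.

```python
import math


def allele_frequency_from_homozygotes(homozygote_fraction: float) -> tuple[float, float]:
    """Estimate allele frequency q and heterozygote fraction 2pq.

    Assumes Hardy-Weinberg equilibrium, i.e. homozygote fraction = q**2.
    """
    q = math.sqrt(homozygote_fraction)
    p = 1.0 - q
    return q, 2.0 * p * q


# 20% homozygous carriers, as quoted in the text for CYP2B6*6 in Zimbabwe.
q, heterozygotes = allele_frequency_from_homozygotes(0.20)
print(f"allele frequency ~ {q:.3f}, heterozygotes ~ {heterozygotes:.3f}")
```

Under this assumption, an allele frequency of roughly 45% would be needed to produce 20% homozygotes, and about half of the population would carry one copy of the allele.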

RARE VARIANTS AND PRECISION MEDICINE

With decreasing sequencing times and costs, it is envisioned that precision medicine will increasingly utilize next-generation sequencing (NGS) technologies to derive predictions of drug response. Such analyses should be tailored to the drug in question and encompass genes likely to affect its kinetics, response, or risk of adverse reactions. In this concept, a pre-defined panel of genes is sequenced using NGS and genetic variants in the patient of interest are identified (Fig. 2). Analysis of the sequencing results will yield (i) non-sense mutations, such as frameshift, stop-gain, or start-lost variants; (ii) silent (also called synonymous) mutations; and (iii) missense mutations that result in amino acid exchanges. While exceptions to the rule exist, synonymous variants rarely have a functional effect, whereas the vast majority of non-sense variants result in a loss-of-function of the gene product. Missense variants, however, are more heterogeneous: while some variants result in reduced functionality alleles, others do not have any functional effects. Overall, around 70% of genetic variants within coding sequences have no pronounced effect on the functionality of the gene product, whereas 30% of the mutations unveiled by exome sequencing result in reduced function or loss-of-function alleles.
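The three-way triage of sequencing results described above can be written as a small lookup. This is a sketch of the rule of thumb only: the consequence labels follow the wording of the text rather than the vocabulary of any particular annotation tool, the function name is hypothetical, and the exceptions mentioned in the text are not modeled.

```python
# Consequence labels as used in the text; annotation tools use their own terms.
LIKELY_LOSS_OF_FUNCTION = {"frameshift", "stop-gain", "start-lost"}


def triage(consequence: str) -> str:
    """Map a variant consequence to its expected functional interpretation."""
    if consequence in LIKELY_LOSS_OF_FUNCTION:
        return "likely loss of function"
    if consequence == "synonymous":
        return "likely no functional effect"
    if consequence == "missense":
        # Heterogeneous class: needs experimental data or in silico prediction.
        return "uncertain"
    return "unclassified"


for label in ("stop-gain", "synonymous", "missense"):
    print(label, "->", triage(label))
```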

Fig. 2

The use of next-generation sequencing in precision medicine. Genomic DNA (gDNA) is isolated from tissue biopsies (somatic genome) or from blood samples (germline genome) and libraries for downstream sequencing applications are prepared. Optionally, a target capture step can be performed to enrich for genomic intervals of interest and to reduce required sequencing capacity. Subsequent to sequencing, genetic variants in the sample (indicated by red arrow) are identified by comparison to a relevant reference, which can be either the human reference genome or, in the case of tumor biopsies, the germline genome from the same patient. The functional consequence of identified variants can be predicted based on preexisting experimental data or if such information is not available by using computational algorithms
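The comparison step for tumor biopsies, in which the germline genome of the same patient serves as the reference, amounts to a set difference over variant calls. The sketch below shows only that idea with invented coordinates; real pipelines additionally handle sequencing depth, allele fractions, and calling errors, and all names here are hypothetical.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str


def somatic_variants(tumor: list[Variant], germline: list[Variant]) -> list[Variant]:
    """Return variants called in the tumor sample but absent from the germline."""
    return sorted(set(tumor) - set(germline))


# Hypothetical calls; coordinates are invented for illustration only.
germline_calls = [Variant("chr7", 1000, "A", "G")]
tumor_calls = [Variant("chr7", 1000, "A", "G"), Variant("chr7", 2500, "T", "G")]
print(somatic_variants(tumor_calls, germline_calls))
```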

Due to the vast number of rare genetic variants, it is not feasible to experimentally characterize the functional effects of all such mutations, thus posing a significant challenge for the clinical interpretation of genetic variability and hampering the translation of genomic data into actionable advice. In the absence of experimental data, in silico algorithms can provide some guidance regarding the predicted functional consequences of the genetic variant [90]. Ideally, results are integrated from multiple computational methods that base their conclusions on diverse and complementary sets of criteria, including evolutionary conservation, physicochemical properties, secondary structure, variant effects on protein stability, and protein domain information [91]. The predictive power of algorithms to detect functional alterations in the gene product is, however, relatively low, particularly in ADME genes. Current computational tools have been trained on disease-causing genetic variants and use evolutionary constraints as the main parameter to predict functional effects of the mutations in question. Such an approach poses problems for the assessment of pharmacogenetic variants, as ADME genes are often only poorly conserved. Consequently, while computational tools correctly classify disease-causing variants with accuracies between 70 and 90% [92,93,94,95], a comprehensive assessment of their performance for ADME missense variants revealed substantially lower predictive accuracies. ADME-specific optimization of computational prediction models is thus necessary. Such optimization would be an important step towards the rapid translation of exome sequencing data into a compendium of functionally altered genes of relevance for the specific drug therapy in question for each patient, adding relevant information for pharmacogenetically guided drug therapy.
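One simple way to integrate the output of several prediction algorithms, as suggested above, is a majority vote over their binary calls. The sketch below is one possible scheme and not the method of any cited study; the tool names, the threshold, and the function name are placeholders introduced for illustration.

```python
def consensus_call(predictions: dict[str, bool], min_fraction: float = 0.5) -> str:
    """Combine per-tool calls (True = predicted deleterious) by majority vote."""
    if not predictions:
        return "no prediction"
    fraction = sum(predictions.values()) / len(predictions)
    return "likely deleterious" if fraction > min_fraction else "likely neutral"


# Placeholder tool names; each stands for a method using different criteria.
calls = {"conservation_tool": True, "stability_tool": True, "domain_tool": False}
print(consensus_call(calls))
```

A plain vote treats all tools as equally reliable. Given the lower accuracy reported for ADME genes, weighting or retraining on pharmacogene-specific data may be preferable, which is in line with the ADME-specific optimization called for in the text.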

There is evidently a need to improve both in silico and experimental methods for functional prediction of missense mutations. However, already today NGS-based approaches provide more accurate and more individualized information for pharmacogenomic predictions of drug action than the current array-based techniques that focus solely on common genetic variants. To translate this advantage into clinically actionable information and to fully harness the added value of clinical NGS, overcoming the indicated limitations constitutes one of the most important frontiers of future pharmacogenomic research.

GENETIC BIOMARKERS IN THE SOMATIC CANCER GENOME

Currently, cancer affects around 90 million individuals and causes nearly 1 in 6 deaths worldwide [96]. Underlying the formation of neoplasms is the accumulation of somatic mutations that activate oncogenes and inactivate tumor suppressors. Every tumor harbors a unique combination of acquired genetic variants, and cancer genomics, i.e., the analysis of genetic differences between tumor and non-tumor cells, aims to unveil the genetic basis that gives cancer cells their proliferative capacity and their ability to escape apoptosis. By revealing these molecular underpinnings and identifying clinically actionable variants that can be targeted by approved drugs, this approach allows therapy to be tailored to the specific tumor, opening new avenues for personalized oncology (Table 5).

Table 5 Approved Targeted Cancer Drugs

To date, the most commonly identified oncogenic variants affect signal transduction systems, cell cycle genes, metabolic enzymes, the epigenetic machinery, or factors involved in transcription, splicing, or translation (Fig. 3). Prominent examples of such mutations result in the constitutive activation of growth factor signaling. Approved targeted therapies are available for variants in receptor tyrosine kinases, such as EGFR (also termed HER1), ERBB2 (also termed HER2), PDGFRA, KIT, ALK, and JAK2 that are commonly mutated in various cancers. Furthermore, targeting activating mutations, amplifications, or gene fusion events of FGFRs represents promising therapeutic opportunities for various solid tumors with multiple clinical trials currently ongoing [104].

Fig. 3

Molecular pathways that commonly harbor oncogenic mutations. a Signal transduction cascades, such as the PI3K-AKT and Ras-ERK pathways, are commonly mutated in various cancers resulting in constitutive activation of signaling leading to cell survival and proliferation. b Components of the epigenetic machinery, such as DNA methyltransferases (DNMTs), histone modifying enzymes (DOT1L, MLLs, KDMs), and nucleosome remodelers are inactivated in up to 50% of ovarian or hepatocellular carcinomas [97, 98]. Interestingly, some cancers are also characterized by increased functionality of epigenetic enzymes, as exemplified by amplifications or increased catalytic activity mutations in the histone methyltransferase EZH2 in breast cancer and B cell lymphoma [99, 100]. c Similarly, multiple cancer types exhibit loss of cell cycle regulators or overexpression of cyclins or cyclin-dependent kinases (CDKs) [101, 102]. d Tumors are characterized by extensive metabolic remodeling. For instance, the R132H mutation in IDH1 modulates enzyme specificity and results in production of the oncometabolite 2-hydroxyglutarate, which inhibits DNA and histone modifying enzymes [103]. Oncogenes are indicated in red, tumor suppressors in green. RTK = receptor tyrosine kinase

Depending on the cancer type and the nature of these mutations, the use of targeted therapies is indicated that interfere with specific mutated gene products found exclusively in cancer cells. For instance, treatment with the EGFR inhibitors afatinib, erlotinib, and gefitinib improves progression-free survival of non-small cell lung cancer (NSCLC) patients who harbor deletions in exon 19 of EGFR or EGFR L858R substitution mutations [105,106,107]. In contrast, EGFR inhibition did not result in improved clinical outcomes in glioblastoma patients compared to conventional chemo- and radiation therapy [108]. However, even in patients who are initially responsive to targeted therapy, drug resistance can arise, most commonly due to the acquisition of additional mutations. In NSCLC patients, the EGFR T790M variant decreases the affinity of tyrosine kinase inhibitors (TKIs) for the ATP binding pocket of EGFR and represents the most common mechanism of EGFR inhibitor resistance [109]. To counter this acquired drug resistance, the FDA approved osimertinib, which demonstrated significantly increased efficacy in T790M-positive NSCLC patients compared to conventional platinum-based therapy [110].

In addition to point mutations, cancer cells often undergo genomic rearrangements that can result in the deletion of entire exons or the formation of functional fusion proteins. Chronic myeloid leukemia (CML) is characterized by a specific genomic translocation event between chromosomes 9 and 22 that gives rise to a functional BCR-ABL1 fusion protein that exhibits constitutive kinase activity. The TKI imatinib inhibits the phosphorylation of downstream targets of BCR-ABL1 and in addition blocks various other kinases, such as PDGFRA and KIT. While response rates to imatinib are very high (hematologic remission in 97% of CML patients), 51–88% of late stage patients developed imatinib resistance [111, 112]. Mechanisms of imatinib resistance include point mutations in BCR-ABL1, amplifications of the chimeric gene as well as BCR-ABL1-independent mechanisms, such as overexpression of efflux transporters or downregulation of the imatinib importer OCT1 [113]. By now, a variety of therapeutic options is available for the treatment of imatinib-resistant CML. Nilotinib and dasatinib are effective against most imatinib-resistant point mutants with the exception of cells with the T315I mutation [114]. For BCR-ABL1 T315I-positive CML, the recently approved TKI ponatinib (full FDA approval in 2016) demonstrated a major cytogenetic response in 56% of patients irrespective of BCR-ABL1 mutation status and thus significantly improves clinical outcomes for the respective patients [115].

The examples provided above give an impression of the complexity of genetic variability in cancer cells. Due to increasing throughput and decreasing costs of sequencing, genetic information of primary cancers as well as metastases becomes progressively more available. This massive amount of data can be accessed at central data hubs, such as the Genomic Data Commons (GDC; https://gdc.cancer.gov/) provided by the National Cancer Institute that currently provides genomic information of 14,551 cases and the Catalog Of Somatic Mutations In Cancer (COSMIC; http://cancer.sanger.ac.uk/cosmic) hosted by the Sanger Institute, which constitutes the largest database of somatic cancer mutations. However, the translation of this unveiled landscape of oncogenetic variability into clinical advice remains difficult despite the multitude of computational tools that assist in detection and interpretation of cancer genome alterations [116]. There is thus an urgent need for methods that support the identification of causative mutations that drive tumorigenesis and select alterations that may be therapeutically actionable. The recently launched Cancer Genome Interpreter (http://www.cancergenomeinterpreter.org) provides a versatile tool to estimate the biological significance of observed mutations and to predict their clinical relevance. However, prospective randomized trials of sufficient scale that demonstrate a clinical benefit of genomic profiling-guided off-label use for advanced cancer patients are still lacking [117].

The rapid progress in method development has enabled the screening of hundreds of genes in the somatic and germline genome. Many platforms are by now commercially available and are increasingly used in clinical trials in which genomic DNA of different tumors is comprehensively analyzed using NGS for associations between genetic variability and therapeutic success of the anticancer therapy in question. Such an approach is mainly warranted for drugs whose clinical success depends on the presence of mutations in genes encoding signal transducers and modulators, such as new antibodies that interfere with receptor-mediated signal transduction. We thus anticipate that pharmacogenetically guided anticancer therapy will increasingly utilize biomarkers consisting of a set of mutations in critical genes.

EMERGING TECHNOLOGIES FACILITATING BIOMARKER DISCOVERY

The knowledge we have gained about pharmacogenomic biomarkers, particularly regarding the importance of rare and population-specific variants, can be attributed to the increase in speed and accuracy of NGS technology, combined with decreasing prices. However, certain challenges of short-read sequencing remain; in particular, the mapping of structural variants, copy number variations (CNVs), and large (> 1 kb) repetitive elements is still problematic [118]. Applying standard filters to variant calls from short-read sequencing removes calls in low complexity regions, segmental duplications, and variable number tandem repeats. As a consequence, the pharmacogenetic variability in important genes, such as CYP2A6, CYP2B6, CYP2D6, CYP3A4, GSTM1, HLA-B, UGT2B15, and UGT2B17, cannot be interrogated by standard paired-end 150 bp sequencing (Fig. 4). Similarly, a substantial proportion of variants cannot be called with high confidence for genes containing repeats larger than 1 kilobase (kb), including ABCB1, SLC19A1, and SLC22A1.

Fig. 4

Fraction of genomic intervals of major ADME genes in complex regions of the genome. The barplots illustrate the fraction of variants (blue) and base pairs (red) that lie in low complexity regions of the genome or span segmental duplications or variable number tandem repeats. Importantly, these regions cannot be reliably interrogated by conventional short-read sequencing techniques.
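The two quantities described for Fig. 4 (fraction of base pairs and fraction of variants in complex regions) can be understood as simple interval overlaps. The following is a minimal sketch under stated assumptions, not the pipeline used to generate the figure: intervals are half-open coordinates on a single chromosome, and the function names `merge_intervals`, `masked_fraction`, and `variant_fraction` are hypothetical.

```python
from typing import List, Tuple

# Half-open [start, end) coordinates on one chromosome (an assumption of this sketch).
Interval = Tuple[int, int]


def merge_intervals(intervals: List[Interval]) -> List[Interval]:
    """Merge overlapping or touching intervals so that no base is counted twice."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def masked_fraction(gene: Interval, complex_regions: List[Interval]) -> float:
    """Fraction of the gene's base pairs that lie in any complex region."""
    gene_start, gene_end = gene
    covered = 0
    for start, end in merge_intervals(complex_regions):
        overlap = min(end, gene_end) - max(start, gene_start)
        if overlap > 0:
            covered += overlap
    return covered / (gene_end - gene_start)


def variant_fraction(positions: List[int], complex_regions: List[Interval]) -> float:
    """Fraction of variant positions that lie in any complex region."""
    merged = merge_intervals(complex_regions)
    hits = sum(any(start <= pos < end for start, end in merged) for pos in positions)
    return hits / len(positions)


# Toy coordinates for illustration only; they do not describe a real gene.
gene = (1000, 2000)
regions = [(900, 1100), (1050, 1200), (1900, 2500)]
base_pair_fraction = masked_fraction(gene, regions)
called_variant_fraction = variant_fraction([1010, 1500, 1950, 1999], regions)
```

In practice, the complex regions would come from annotation tracks for low complexity sequence, segmental duplications, and tandem repeats; the toy coordinates above merely stand in for such tracks.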

Long-read sequencing technologies aspire to enhance the recovery of reads that cannot be unambiguously mapped by short-read sequencing. In recent years, multiple long-read sequencing approaches have been presented (Fig. 5). Pacific Biosciences (PacBio) offers platforms for single-molecule real-time (SMRT) sequencing that have been successfully applied to medical genotyping as well as to the sequencing of human genomes [119,120,121]. The long reads allow for accurate variant calling as well as phasing of multiple heterozygous variants whose genomic location might be several kilobases apart. As such, SMRT provides an excellent technology for the sequencing of complex CYP loci and, using CYP2D6 as an example, has been demonstrated to allow the simultaneous detection of SNVs and CNVs in multiplexed samples [122, 123]. In addition to genomic sequencing, SMRT allows direct decoding of epigenetic marks [124].
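To illustrate why read length matters for phasing, the sketch below classifies two heterozygous variants as lying on the same haplotype (cis) or on opposite haplotypes (trans) using only reads that span both positions. It is a simplified illustration that assumes biallelic sites and error-free allele observations; the function name `phase_pair` and the read representation are hypothetical and do not correspond to any specific vendor software.

```python
from collections import Counter
from typing import Dict, List

# Each read is represented as {variant_position: observed_allele}.
Read = Dict[int, str]


def phase_pair(reads: List[Read], pos_a: int, pos_b: int,
               alt_a: str, alt_b: str) -> str:
    """Classify two heterozygous variants as 'cis', 'trans', or 'unresolved'."""
    support: Counter = Counter()
    for read in reads:
        # Only reads covering both positions carry phase information.
        if pos_a in read and pos_b in read:
            has_alt_a = read[pos_a] == alt_a
            has_alt_b = read[pos_b] == alt_b
            # Both alternative or both reference alleles on one read: cis.
            support["cis" if has_alt_a == has_alt_b else "trans"] += 1
    if support["cis"] == support["trans"]:
        return "unresolved"
    return "cis" if support["cis"] > support["trans"] else "trans"


# Hypothetical variants 4 kb apart: short reads cover one site each,
# whereas long reads span both sites.
short_reads = [{100: "T"}, {4100: "G"}, {100: "C"}]
long_reads = [{100: "T", 4100: "G"}, {100: "C", 4100: "A"}, {100: "T", 4100: "G"}]

short_read_call = phase_pair(short_reads, 100, 4100, alt_a="T", alt_b="G")
long_read_call = phase_pair(long_reads, 100, 4100, alt_a="T", alt_b="G")
```

With the toy data, the short reads leave the phase unresolved because no read covers both sites, while the long reads support a cis configuration.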

Fig. 5

Schematic depiction of next-generation sequencing paradigms. a Short-read sequencing requires the mechanical fragmentation of DNA into small 600–800-bp-long molecules, which are then sequenced from one or both ends and mapped to a reference genome using standard bioinformatics tools. b Pacific Biosciences (PacBio) and Oxford Nanopore Technologies allow sequencing of longer molecules, which can ameliorate the obstacles of mapping short reads in repetitive or low complexity regions. c Synthetic long-read methods have been developed, which use partitioning and barcoding of longer DNA molecules before standard library preparation. This allows the assembly of short reads into longer fragments that can subsequently be mapped and scaffolded with efficiencies similar to those of physical long-read sequencing techniques.

Nanopore sequencing developed by Oxford Nanopore Technologies offers an alternative to the PacBio platform. Recent progress towards higher throughput, including whole genome sequencing (WGS), as well as detection of DNA methylation, also makes it well suited for biomarker discovery in complex regions of the genome [125, 126]. Furthermore, long-read sequencing combined with target capture methods based on the hybridization of biotinylated baits offers the possibility to focus on specific genomic regions of interest [127, 128]. A recent elegant approach demonstrated the utility of direct selection of DNA fragments in real-time by dynamic time warping and matching reads to the reference genome [129]. Thus, its portability, flexibility, and speed in data production make nanopore sequencing suitable for real-time applications, including direct point-of-care pharmacogenomic testing.
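The principle of dynamic time warping mentioned above can be sketched in a few lines. The example computes the classic warping distance between a query signal and a reference signal and keeps a read only if the distance falls below a threshold. This is a generic textbook formulation under simplifying assumptions (one-dimensional signals, absolute difference as local cost, no normalization), not the implementation of the cited approach [129]; the names `dtw_distance` and `keep_read` and the threshold value are hypothetical.

```python
from typing import Sequence


def dtw_distance(query: Sequence[float], reference: Sequence[float]) -> float:
    """Classic dynamic time warping distance with absolute difference as local cost."""
    n, m = len(query), len(reference)
    inf = float("inf")
    # cost[i][j]: minimal cumulative cost of aligning query[:i] with reference[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(query[i - 1] - reference[j - 1])
            cost[i][j] = step + min(cost[i - 1][j],      # query point repeated
                                    cost[i][j - 1],      # reference point repeated
                                    cost[i - 1][j - 1])  # one-to-one match
    return cost[n][m]


def keep_read(signal: Sequence[float], target: Sequence[float],
              threshold: float) -> bool:
    """Decide whether a read's signal matches the target region closely enough."""
    return dtw_distance(signal, target) <= threshold


# Toy signals: the first reference is a time-stretched copy of the query.
stretched = dtw_distance([1, 2, 3], [1, 2, 2, 3])
shifted = dtw_distance([1, 2, 3], [2, 3, 4])
```

Because warping tolerates local stretching of the signal, the time-stretched copy aligns at zero cost, whereas the shifted signal does not; this tolerance to variable translocation speed is what makes the technique attractive for matching raw nanopore signals.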

Besides long-read sequencing, various approaches to generate synthetic long reads have been presented. The main advantage of synthetic methods is that they can leverage the low cost and high accuracy of short-read sequencing. Illumina’s TruSeq Synthetic Long-Read technology, previously referred to as Moleculo, is based on fragmenting genomic DNA into fragments of approximately 10 kb, followed by their clonal amplification, shearing, and indexing with a unique barcode. Similarly, contiguity preserving transposase sequencing from Illumina provides an in vitro means of generating libraries comprised of thousands of indexed pools, each containing thousands of sparsely sequenced long fragments, ranging from 5 kb up to 1 megabase [130]. The Chromium platform (10× Genomics) provides synthetic long reads by partitioning and barcoding the genome, followed by sequencing on any NGS platform. The barcoded linked reads can be aligned using “read clouds,” thereby overcoming the complexities of mapping reads in repetitive regions of the genome. All linked reads for a single barcode are aligned simultaneously, with the prior knowledge that the reads arise from a small number of long (10–200 kb) molecules [131].
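The read cloud idea can be illustrated with a short sketch that groups mapped short reads by barcode and infers the span of each underlying long molecule. This is a deliberately simplified illustration, assuming reads on a single chromosome and a fixed gap cutoff; actual linked-read aligners use more elaborate models, and the function name `build_read_clouds` and the `max_gap` parameter are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# A linked read is represented as (barcode, mapped_position).
LinkedRead = Tuple[str, int]


def build_read_clouds(reads: List[LinkedRead],
                      max_gap: int) -> Dict[str, List[Tuple[int, int]]]:
    """Group reads by barcode and split each group into clouds at large gaps.

    Each returned (start, end) pair approximates the span of one long input
    molecule that carried the barcode.
    """
    by_barcode: Dict[str, List[int]] = defaultdict(list)
    for barcode, position in reads:
        by_barcode[barcode].append(position)

    clouds: Dict[str, List[Tuple[int, int]]] = {}
    for barcode, positions in by_barcode.items():
        positions.sort()
        groups = [[positions[0]]]
        for position in positions[1:]:
            if position - groups[-1][-1] <= max_gap:
                groups[-1].append(position)  # same molecule
            else:
                groups.append([position])    # gap too large: new molecule
        clouds[barcode] = [(group[0], group[-1]) for group in groups]
    return clouds


# Toy data: barcode "AAC" tags two molecules, barcode "GGT" tags one.
toy_reads = [("AAC", 1000), ("AAC", 5000), ("AAC", 250000),
             ("GGT", 40000), ("AAC", 9000)]
toy_clouds = build_read_clouds(toy_reads, max_gap=50000)
```

Reads that are individually ambiguous in a repetitive region can then be placed by requiring consistency with the other reads of the same cloud.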

In summary, “real” as well as synthetic long-range sequencing represent promising emerging technologies that allow the phasing of variants, which can refine pharmacogenetic genotype calls and thus improve the phenotypic prediction regarding drug response.

CLINICAL IMPLEMENTATION OF PHARMACOGENOMICS

The clinical implementation of pharmacogenomic biomarkers is increasing and information about the importance of genetic variation has been included in the labels of 190 and 155 drugs approved by the US Food and Drug Administration (FDA) and the European Medicines Agency (EMA), respectively (https://www.fda.gov/Drugs/ScienceResearch/ResearchAreas/Pharmacogenetics/ucm083378.htm) [132].

A central question is to what extent the preemptive use of pharmacogenomic biomarkers results in increased benefits for patients and society. So far, results of prospective randomized trials have been presented only for a very limited number of drugs [133]. In Europe, a large prospective trial called PREPARE (PREemptive pharmacogenomic testing for Preventing Adverse drug REactions) has been initiated by the EU-financed Ubiquitous Pharmacogenomics project (http://upgx.eu/) that aims to implement and evaluate the impact of pharmacogenomic testing on therapeutic outcomes in seven European clinical centers [134]. In total, 8100 patients will be enrolled and 40 clinically relevant PGx markers across 13 important pharmacogenes will be analyzed. In one arm of the trial, patients will receive treatment based on standard physiological and clinical parameters, whereas patients in the other arm will receive pharmacogenetically guided therapy. Outcomes of this interesting trial are expected in 2020.

In the USA, the NIH-funded eMERGE project has entered the final stage, which aims at analyzing the importance of rare genetic variants on patient phenotypes, developing technical and regulatory solutions to integrate genomic information into Electronic Health Records (EHR), assessing physician and patient attitudes towards the value of pharmacogenomic data, developing educational programs, and increasing the knowledge and awareness of clinically significant genetic variants. Additional programs conducted in the USA have been reviewed recently [135].

POINTS TO CONSIDER FOR STUDIES OF CLINICAL PHARMACOGENOMICS

Importantly, certain pitfalls should be considered when evaluating the clinical importance of pharmacogenomic associations. The problems include the analysis of populations that are heterogeneous regarding ethnicity or disease classification, the inappropriate pooling of data derived from non-compatible studies, the use of inappropriate methods for isolation or sequencing of genomic DNA, the use of somatic DNA instead of germline DNA or vice versa, conclusions drawn from inaccurate proxy polymorphisms, and erroneous haplotype identification based on a set of genetic variants. Furthermore, the choice of genotyping methodology, including appropriate selection of interrogated SNPs or genomic intervals, as well as an assessment of the analytical validity of the chosen method constitute important aspects of the project planning phase. If NGS-based approaches are used for genotyping, strategies should be in place to interpret encountered rare genetic variants with unknown functional consequences [136]. To assist the design and interpretation of studies of pharmacogenomic biomarkers, the EMA has released a draft guideline (http://www.ema.europa.eu/docs/en_GB/document_library/Scientific_guideline/2016/05/WC500205758.pdf).

CONCLUSIONS

Pharmacogenomic information provides an important tool for patient stratification and the selection of optimal drug and dosing regimens, particularly in oncology. However, in other therapeutic areas, the routine use of pharmacogenomic biomarkers in clinical practice is currently sparse, primarily due to the lack of convincing data that show the added value for patients and health care providers. Importantly, the large prospective trials that are currently being conducted in the EU and the USA will shed light on the overall benefits of this technology and provide answers as to how and where the implementation of preemptive pharmacogenomically guided drug treatment should be recommended.