Keywords

1 Introduction

Antibiotic resistance is a naturally occurring phenomenon that can be found in environments containing antibiotic-producing microorganisms, even in the absence of human activity (D’Costa et al. 2006). While antibiotic resistance is rampant in livestock and is a confounding factor in the emergence and spread of resistance into the human population, most research focuses on bacterial pathogens affecting humans and in particular the ESKAPE pathogens (Enterococcus faecium, Staphylococcus aureus, Klebsiella pneumoniae, Acinetobacter baumannii, Pseudomonas aeruginosa, and Enterobacter species) as well as Mycobacterium tuberculosis (Santajit and Indrawattana 2016). Instrumental in the development of resistance is a bacterium’s inherent ability to survive exposure to low antibiotic concentrations giving the population the opportunity to accumulate genomic changes, eventually leading to full resistance (Drlica et al. 2008). In clinical practice, bacteria can frequently encounter significantly lower drug concentrations in host-niches such as the nasopharynx, inner ear, or lungs compared to plasma levels (Rybak 2006). Exposure to subinhibitory concentrations of antibiotics may be capable of reducing bacterial growth rates, but can fail to fully eradicate infections, providing selective pressure for acquired resistance. Outside clinical settings, environments containing antibiotics are plentiful, especially due to the rise in antibiotic usage in humans, agriculture, and veterinary medicine (D’Costa et al. 2006; Watkinson et al. 2007). In such environments, selection for antibiotic resistance is likely and frequent (Gullberg et al. 2011).

There are several mechanisms whereby bacteria can resist antibiotic stress including modification of the antibiotic’s direct target, enzymatic drug inactivation, and reduction of intracellular drug concentrations via efflux pumps (Walsh 2000; McKeegan et al. 2002; Wright 2003) (Fig. 1). Adaptation—the process by which bacteria attain such mechanisms of resistance—can happen through two modes: horizontal and vertical evolution. The horizontal mode of adaptation (horizontal gene transfer; HGT) involves the acquisition of genetic material from organisms that share the same environment, whereas the vertical mode of adaptation involves the acquisition of de novo mutations. Both modes have an important role in shaping the pangenome of bacterial species (Santajit and Indrawattana 2016; Sommer et al. 2017). The use of antibiotics can exert selective pressures that fix horizontally transferred genes or acquired mutations in a population. Examples of HGT include integrons carrying mecA which converts methicillin-sensitive S. aureus (MSSA) to the resistant “superbug” MRSA (Wielders et al. 2002), beta-lactamases in P. aeruginosa, A. baumannii, and various species of Enterobacteriaceae (Weldhagen 2004), and macrolide resistance in Staphylococcus epidermidis (Lampson et al. 1986) and Streptococcus pneumoniae (Chancey et al. 2015). Examples of de novo resistance mutations are plentiful, including mutations in topoisomerase subunits gyrA and parC conferring resistance to fluoroquinolones (Fàbrega et al. 2009) or in different penicillin-binding proteins, which confer resistance to beta-lactams (Murakami et al. 1987; Sauvage et al. 2002; Munita and Arias 2016; Gifford et al. 2018). Moreover, both modes of evolution can be accelerated by antibiotics. On one hand, fluoroquinolones can induce horizontal gene transfer by activating competence in S. pneumoniae (Prudhomme et al. 2006; Slager et al. 2014), while on the other hand, the use of the same class of antibiotics can increase the mutation rate (Lindgren et al. 2003). Importantly, the maintenance of newly acquired resistance in a given population, and its dissemination among species, relies heavily on the associated fitness cost (Melnyk et al. 2015). For instance, the cost of metabolite production in a given reaction may constrain the evolution of antibiotic resistance, highlighting the role of bacterial metabolism and environment on antimicrobial adaptation (Zampieri et al. 2017a). This cost may be different in strains with different genetic backgrounds, suggesting that resistance maintenance depends on the bacterial metabolic cost/status as well as for instance the bacterial transcriptional profile in a particular environment (Cornick and Bentley 2012). These negative fitness costs (i.e., reduced growth or replication rates) suggest that in the absence of the antibiotic pressure the adaptive mutations would disappear from the population, nevertheless adapted populations rarely revert to their wild-type versions, and new mutations can compensate for the fitness cost (Andersson and Hughes 2010).

Fig. 1
figure 1

Action and resistance mechanisms of antibiotics in bacteria. Schematic representation of the most common antibiotic mechanisms of action (a) and resistance (b) found in bacteria. (a) Five major groups of antibiotic classes are represented according to their mechanisms of action: DNA synthesis inhibitors (antibiotics that interfere with DNA replication targeting DNA gyrase and Topoisomerase IV), RNA synthesis inhibitors (antibiotics that interfere with RNA polymerase, thereby inhibiting transcription), Cell wall synthesis inhibitors (antibiotics that interfere in cell wall biosynthesis by inhibition of proteins such as the penicillin-binding protein family), Protein synthesis inhibitors (antibiotics that interact with one of the subunits of the ribosome, thereby interfering with translation), and Cell membrane disruptors (antibiotics that bind or insert into the membrane and cause depolarization). (b) The most common mechanisms of resistance among bacteria include acquisition of mutations in the specific antibiotic targets (represented as a black star), active efflux of antibiotics accomplished by integral membrane transporters (efflux pumps), and biosynthesis of enzymes capable of deactivating antibiotics through chemical modification and/or degradation. Yellow ovals: bacterium; brown oval: DNA gyrase; green circles: RNA polymerase; blue double helix: DNA; purple single helix: mRNA, messenger RNA; brown blocks on mRNA: Ribosomes; orange circles: penicillin-binding proteins; light blue circle chains: peptides; red diamonds: antibiotics; yellow border: cell wall; orange dotted border: cell membrane

Antibiotics usually target important cellular functions (e.g., cell wall synthesis, DNA replication, or protein synthesis), involving highly conserved and often essential genes present in a wide range of bacteria (Hershberg et al. 2008) (Fig. 1). However, it has become clear that while antibiotics may have very specific targets, the bacterial response to antibiotics and the occurrence of resistance is much more distributed across the genome. For instance, we and others have assayed the antibiotic response of bacteria through genetic perturbations (Fajardo et al. 2008; Tamae et al. 2008; Breidenstein et al. 2008; Schurek et al. 2008; Girgis et al. 2009; Nichols et al. 2011; van Opijnen and Camilli 2012; van Opijnen et al. 2016), which established that a large number of genes and pathways can influence drug susceptibility. These findings underline that we have a limited view of how an antibiotic inhibits a bacterial cell; instead of just a drug–target binary interaction, it is a complex, multifactorial process that begins with that interaction but propagates into various biochemical, metabolic, and regulatory processes of the cell (Tomasz 1979; Vakulenko and Mobashery 2003; Floss and Yu 2005; Drlica et al. 2008; Chandrasekaran et al. 2016; van Opijnen et al. 2016). Thus, a bacterium’s resistance to an antibiotic partially stems from the genome-wide program that is triggered by that antibiotic. This means that small alterations to this program may establish the bacterium on the road to the development of resistance (Albert et al. 2005; El’Garch et al. 2007; Kohanski and Collins 2008; Kohanski et al. 2010; Baquero et al. 2011). So far it has largely been ignored that the genetic diversity present in a pangenome, and the often multiple trajectories that can lead to resistance, can result in strain- and/or species-specific-resistant mechanisms with different fitness costs for maintaining resistance mutations. We believe that all these factors have contributed and are still contributing to the emergence of a diverse resistome, (Davies and Davies 2010; Blair et al. 2015; Munita and Arias 2016) that only makes sense when viewed from a pangenomic context and which makes both the discovery and tracking of resistance as well as the treatment of resistant bacteria far more complex than previously thought.

1.1 Species- and Strain-Specific Differences in Adaptation to Antibiotics

The influence of the pangenome on the complexity underlying the evolution of resistance can be seen both on a between and within species level. For instance, different mechanisms and mutations that trigger resistance to the same antibiotic have mainly been found between species. Additionally, interactions between drug-resistance mutations and genetic backgrounds triggering differences in resistance levels have been found at both the between and within species level. In the following section, we discuss how the pangenome, or genetic differences between strains and species, affect the mechanism and/or the level of resistance that evolves.

1.1.1 Species-Specific Resistance: There Is More Than One Way to Become Resistant

Due to specific (pangenomic) genetic characteristics, different species can adopt different mechanisms of adaptation to become resistant to the same antibiotic. Additionally, different species can acquire different mutations or genes to achieve the same resistance mechanism to the same antibiotic. A well-documented example of the first scenario is beta-lactam-resistant mechanisms among Gram-negative and Gram-positive bacteria. Beta-lactam antibiotics inhibit bacterial cell wall synthesis by targeting penicillin-binding proteins (PBPs), a group of enzymes that are present in all bacterial species and which catalyze peptidoglycan cross-linking. PBPs interact with beta-lactams via an active site serine and form a relatively stable covalent complex (Sibold et al. 1994). The primary resistance mechanism against clinically important beta-lactams (e.g., penicillin, carbapenem, cephalosporin) is different between Gram-negative and Gram-positive bacteria. In Gram-negative bacteria, beta-lactam resistance is commonly driven by the acquisition of hydrolyzing beta-lactamases that inactivate the drug. In contrast, beta-lactam resistance in most Gram-positive species is mediated by target modifications, with the exception of staphylococcal penicillinase (Rosdahl 1985; Skov et al. 1995). For instance, beta-lactam-resistant Enterococcus faecium have acquired mutations in an essential PBP (PBP5) that reduce the accessibility of the active site and result in a low-affinity form (PBP5fm) (Sauvage et al. 2002). Similar low-affinity PBPs have also been reported in methicillin-resistant S. aureus (MRSA) (Murakami et al. 1987) and in beta-lactam-resistant strains of S. pneumoniae (encoded by mosaic genes acquired through HGT) (Sibold et al. 1994; Reichmann et al. 1996). It seems likely that this divergence in beta-lactam-resistant mechanisms between Gram-negative and Gram-positive bacteria arose from the differences in their cell envelopes (Munita and Arias 2016). In Gram-negative bacteria, the presence of an outer membrane and associated porins allows for the entry and accumulation of beta-lactams in the periplasmic space, prior to binding PBPs in the inner membrane (Fig. 2a). Such compartmentalization allows for beta-lactamase accumulation at sufficient concentrations and effective deconstruction of the beta-lactam molecules.

Fig. 2
figure 2

Species-specific antibiotic-resistance mechanisms. (a) In Gram-negative bacteria, acquisition of beta-lactamase is the preferred mechanism of beta-

lactam resistance. This is most likely due to the presence of a periplasmic space that allows for beta-lactamase accumulation, resulting in effective degradation of the antibiotic. In Gram-positive bacteria, beta-lactam resistance is commonly driven by mutations in antibiotic target, penicillin-binding proteins (PBPs), which reduces the beta-lactam-binding affinity. (b) Macrolides target the peptidyl (P) site of ribosomes and interact with nucleotides A2508/2509 in the 23S rRNA. Macrolide resistance is commonly driven by modifications of target nucleotides. In species with low copy number of the rrna operon, macrolide resistance via 23S rRNA modification is frequently achieved by point mutations of A2508/2509. In species with high copy number of the rrna operon, target nucleotide modification is commonly achieved by erm-methylation

While species-specific differences in antibiotic resistance can come from very different mechanisms, there are examples where the target is the same, but the manner in which it is targeted is different. For example, macrolides target the peptidyl site of nascent peptides in the large subunit of bacterial ribosomes, thereby inhibiting protein synthesis. Many cases of clinical macrolide resistance are caused by mutations at specific nucleotide positions in the 23S rRNA. Due to differences in the copy number of the ribosomal RNA operon (rrna), different species have been shown to have different macrolide-resistant mutations in the 23S rRNA gene (Fig. 2b). Generally, mutations at A2058 or A2059 in the 23S rRNA (using E. coli nucleotide sequence numbering) confers macrolide resistance for many pathogenic bacteria, predominantly bacteria with one or two copies of the rrna operon, such as azithromycin-resistant Treponema pallidum (Stamm and Bergen 2000; Matejkova et al. 2009), clarithromycin-resistant Mycobacterium species (Meier et al. 1994; Nash and Inderlied 1995; Wallace et al. 1996), and Helicobacter pylori with resistance to macrolide–lincosamide–streptogramin B antibiotics (termed as the MLSB phenotype) (Wang and Taylor 1998). A2058 and/or A2059 mutations change the structure of the drug-binding pocket and thereby reduce the binding affinity of the drug contributing to resistance. In bacteria with higher copy numbers of rrna, such as Staphylococcus, Enterococcus, and Streptococcus, acquisition of point mutations on all or multiple copies of the 23S rRNA genes is highly improbable. Instead, macrolide resistance via 23S rRNA modification is frequently achieved by erm-methylation of target nucleotides. Erm genes are mobile genes that encode 23S rRNA methylases and can catalyze dimethylation of A2058 (Toh et al. 2007). In S. pneumoniae, ErmB provides a high level of resistance to erythromycin (MIC > 256 μg/mL) (Schroeder and Stephens 2016), which suggests that resistance level conferred by the same mutation is also dependent on the genetic background.

1.1.2 Interactions Between Resistance Mutations and Genetic Background Can Affect the Level of Resistance

While it may not come as a complete surprise that different species can adopt different strategies to overcome resistance, recent studies have shown that when different species or strains do have the same strategy to become resistant, the same mutation does not automatically result in the same level of resistance. This can be caused by differences in the genetic background and is a good example of how genetic differences between species and strains, can have important effects on (the emergence of) resistance. Examples at the species level are loss-of-function mutations of the 16S rRNA-specific methyltransferase GidB involved in streptomycin resistance (Okamoto et al. 2007; Koskiniemi et al. 2011). Streptomycin, an aminoglycoside antibiotic, binds to the 30S subunit of the ribosome and causes misreading of the correct tRNA. These mutations have been identified in low-to-intermediate levels of streptomycin resistance in multiple bacteria, such as M. tuberculosis, Mycobacterium smegmatis, S. aureus, and E. coli (Okamoto et al. 2007; Wong et al. 2011; Perdigao et al. 2014). Koskiniemi et al. (2011) showed that high-level streptomycin resistance caused by the loss of GidB is largely dependent on the presence of an aminoglycoside adenyltransferase (AadA) in the bacterium’s genome, which is an enzyme that modifies and thereby inactivates aminoglycosides (Tait et al. 1985; Svab et al. 1990; Magrini et al. 1998; Frank et al. 2003). In an experimental evolution study, streptomycin-adapted Salmonella typhimurium strains that have both the aadA gene and gidB mutations gained a higher level of streptomycin resistance than strains having either one alone (Wistrand-Yuen et al. 2018). Species with aadA, e.g., S. typhimurium, can thereby gain a high level of streptomycin resistance, while species that lack this enzyme (e.g., E. coli, S. aureus, and M. tuberculosis) only obtain low-level resistance (Okamoto et al. 2007).

Within a species, the same mutations may also not necessarily result in the same level of resistance. One such example is resistance in M. tuberculosis to isoniazid (INH). As a prodrug, INH must be processed by the mycobacterial enzyme KatG into its active form, isonicotinic acyl-NADH. The active drug then binds the enoyl-acyl carrier protein reductase InhA and blocks the synthesis of mycolic acid (Quémard et al. 1991). In M. tuberculosis, the primary INH-resistance mechanism is via a point mutation in KatG (e.g., S315T), which results in a partially active protein that reduces INH binding while retaining enough activity to support bacterial survival. Another frequently observed resistance mutation is in the promoter region of the target gene inhA (Lee et al. 2001). Strains that have inh promoter mutations have been observed to show different levels of INH resistance based on their phylogenetic lineages. M. tuberculosis is grouped in six main phylogenetic lineages (Hershberg et al. 2008; Comas et al. 2010): three modern lineages that have evolved in regions with high-density populations and recent massive demographic expansion (i.e., lineage 4: Europe and America, lineage 3: India and East Africa, lineage 2: East Asia) and three ancient lineages from older and low-density populations (i.e., lineage 1: the Philippines, lineage 5: Rim of Indian ocean, and lineage 6: west Africa) (Portevin et al. 2011). A study of 158 isolates of multidrug-resistant M. tuberculosis revealed that mutations in the inhA promoter cause high level of INH resistance (≥3.0 μg/mL) only in the modern lineages 2 and 3, while these mutations cause low-level resistance (MIC <3.0 μg/mL) mainly in ancient lineages 1 and 5 (Fenner et al. 2012). Although M. tuberculosis harbors limited genetic diversity compared to other species, multiple studies have suggested that the variation in drug-resistant phenotypes of M. tuberculosis could be at least partially explained by epistatic interactions among the genetic background of different phylogenetic lineages, compensatory mutations and drug-resistance mutations (Gagneux et al. 2006; Fenner et al. 2012; Gygli et al. 2017).

1.1.3 The Ability of Evolving Antibiotic Resistance May Vary Across Species Due to Epistatic Interactions and/or “Potentiator” Genes

Apart from epistatic interactions between genetic background and drug-resistance mutations, the presence of potentiator genes can make it possible for a novel trait to evolve that would otherwise be inaccessible (Blount et al. 2012; Lind et al. 2015). Depending on the genetic background, the presence of potentiators of antibiotic-resistance genes can prime strains to evolve resistance. To uncover the role of potentiators in different genetic backgrounds, Gifford and colleagues evolved eight strains in the Pseudomonas genus to the beta-lactam antibiotic ceftazidime and compared their pathways that led to resistance (Gifford et al. 2018). Their results show that Pseudomonas species that have the transcription factor ampR (P. protegens and P. fluorescens) evolve ceftazidime resistance faster than species lacking this gene (P. mendocina or P. fulva) (Gifford et al. 2018). AmpR has been shown to increase the expression of beta-lactamase ampC upon the inactivation of peptidoglycan synthesis (Mark et al. 2011; Ropy et al. 2015). The authors hypothesized that ampR potentiates ceftazidime adaptation by allowing mutations in peptidoglycan biosynthesis genes such as ampD, pml, and dacB. Indeed, in P. aeruginosa, dacB inactivating mutations have only been observed in genetic backgrounds harboring ampR (Moya et al. 2009; Mark et al. 2011). These findings show that (clinically relevant) high-resistance level markers (e.g., mutations, genes acquired by HGT) should be considered and validated in different genetic backgrounds, and thus in a pangenomic context.

1.2 Strain- and Species-Specific Phenotypic Stress Responses to Antibiotics

Recent advances in high-throughput techniques involving mutant libraries as well as various omics approaches have allowed for unprecedented understanding of how bacteria respond to antibiotic-mediated stress. Such strategies have shown the diversity of antibiotic responses within species represented by a large pangenome as well as between species. Various examples discussed below show that antibiotics can induce stress throughout the bacterium both at the direct target of the antibiotic as well as at off-target pathways throughout the genome. Due to the pangenome and the consequent differences in genetic backgrounds, strains and species respond to antibiotics with (slightly) different sets of genes and thereby experience antibiotic stress in different ways. This means the selective pressures a bacterium experiences can be strain and/or species specific and drive the evolution of resistance in a strain- or species-specific manner. As a result, the pangenome not only affects the manner in which stress is experienced, but that same stress (e.g., antibiotics) also contributes to maintaining and expanding the pangenome.

1.2.1 High-Throughput Tools for Investigating the Bacterial Response to Stress

With the rise of low-cost sequencing options, whole genome sequencing (WGS) has proved useful for identifying antibiotic-resistant bacteria by looking for the presence of certain genes (e.g., efflux pumps), insertion–deletions, and other polymorphisms associated with antibiotic resistance (Boissy et al. 2011; Zankari et al. 2012; Liu et al. 2014; McDermott et al. 2016; Zeng et al. 2018). The increased availability of large collections of bacterial whole genome sequences has allowed the identification of numerous single nucleotide polymorphisms (SNPs) associated with drug resistance through genome-wide association studies (Power et al. 2017). Resistance-associated SNPs have been identified for a number of pathogenic bacteria including M. tuberculosis (Desjardins et al. 2016), S. pneumoniae (Chewapreecha et al. 2014), and S. aureus (Alam et al. 2014). For antibiotic surveillance, the ability to identify features such as SNPs means WGS provides much more detailed information compared to traditional phenotyping such as multilocus sequence typing (MLST). This increased resolution can be used to predict antibiotic resistance for clinical isolates based on databases of known antibiotic-resistance determinants (Sandgren et al. 2009; McArthur et al. 2013; Stoesser et al. 2013; Walker et al. 2015; Lakin et al. 2017). Recent work has even demonstrated the ability to identify resistant strains as the sample is sequenced (Břinda et al. 2018) potentially leading to point-of-care devices which can guide appropriate use of antibiotics by clinicians. Nevertheless, predictions of resistance are limited to the antibiotics that have been previously tested (such as clinically important first- and second-line antibiotics), which hampers their utility in predicting bacterial responses to novel antibiotics. Thereby, WGS provides a snapshot of the presence or absence of resistance determinants but cannot directly provide information on what genes or pathways are involved in responding to the stress induced by antibiotics. Consequently, while WGS and MLST are highly useful for resistance surveillance and may guide treatment options, they are more limited in their ability to tease apart phenotypic responses to antibiotics for the purpose of understanding and potentially predicting how resistance develops.

In contrast, the use of ordered mutant libraries can directly link genes to observed phenotypes (Jacobs et al. 2003; Baba et al. 2006), which have allowed the detailed characterization of how bacteria respond to various antibiotics (Nichols et al. 2011). However, these libraries are limited by being time consuming to construct, making it less amenable for a wide variety of bacteria. The advent of techniques such as Tn-Seq (van Opijnen et al. 2009), INSeq (Goodman et al. 2009), HiTS (Gawronski et al. 2009) and TRADiS (Langridge et al. 2009), and variants like RB-TnSeq (Wetmore et al. 2015; Price et al. 2018) and droplet Tn-Seq (Thibault et al. 2019) offer a high-throughput alternative which is easily adaptable. In general, all these techniques rely on generating transposon-insertion libraries, which can be assayed by high-throughput sequencing for the relative frequency of mutants grown in a particular stress-inducing environment such as subinhibitory concentrations of antibiotics. In this way, the phenotype of each genetic mutant can be determined, showing directly how bacteria respond to antibiotics and the genes that benefit or hinder the bacteria’s ability to respond to this stress. Thanks to a diverse number of transposon systems and the relative ease of creating mutant libraries, these techniques are amenable to a wide variety of bacterial species and individual strains, providing data within the context of the genetic background of each assayed strain. Characterization of the response to antibiotics can also be complemented by various “omic” approaches. These include transcriptomic (Jensen et al. 2017; Qin et al. 2018), metabolomic (Zampieri et al. 2017b), and proteomic (Pérez-Llarena and Bou 2016; Ma et al. 2017) analyses. The datasets generated by these techniques can also be overlaid with one another to provide a holistic understanding of how bacteria respond to antibiotic stress (Jensen et al. 2017).

Studies utilizing Tn-Seq and related methods have shown that antibiotic-induced stress involves the target of the antibiotic and also extends throughout the entire genome of the bacterium. For example, fluoroquinolones like ciprofloxacin, levofloxacin, and norfloxacin target topoisomerase IV and DNA gyrase, critical enzymes utilized in DNA synthesis. In the Gram-positive S. pneumoniae and the Gram-negative A. baumannii, Tn-Seq profiles for fluoroquinolones show that genes involved in DNA replication and repair such as recN and xseA are important for responding to these antibiotics. While these genes are not direct targets, the inhibition of DNA replication by targeting gyrase and topoisomerase triggers DNA damage and thus explains the indirect importance of genes involved in DNA repair (van Opijnen and Camilli 2012; Geisinger et al. 2019). In addition, Tn-Seq profiles show a role for genes even beyond those related to DNA repair and replication and indicate the importance of genes with diverse functions including amino acid and carbohydrate metabolism. In P. aeruginosa, the aminoglycoside tobramycin also involves a diverse number of responsive genes, including those involved in cell division, carbohydrate metabolism, and membrane metabolism (Gallagher et al. 2011). Similar findings can be observed in data from E. coli where colony sizes were measured for an ordered mutant library grown in the presence of various stressors, including antibiotics (Nichols et al. 2011). For example, a screen with trimethoprim/sulfamethoxazole, which targets the folate biosynthesis pathway shows an important role for genes involved in this pathway, including mogA and folM, as well as genes involved in nucleotide metabolism. But again, responsive genes also include those involved in carbohydrate metabolism, glycan biosynthesis, and membrane transport. These examples highlight that while stress may be felt acutely at the antibiotic’s target, it extends beyond the primary target and results in selective pressures acting throughout the genome. The importance of this is further confirmed by the observation that resistant clinical isolates often have mutations at sites throughout the genome that resolve such stress and/or work in a compensatory manner (Albert et al. 2005; El’Garch et al. 2007). Interestingly, targeting genes involved in off-target responses can create an opportunity for therapeutic intervention by generating synergy between the off-target gene/response and the assayed drug.

1.2.2 Strain-Specific Responses to Antibiotic Stress

In addition to showing that stress can reverberate throughout the genome, Tn-Seq is able to reveal how the genetic background of a strain affects the response to antibiotic stress. Several examples have shown that the genes and pathways involved in responding to antibiotic stress can be strain specific. For instance, S. pneumoniae strains TIGR4 and Taiwan-19F are similarly susceptible to daptomycin, however, Tn-Seq results show that only 50% of the genes responding to daptomycin are common to both strains, with the other 50% being strain specific (van Opijnen et al. 2016) (Fig. 3). Moreover, the distribution of the functional categories of the responsive genes is significantly different between the two strains. This lack of conserved response is also observed for antibiotics representing fluoroquinolones, aminoglycosides, and glycopeptides, with only 40–50% of the responsive genes conserved between these two strains for a particular antibiotic. Nevertheless, when the functional categories are combined into larger groupings corresponding to different domains of the cell’s physiology, there is no difference in the distribution between the two strains. This suggests that despite strain-specific differences in response at the gene level, the global response is more similar (van Opijnen et al. 2016).

Fig. 3
figure 3

Strain-specific differences in responses to the same antibiotic. Networks show the relative number of responsive genes of a given functional category responding to either amoxicillin (a) or daptomycin (b) for S. pneumoniae strains TIGR4 and Taiwan 19F. The number of genes for each group is shown in the charts on the right side. Note the diversity of functional categories beyond the membrane target of both antibiotics. Each strain also responds to the antibiotics with slightly different functional categories. While the strains appear to lack genetic and functional conservancy, they do respond globally in the same way, when the functional categories are condensed into categories involving the capsule, membrane, cellular control, and metabolism. (c) The functional categories show a similar diversity of functions when responding to aminoglycosides, glycopeptides, and fluoroquinolones for both TIGR4 and 19F

In Mycobacterium tuberculosis, in vitro Tn-Seq experiments have shown that several clinical strains have an increased requirement for the gene encoding KatG, compared to reference strain H37Rv (Carey et al. 2018). As discussed, KatG is an activator of the first-line M. tuberculosis antibiotic isoniazid, and adaptation experiments have shown that loss-of-function mutations in katG can result in isoniazid resistance. However, such mutations occur at a low frequency in clinical strains (Gagneux et al. 2006; Vilchèze and Jacobs 2014), which suggests that the increased fitness cost of mutating katG in clinical strains decreases the frequency of acquisition of isoniazid mutants compared to H37Rv. Furthermore, Tn-Seq identified minimal fitness costs for losing glcB (a maleate synthase involved in the glyoxylate shunt, which is important for carbon and fatty acid metabolism) in some clinical strains, whereas it is highly important in other strains (Carey et al. 2018). The authors hypothesized that such differential requirements for glcB would result in correspondingly differential responses to a novel inhibitor of this protein. Indeed, they found that strains showing less of a requirement for glcB are less susceptible to the inhibitor. This type of variability illustrates how the pangenome affects responses and consequently adaptive solutions to antibiotic stress and underscores why therapies may not produce consistent results across all strains. Furthermore, the finding that strains can demonstrate considerable variation in their response to antibiotics underscores how caution must be taken when evaluating studies that are based on a single strain and thereby ignore differential responses that may be present throughout the pangenome.

1.2.3 Gene Homology Frameworks to Uncover Differential Responses Across Bacterial Species

In addition to considering strain-specific responses to the same antibiotic, we have assessed stress responses at the species level to determine how similar antibiotic response patterns are. By combining data from a variety of sources (Nichols et al. 2011; Murray et al. 2015) and generating two frameworks utilizing the OMA and PATRIC databases (Wattam et al. 2017; Altenhoff et al. 2018) the responses of E. coli, P. aeruginosa, A. baumannii, and S. pneumoniae to ciprofloxacin, could be compared. While not all responsive genes have homologs in all species, a consistent pattern is observed for these diverse species. Genes involved in DNA replication and repair such as recN and xseA are important in all four species and in both homology frameworks. Additional nonhomologous genes annotated as involved in DNA replication and repair are also observed in each of the four species. Each species also has responsive genes that are involved in various metabolic functions and cellular processes not related to DNA repair. Nevertheless, pairwise strain comparisons indicate only 5–10% of homologs not involving DNA replication and repair are shared between one or more species. This suggests that species may respond to antibiotic stress similarly at the antibiotic target and related pathways, but individually each species is responding with a unique program depending on its genetic background and thus coping with unique selective pressures that can influence the emergence of resistance.

1.3 The Role of the PanGenome in Predicting Adaptation to Antibiotic Stress

We have so far discussed that mutations responsible for the resistance phenotype to a certain antibiotic can either be common or be specific to a strain or a species. Nevertheless, the types of adaptive-resistance mutations, and the order in which they arise and fix in a population (adaptive trajectories) have shown to be replicable (Elena and Lenski 2003), which suggests adaptive evolution is, at least to a certain extent, constrained. In this section, we argue that the genetic background and the environmental context are two major factors that constrain adaptive evolution (e.g., during adaptation to antibiotics). We first discuss the role of genetic interactions and how they reduce the number of available adaptive trajectories, and propose a pangenome-wide view of studying genetic interactions. Next, we discuss the possibility of using how a selective pressure in the environment is sensed and experienced by an organism (environmental context) to predict where on the genome adaptive mutations will appear when the selective pressure is maintained. We argue that the predictions based on environmental context can be improved by the addition of pangenome-related information, such as the conservation of genetic sequences across many related organisms.

1.3.1 Adaptive Evolution Is Replicable, Therefore Predictable

The analysis of sequence sets on a pangenome scale allows associations to be made between genetic changes and antibiotic-resistant phenotypes. Such pangenome-wide association studies have revealed common sets of mutations that appear in organisms resistant to a certain antibiotic (Croucher et al. 2011; Mobegi et al. 2017a; Del Barrio-Tofiño et al. 2017). Moreover, phylogenetic reconstructions suggest the same resistance-causing mutations have appeared independently, and multiple times in geographically separated strains (Croucher et al. 2011; Farhat et al. 2013; Chewapreecha et al. 2014). While, such ad hoc associations have the power of explaining the genetic basis of a certain phenotype, they rarely offer a predictive model for future adaptive trajectories. However, the observation that the same mutations have appeared in different pathogens independently suggests that adaptive evolution is not an entirely random process. This can be further seen in lab-directed adaptation experiments, where common sets of mutations keep reappearing in independent populations under the same selective pressure (Lang et al. 2013). These common adaptive trajectories demonstrate the replicability of adaptive evolution, which is not to say evolution is an entirely deterministic process. The emergence of new sequence variants is stochastic and phenomena such as hitchhiking genetic regions, genetic drift, and clonal interference can incorporate different degrees of randomness influencing which mutations will reach fixation and how (Lang et al. 2013). Yet, the replicability of adaptive evolution in antibiotic resistance suggests that there are a limited number of adaptive trajectories available to the adapting organism. In other words, while there are many possible ways a set of resistance mutations can reach fixation, the majority of those trajectories are not plausible because they are constrained by the environment and the genetic context. This means that if the environmental and genetic constraints a bacterium evolves under can be understood and/or (experimentally) captured, adaptive evolution should become predictable.

1.3.2 Genetic Constraints on Adaptation

In order to understand the genetic constraints on adaptation we need to consider epistatic interactions within the genome. Epistatic interactions are defined as the nonadditive effects of combinations of mutations. For instance, mutations can have different effects on fitness, depending on the genetic background of the organism, i.e., what other mutations are already present in the parental strain (Vogwill et al. 2016). This is well illustrated by experiments that compared the fitness of single mutants with combinations of those singlets into double and triple mutants, where the fitness of the double and triple mutants differed from what is expected under the multiplicative model (i.e., in the absence of epistasis, the fitness of combining mutant A and mutant B in the same genome = fitness of A x fitness of B) (Weinreich et al. 2006; Angst and Hall 2013; Hall and MacLean 2016). Moreover, epistasis influences the order in which mutations appear. If two mutations do not interact with each other, then any order in which they appear is equally likely. However, when mutations do interact, the appearance of one may limit the appearance of the other. For instance, when the interactions of 5 mutations that confer resistance to cefotaxime were mapped out in E. coli, only 10 trajectories (out of 120 possible ones) turn out to have non-negligible probabilities of being observed (Weinreich et al. 2006). Another example of this is where lab adaptation of a beta-lactamase in E. coli is limited in its trajectory and will follow a certain path (that with the highest likelihood) when a specific initial mutation is present (Salverda et al. 2011).

Since epistatic interactions limit the adaptive trajectories to a few likely ones, mapping out epistatic interactions can help determine which trajectories are most plausible, and thereby contribute to predictions of adaptive evolution. A high-throughput way of determining epistatic interactions is using genome-wide double-knockout screens, as has been done extensively in Saccharomyces cerevisiae (Tong et al. 2004), and Schizosaccharomyces pombe (Roguev et al. 2008). In these studies, synthetic lethality, which is an “extreme” form of epistasis, was used to build genome-wide epistasis or genetic interaction networks. These networks show the prevalence of epistatic interactions throughout the entire genome, with most genes interacting with at least one other gene, and a few hubs with numerous interactions. A comparison of genetic interaction networks from S. cerevisiae (Tong et al. 2004), and S. pombe (Roguev et al. 2008) demonstrate that the same interactions are not always present, even when considering genes common to both organisms. In other words, while some interactions are conserved, others may be present or absent, depending on the genetic background. This means that the genetic interaction network of a single organism is not representative of a pangenome-wide genetic interaction network. Therefore, predictions of adaptation based on a single-strain network will be limited to that organism.

1.3.3 A Pangenome-Wide View of Epistasis May Enhance Predictions

Epistatic interactions are more than a collection of gene- or locus-pairs, but rather form a complex network that has both components that are universally true (those interactions that are strain or species independent), and components that are only present in a certain strain or species. When epistatic interactions (on a gene level) are mapped for a single strain, the interactions are limited to the genes present in this one strain. However, the lack of a gene is not equivalent to the lack of the influence of that gene. The fact that the gene is absent may actually affect the fitness of strains with this particular genetic background. It is possible that genetic elements that vary considerably in their presence or absence across different strains interact epistatically. Such interactions have indeed been demonstrated between chromosomal mutations and plasmids (Silva et al. 2011) or mobile elements (Stoebel and Dorman 2010), or even between two plasmids (San Millan et al. 2014). Therefore, studies mapping out genetic interactions in a single strain or species can be limiting, showing only the components of a network that applies to the organism being studied. In order to get a comprehensive view, it is necessary to construct a pangenome-wide genetic interaction network.

One possibility is to map out genetic interaction networks on hundreds of related strains/species. However, even with high-throughput screening methods, this approach is limited by time and cost. A more feasible alternative would be inferring genetic interactions through in silico analysis of a collection of genomic sequences. In a simple model, one can assume there is an underlying network of epistatic interactions, where each gene’s state (present or absent) influences the states of the genes it is interacting with [analogous to the Ising model describing particle spin states in statistical mechanics (Ising 1925)]. Each viable organism can then be described as a configuration—the presence and absence state of all genes in the pangenome. In this model, the underlying genetic interaction network results in some configurations being more likely than others. It is reasonable to assume that viable organisms are the more likely configurations. Based on this assumption, and considering the observed states of each gene from many genomes, it is possible to infer the underlying network connectivity between genes (Bresler 2014) and identify interactions between genes that are more likely to be universally true, and not strain/species specific. Such a comprehensive genetic interaction network should give a much better idea about pangenome-wide constraints on adaptive evolution. In fact, the fitness landscape (a popular visual metaphor for the effect of genotypes on fitness) is a pangenomic concept. This long-standing visual lays-out the possible genotypes of an organism (or the existing genotypes in a pangenome) on a flat horizontal surface, and the fitness of each genetic variant is plotted on the vertical axis. Thus, because the fitness landscape considers many genomic variants at once, it inherently represents a pangenome view of fitness. The classical view of the fitness landscape is that there is a single peak of fitness, and an organism adapting under a selective pressure climbs this fitness peak as it accumulates mutations. However, increasing numbers of epistatic interactions result in the fitness landscape becoming decorated with peaks and valleys, forming a rugged surface (Kauffman and Weinberger 1989). This apparent increase in complexity may also explain certain strain-specific adaptive outcomes, as it becomes clearer where local fitness maxima and minima are situated on the landscape. Consequently, the consideration of the pangenome (rather than single genomes) should uncover a comprehensive genetic interaction network.

1.3.4 Toward Predicting Adaptive Evolution and the Importance of Pangenomic Information

The fitness landscape has long been considered a constant and rigid surface for each organism. However, genotype is not the only determinant of fitness—the same organism’s fitness varies in different environments. Thus, the fitness landscape is a much more fluid concept, and its shape/contour depends on environmentally determined selective pressures. In other words, in addition to the genetic context determining fitness outcome and constraining adaptive evolution, environmental context also plays an important role.

Similar to how genetic interaction networks can reveal genetic constraints on adaptive evolution, multi-omic profiling reveals environmental constraints on adaptive evolution. The manner in which a bacterium will adapt to the environment it finds itself in is linked to how a stress, i.e., a selective pressure (e.g., antibiotic), is sensed and processed by the bacterium (Zhu et al. 2018, 2019). The use of multi-omic profiling (e.g., via Tn-Seq, RNA-Seq) can reveal which genomic loci respond to and are important in overcoming stress in the environment. For instance, Tn-Seq experiments identify the genes that contribute to fitness (phenotypically important genes, or PIGs) under the stress, and RNA-Seq experiments reveal transcriptionally important genes (or TIGs) responding to the stress. A simple assumption would be that because PIGs and TIGs are relevant in the organism’s response to stress, they will also be implicated in resolving this stress over the course of adaptive evolution. In other words, genes that acquire adaptive mutations will be TIGs and/or PIGs. While in some cases, PIGs and/or TIGs acquire adaptive mutations, not all adapted genes are PIGs or TIGs (Fig. 4). This, along with the transcriptional and phenotypic responses often involving many different cellular functions, makes it challenging to find straightforward rules that predict which genes will adapt. This has motivated the use of machine learning algorithms to detect potentially multifactorial and complicated determinants of adaptive evolution (Zhu et al. 2018; Wang et al. 2018b). Moreover, where changes in expression and fitness are situated in a network can help inform which genetic changes may or may not be permissible. One can use regulatory networks, protein–protein interaction networks or genome-scale metabolic models to contextualize the stress response. It turns out that with the inclusion of network features such as degree (how many connections does a gene have) or clustering coefficients (how many of a gene’s neighbors are neighbors of each other), machine learning models can be used to predict in which genes adaptive mutations are most likely to occur (Zhu et al. 2018). Moreover, sequence conservation and prevalence, which are features that can be extracted from the pangenome, and which describe how “plastic” (or variable) each gene is, improve prediction accuracy (Fig. 4c) (Zhu et al. 2018). While this is a step toward predicting the emergence of resistance before it actually occurs, for instance during treatment, incorporation of pangenome-wide genetic interaction networks will likely even further enhance the predictive power and accuracy.

Fig. 4
figure 4

Prediction of adaptive evolution relies on pangenome features. (a) Circular plot of the S. pneumoniae chromosome, with all features necessary for accurate prediction of which genes will contribute with adaptive mutations to vancomycin resistance. Importantly, there is no clear association with any dataset alone and the adaptive outcome, however, when taken as a whole, all data types contribute to distinguishing adapted genes from non-adapted ones (see (c) and Zhu et al. 2018). (b) Legend for (a). From innermost plot to outermost: Expression change: log2 Fold Change in gene expression comparing vancomycin treatment to no antibiotic treatment after 20, 30, 45, 60, and 90 min of antibiotic exposure. Sequence conservation: −log10 Smith–Waterman distance across all pairs of homologous sequences. Sequence prevalence: percentage of strains in the S. pneumoniae pangenome that have a homolog of the gene. Essential genes: genes necessary for survival, as determined by Tn-Seq. Fitness change: change in fitness comparing vancomycin treatment to a no antibiotic control as determined by Tn-Seq. Mutation frequency: frequency of each mutation in a population adapted to vancomycin. Adapted gene: gene containing at least one mutation that is fixed at high frequency, and is specific to the vancomycin adapted populations. (c) Classification of adapted genes and non-adapted genes. Receiver operating characteristic curve for a support vector machine trained with all data from (a, b) (blue) and one trained with pangenome sequence conservation and sequence prevalence omitted (red). The inclusion of these two pangenomic features improves the performance of the classifier

1.4 Developing New Therapeutics in the Light of the Pangenome

There is an urgent need to develop new strategies to combat resistant pathogens. Both essential genes and genes required for virulence provide attractive targets for the development of new drugs or biologicals (Clatworthy et al. 2007; Juhas et al. 2011; Mobegi et al. 2014). Such candidate targets have been identified for many species by combining functional experimental analyses like Tn-Seq, RNA-Seq, or CRISPRi with computational predictive models (Mobegi et al. 2017b). However, it is becoming more and more apparent that a gene identified as essential or required for infection in one specific strain is not necessarily essential or required for growth or infection in a different genetic context (Rancati et al. 2018). As a consequence, pangenome variability must be taken into consideration when developing new therapeutics that work at a species-wide level.

1.4.1 Targeting Essential Genes

A gene is essential if it is indispensable for reproductive success, which in the case of unicellular organisms are those genes that are required for replication (Rancati et al. 2018). A loss-of-function mutation in one of these genes, or a drug that inactivates its function will stop growth. That is why the identification of a pathogen’s essentialome (i.e., the set of essential genes in a defined genome or group of genomes) is an attractive approach for the identification of new drug targets.

Currently, Tn-Seq and related techniques are probably the most popular experimental tools used to determine essentialomes (Peng et al. 2017). Genes that lack insertions in saturated transposon libraries selected in rich media, are considered to be highly likely to be essential in any given condition. CRISPRi is another technique that is rapidly gaining popularity for determining gene-essentiality in both prokaryotes and eukaryotes (Peters et al. 2016; Liu et al. 2017; Wang et al. 2018a). However, since many more strains and species exist than can efficiently and rapidly be experimentally screened for their essential genes, in silico predictive models of gene essentiality are receiving increasing interest (Mobegi et al. 2017b; Nigatu et al. 2017). Such models can be based on several types of data, including those obtained from the genomic sequence of an organism (codon usage, orthology, GC content, etc.) or from experimental data such as expression profiles or network topology (Mobegi et al. 2017b; Nigatu et al. 2017). The accuracy of the latter models relies on omics data obtained from species where functional genomics experiments could be performed whereas models based on sequencing are more suitable for poorly studied organisms. Importantly, it is becoming clear that the static concept of gene essentiality is no longer valid. Instead, essentiality is a context-dependent attribute affected by both the environment and the genetic background of a bacterium (Rancati et al. 2018). In the simplest case, a gene can be conditionally essential, meaning it is essential in a specific environment but not in another, or in the case of a pathogen, a gene can be essential in a specific body compartment but not in another.

To understand how genetic context affects gene essentiality it is important to consider the network structure of a genome. Genes in a genome do not act as isolated units, but they interact with each other forming a network. The connections that shape this network can represent protein–protein interactions, epistatic relationships, or transcriptional regulatory interactions (Babu and Madan Babu 2008; Wuchty and Uetz 2014; Costanzo et al. 2016). Some genes present a high degree, i.e. a high number of interactions connecting them to other genes, while other genes are poorly connected (low degree). Essential genes have been shown to have a higher degree than nonessential ones in these genetic interaction networks (Jeong et al. 2001; Davierwala et al. 2005; Costanzo et al. 2010, 2016; Kim et al. 2012; Jiang et al. 2015), which is a characteristic that has been used to predict gene essentiality (Shim et al. 2017). Interestingly, data from different yeast strains has shown that essential genes may be split up into those that are always essential (their loss cannot be overcome), and those that are essential depending on the genetic background. The loss of essential genes from this latter category can be compensated by the adaptive evolution of alternative cellular processes; such essential genes are thereby referred to as “evolvable” essential genes (Motter et al. 2008; Liu et al. 2015). As an example that this is not limited to yeast, the proteins MreC and MreD, involved in peripheral peptidoglycan synthesis, are essential for some S. pneumoniae strains. However, different mutations, including the inactivation of the pbp1a gene can suppress the essentiality of these proteins (Land and Winkler 2011), which classify mreC and mreD as “evolvable” essential genes. This evolvability thus at least partially explains how the essentiality of genes can depend on the genetic background and underscores that it is important to determine a pathogens essentialome at a species-wide level to enable the identification of pangenome-wide drug targets. In general, broad-spectrum antibiotics work against large groups of different species of bacteria, and thus existing drugs often target the “pangenome.” Interestingly, these new pangenome concepts are creating opportunities to develop drugs that are directed at a specific clade. The mevalonate pathway is an example of an essential function against which clade-targeting drugs have been developed. This pathway is involved in the production of isoprenoids, and has been shown to be essential in different Gram-positive bacteria (Wilding et al. 2000; Balibar et al. 2009). The pathway is inhibited by an intermediate product, diphosphomevalonate, and fluorinated derivatives of this compound have shown potent antibacterial activities (Kang et al. 2015). However, while the mevalonate pathway is essential, it has also been shown to be evolvable in S. aureus (Reichert et al. 2018) raising the possibility that resistance mechanisms can easily arise. Importantly, genomic comparison of different Staphylococci has shown that species either have the mevalonate or the non-mevalonate (or 2C-methyl-D-erythritol-4-phosphate, MEP) pathway for the biosynthesis of isoprenoids, and specific pathogenic Staphylococci of domestic animals have the non-mevalonate pathway (Misic et al. 2016). Based on this difference at the genus level, it has been proposed that antibiotics for domestic animal Staphylococci targeting the MEP pathway could avoid the emergence of antibiotic-resistant determinants in human pathogens. Such clade targeting antibiotics may thus be an interesting strategy, but are only possible if a comprehensive understanding of the pangenome is available.

1.4.2 Targeting Mechanisms of Infection

In addition to genes essential for general growth, genes required for colonization, infection and/or those that damage the host (i.e., virulence factors) are also attractive targets for drug therapies (Clatworthy et al. 2007; Rasko and Sperandio 2010; Allen et al. 2014; Dickey et al. 2017). Consequently, resistance mechanisms against compounds targeting these factors (antivirulence drugs) may not easily spread outside the host (Allen et al. 2014). Also, antivirulence drugs may be more effective against persisters (Kim et al. 2018), and since they are directed at very specific targets they could potentially have less of an effect on the natural microbiota of the host (Clatworthy et al. 2007; Dickey et al. 2017). Specific antivirulence drugs or biologicals at different stages of clinical development, target pathways including the production of teichoic acids, biofilm formation, quorum-sensing mechanisms, and specific histidine kinases, and are directed against bacteria including the ESKAPE pathogens (Matano et al. 2016; Pasquina et al. 2016; Goswami et al. 2017; Dickey et al. 2017; Cardona et al. 2018; Huggins et al. 2018). To expand such specific therapeutic options, it is necessary to identify a pathogen’s genetic requirements for infection, for which in vivo Tn-Seq experiments have proven successful (van Opijnen and Camilli 2012; de Vries et al. 2017; Le Breton et al. 2017; Shields et al. 2018). As with essential genes, requirements for certain genes seem to be environment dependent. For example, proline biosynthetic genes in S. pneumoniae strain TIGR4 have been shown to be required for infecting mouse lungs, but are dispensable for colonizing the nasopharynx (van Opijnen and Camilli 2012). Other environmental factors that affect genetic requirements are microbial communities and polymicrobial infections. For instance, S. aureus requires 182 genes for a successful infection when co-inoculated with P. aeruginosa, but the same genes are dispensable if the pathogen is inoculated by itself (Ibberson et al. 2017). By using a Tn-Seq approach, it was shown that two different strains of P. aeruginosa required different genes to grow in cystic fibrosis sputum, a growth condition that partially mimics an in vivo infection (Turner et al. 2015). Moreover, many of the genes required by S. pneumoniae strain TIGR4 for host colonization are not present in the genomes of other clinical isolates of the species (Fig. 5), which underscores that virulence determinants are indeed also dependent on genetic background and thus only make sense in the context of the pangenome. A successful example of consideration of the pangenome to develop an antibacterial therapy is the pneumococcal vaccine (Berical et al. 2016; Brooks and Mias 2018). The S. pneumoniae capsule is one of its most important virulence factors and its diversity is high, with over 90 types (serotypes) currently described (Geno et al. 2015). Capsules are highly antigenic and serotypes differ in polysaccharide residue composition, chemical decoration of sugar monomers and length of the polysaccharide chain (Bentley et al. 2006). Pneumococcal vaccines are formulated by mixing multiple capsule serotypes, which is exemplified by the pneumococcal conjugate vaccine 13 (PCV13) and the pneumococcal polysaccharide vaccine (PPSV23). These vaccines protect against 13 and 23 different capsule-type-based strains, respectively (Berical et al. 2016; Brooks and Mias 2018), and are thereby highly successful in targeting a considerable part of the S. pneumoniae pangenome.

Fig. 5
figure 5

Genetic requirements of strain S. pneumoniae. TIGR4 and its comparison with the species pangenome. Tn-Seq experiments performed in strain TIGR4 (red arrowhead) determined candidate essential genes, genes required for growth in minimal medium and those required for infection (van Opijnen and Camilli 2012). The presence (yellow) and absence (blue) of these genes was established in 332 other S. pneumoniae strains. It is clearly shown that many genes identified as essential or required for infection in TIGR4 are absent in many other invasive disease isolates (Cremers et al. 2015)

1.4.3 Antimicrobial Combination Therapy

In addition to developing novel therapies, utilizing currently available drugs in a more effective way, e.g., through a multidrug strategy (including antibiotic cycling and antimicrobial combination therapy), potentially provides enhanced ways to treat clinical infections and prevent resistance (Smirnova et al. 2011; Yoshida et al. 2017; Firsov et al. 2017). However, it has been shown that the responses to drug–drug combinations can be species specific (even among phylogenetically related organisms), and in some cases strain specific (Brochado et al. 2018). Thus, the application of combination therapies presents a challenge with respect to the pangenome. To overcome this challenge, it is necessary to get a comprehensive understanding of drug–drug interaction outcomes in many species and strains of a species, potentially by testing all possible combinations of drugs, and at different concentrations. Brochado et al. performed 2883 pairwise drug–drug combinations on six bacterial strains from three Gram-negative bacterial species (E. coli, S. typhimurium and P. aeruginosa), yielding a total of 17,050 combinations. The authors found that 70% of the detected drug–drug interactions are species specific, and that 13–30% are strain specific, with different interaction outcomes among the strains (Brochado et al. 2018). Although approaches like these are very important, they can be very time consuming and expensive to perform, especially when one considers hundreds of species/strains in a pangenome. This has prompted the application of computational predictive strategies. One such an approach is INDIGO through which the developers were able to identify a group of genes that are predictive of antibiotic interactions in E. coli, and use these genes to predict drug interaction outcomes in other important pathogens including M. tuberculosis and S. aureus (Chandrasekaran et al. 2016). Using such predictive modeling methods considerably reduces the number of experiments that need to be performed, potentially making it possible to accurately infer drug–drug interaction outcomes on a pangenome scale.

2 Conclusions

The development of antibiotic resistance is a complex process that can involve multiple modes of adaptation and/or multiple sources of selective pressure. Here we argue that a fuller understanding of this process is only possible by viewing it through the lens of the pangenome. We have highlighted recent work that demonstrates that genetic background plays an important role in how bacteria respond to an antibiotic and how they develop resistance. We have explained how species and strains with different genetic backgrounds may exhibit (slightly) different adaptational outcomes in response to antibiotics. Strains within a pangenome may also exhibit strain-specific differences in their mechanism and level of resistance as well as their ability to evolve resistance. These different outcomes can be put into context and partially explained by how antibiotic stress is experienced and processed in strain- and species-specific ways. In this way, antibiotics can contribute to the maintenance and shaping of a pangenome by driving adaptive evolution in strain-specific ways.

In addition to providing context for understanding strain- and species-specific responses to antibiotics and their development of resistance, the pangenome can provide a means of predicting the development of resistance as well as inform the development of novel therapeutics. We argue that adaptation to sustained antibiotic pressure is not a wholly stochastic process but rather constrained by a strain’s genetic background as well as its environmental context. Given these constraints, it is increasingly possible to utilize machine learning algorithms to make predictions on the probability that a bacterium will evolve resistance. These algorithms can utilize multiple layers of data including genomic, transcriptional, and metabolic datasets at the pangenome level. Therefore, they will continue to improve as additional datasets are generated. Finally, we have considered the role the pangenome could play in developing new therapeutics to combat resistant pathogens. Essential genes and virulence genes offer attractive targets for developing novel therapeutics; however, these targets must be considered within the context of the pangenome due to a variety of reasons. Essential genes in one strain may, in fact, be evolvable, while they are static in another strain. Virulence targets may also be strain specific or dependent on the environmental context of infection. While this may limit the number of targets that are present throughout the pangenome, it does offer the possibility of identifying targets that are strain or species specific.