Infectious agents have a long history of causing disease in humans and are a major contributor to childhood mortality worldwide (Table 1). For many infectious agents, a complex interaction between the host and pathogen has developed over time as a result of an evolutionary co-existence and adaptation that is now reflected in both human and pathogen genomes. Initial studies have looked at these human-pathogen interactions on a broad scale, asking what components of the (typically human) genome are associated with disease susceptibility [1]. However, infectious diseases, unlike many other diseases, offer the chance to dissect this interaction more precisely, as the non-host or 'environmental' component (the pathogen) has a genome that can be assayed just as accurately as the human genome. Today, we have the appropriate tools and platforms in place to begin examining these adaptations from a whole-genome perspective, using samples directly from clinical studies. By directly observing this human-pathogen interaction in natura, we can avoid the requirement for artificial models, which tend to be tedious to construct and may not reconstitute human physiology accurately [2]. This generates a powerful approach, as it integrates two complementary (but previously thought to be distinct) areas of infectious disease research - host genetic susceptibility to infection outcomes and pathogen genetic variation leading to differential virulence and individual susceptibility.

Table 1 Global causes of childhood deaths

On the basis of data obtained using biological candidate approaches, it has been assumed that broad-based responses mediated by B and T cells (both CD4+ and CD8+) were the primary defenses against infectious agents and that these were sufficient for protection against most infecting pathogens [3]. For example, defects in interleukin 2 (IL-2) signaling result in failed clonal expansion of both B and T cells, leading to severe combined immune deficiency [3]. Other examples include X-linked agammaglobulinemia - caused by mutations in Bruton's tyrosine kinase - where no mature antibody-producing B cells are produced, resulting in no antibody production [4]. However, in the majority of common infectious diseases studied so far from a genomic and genome-wide association perspective, the evidence has tended to point away from a picture of high-penetrance, generalized susceptibility caused by critical defects in single genes, and instead towards a more complex picture of multiple lower-penetrance effects implicating individual genetic components for each invading pathogen. There are some notable exceptions to this, such as the shared human leukocyte antigen (HLA) associations (suggesting T-cell involvement) in leprosy susceptibility [5] and HIV viral set point [6]. As HLA is crucial for recognition of processed pathogen molecules and the initiation of the CD4+ and CD8+ T-cell response, it is unsurprising that some degree of shared susceptibility between pathogens is observed.

Recent evidence has suggested an important role for innate immune interactions in disease susceptibility [5, 7] and specific evolution and adaptation of both host and pathogen genomes to the current state of mutual co-existence. Genome-wide association studies (GWASs) necessarily assay common genetic variation of mostly low penetrance, and this approach is now being used to study host genetic susceptibility to infectious diseases (Table 2). Some of these recent GWAS data on infectious diseases have indeed pointed clearly to surprisingly specific pathogen-receptor interactions rather than broad-based susceptibilities. Similar advancements have been made from the pathogen perspective, in which GWASs on the pathogen genome have also yielded unexpected insights. The example in Plasmodium falciparum malaria showcases the adaptability of the pathogen genome when confronted by the human immune response and anti-malarial pharmacotherapy [8-10].

Table 2 Host susceptibility genes identified by GWAS studies

In contrast to highly penetrant Mendelian genetic defects causing broad-based susceptibilities with severe clinical outcomes, more common host genetic determinants of susceptibility of lower penetrance and modest effect size could be confined to specific pathogen species [11] or even serotypes (and possibly genotypes). A better understanding of host-pathogen interactions will allow more thorough dissection of the host immune response generated by specific pathogen invasion. Because pathogens adapt to survival pressures by evading drugs, vaccines, and the host immune response and have a higher intrinsic mutation rate than humans, this improved understanding will potentially have an impact on vaccine design and novel therapy discovery. In this review, we focus on three examples of this - dengue, malaria, and meningococcal disease - to reflect the literature currently available.

Host-pathogen interactions with dengue virus

Dengue fever is caused by infection with one of four serotypes of the dengue (DEN) virus (DEN-1 to DEN-4), which belong to the Flaviviridae family (other members of which include the yellow fever and hepatitis C viruses) [12, 13]. It is an acute systemic viral infection with a wide spectrum of disease manifestations, ranging from subclinical infection to severe and fatal disease. The commonest severe complications are a transient increase in vascular permeability and altered hemostasis, which can lead to life-threatening hypovolemic circulatory shock (called dengue shock syndrome, DSS). It is the most common mosquito-borne infection after malaria, and its recent resurgence is largely attributed to a combination of factors, including exponential population growth, rapid urbanization, air travel and lack of proper vector control [14]. Previous studies have provided clues regarding the relationship between pathogen genetic variation and severity of the infection. Amino acid variants determined from full-genome sequencing of the dengue virus revealed different potentials for causing severe dengue as opposed to uncomplicated dengue [15]. Current evidence also suggests that disease severity differs between dengue virus serotypes, as reported for infections with DEN-2 [16-19].

From the human perspective, previous studies have pointed to several branches of the immune system in the pathogenesis of dengue [15-19], including the HLA system and dendritic cells, although there are few convincing genetic validation studies. However, a recent large-scale GWAS and replication study showed very strong statistical evidence of association between genetic variants at two distinct loci (MICB and PLCE1) and increased susceptibility to DSS (Table 2) [20]. MICB encodes MHC class I polypeptide-related sequence B, which is an inducible activating ligand for the NKG2D type II receptor on natural killer (NK) cells. NK cells are distinct from B or T cells and are crucial for the early response to viral infections and in shaping the subsequent adaptive immune response to viral infection [21-23]. Interestingly, another recent GWAS found that a genetic variant in the closely related MICA gene is strongly associated with hepatitis C virus-induced hepatocellular carcinoma, suggesting a pivotal role for MIC proteins in the pathogenesis of these Flaviviridae infections [24]. PLCE1 encodes phospholipase C epsilon 1, and missense mutations in PLCE1 cause nephrotic syndrome [25], a kidney disorder in which dysfunction of the glomerular basement membrane results in proteinuria and hypoproteinemia that, when severe, leads to reduced vascular oncotic pressure and edema. These elements of nephrotic syndrome have striking similarities with the hypovolemic shock in severe dengue and suggest an important role for PLCE1 in maintaining normal vascular endothelial cell barrier function. These associations with MICB and PLCE1 are not serotype specific, but instead are applicable across all four dengue virus serotypes [20].

The GWAS on severe dengue has revealed unexpected insights into its pathogenesis, as it points specifically to NK cell pathology, shared susceptibility pathways with shock, and possibly shared pathology with other Flaviviridae [20]. However, the locations of the causal genetic variants responsible for the observed associations are still unknown and await elucidation through fine-scale mapping. What is notable is that these observations suggest that subtle genomic differences in members of one viral family affect related, yet distinct, components of the human immune response. But is this molecular pathway of susceptibility the only route to Flaviviridae infection or is this shared pathway the only one for a wide variety of viruses? The genetic variation at both MICB and PLCE1 explained no more than 2.5% of the overall heritable variance in susceptibility to DSS, whereas very little evidence of disease association was observed elsewhere in the genome. Will a finer dissection of the infecting pathogen genome reveal more and discover some of the 'missing heritability'? Whole-genome sequencing of the dengue virus isolated from patients with clinical data has been initiated on a large scale [19, 26-29], and identification of mutations in dengue virus genes and their systematic reconciliation with human genotype and clinical phenotype will be possible in the near future.

The impact of host and pathogen genetic variations on individual susceptibility to meningococcal disease

Neisseria meningitidis is a Gram-negative, polysaccharide-encapsulated bacterium and is the cause of meningococcal meningitis and sepsis, which are potentially fatal infections without antibiotic treatment. Five out of the twelve identified N. meningitidis serotypes have been reported to cause epidemics (A, B, C, W135, and Y) [30]. Subgroup analysis with specific meningococcal serotypes has revealed differential virulence capabilities of each serotype, with serotypes B and C causing more adverse infection outcomes than the others [31-33]. Unlike for meningococcal serotypes A, C, Y, and W135, a vaccine is yet to be developed for serotype B N. meningitidis, thus highlighting the need for improved understanding of the host-pathogen interactions involved. Earlier genetic studies examining human susceptibility to N. meningitidis sepsis were limited to known genes whose biological functions have been well characterized [34-36], and many were likely to be false-positive findings as they subsequently were not successfully validated by independent studies [37]. It is hoped that by better defining the critical elements at the host-pathogen interface using unbiased whole-genome approaches, more definitive answers on the susceptibility determinants of overall meningococcal infection and the extent of the immunogenic differences between serotype B N. meningitidis and the other serotypes will emerge.

A recent case-control GWAS [7] on N. meningitidis sepsis (which included serotypes A, B, C, Y, and W135) in a UK collection, with validation in sample collections from Austria, the Netherlands, and Spain, showed significant evidence of association between genetic markers in Factor H (CFH) and CFH-related genes (CFHR3, CFHR1) on chromosome 1 and decreased susceptibility to meningococcal disease (Table 2). CFH and CFHR3 are atypical members of the complement family in that they are negative regulators of complement signaling, protecting host cells from destruction through complement activation in response to infection [38].

Previous human mutation studies had shown complement deficiency to be a susceptibility factor for many Neisseria species [39], so perhaps this study [7] may not seem surprising. However, when examined, a new mechanism was revealed. It is not the absence of complement that is important to the occurrence of widespread disease, but rather the presence of the factor H complement regulator. This means that the addition of complement to patients who were deficient in complement pathway components might not prevent them from suffering repeated disease, whereas the removal of the complement regulator might, because CFH protects the bacteria through a complex interaction of the host protein and bacteria (Figure 1). This new mechanistic understanding of the disease may have other implications as well, because the associations observed with CFH and CFHR genetic variants were significantly stronger (per-allele increased risk of disease between 1.6-fold and 1.8-fold compared with baseline; P-value about 10^-5 to 10^-8) in collections that had a variety of N. meningitidis serogroup infections (enrolled before widespread serotype C conjugate vaccination), compared with the collection enrolled after widespread serotype C vaccination was adopted (Spain; per-allele increase in risk about 1.3- to 1.4-fold, P-value about 10^-2 to 10^-3). The substantially weaker association at the CFH-CFHR3-CFHR1 locus in the predominantly serotype B sample collection implies that serotype B N. meningitidis has subtly distinct mechanisms of host interaction from those of other serotypes and may explain why serotype B N. meningitidis has been refractory to vaccine development attempts.
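The per-allele risk figures quoted above can be made concrete with a small calculation. The following is a minimal sketch, assuming a simple 2x2 table of allele counts in cases and controls; the function name and all counts are hypothetical and invented purely for illustration, and published GWAS estimates are typically derived from regression models rather than from this direct allelic calculation.

```python
import math


def per_allele_odds_ratio(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Allelic odds ratio with a Woolf (log-based) 95% confidence interval.

    Inputs are counts of the tested (alt) and reference (ref) alleles
    in cases and controls. Hypothetical helper, not from any cited study.
    """
    odds_ratio = (case_alt * ctrl_ref) / (case_ref * ctrl_alt)
    # Standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(
        1 / case_alt + 1 / case_ref + 1 / ctrl_alt + 1 / ctrl_ref
    )
    log_or = math.log(odds_ratio)
    ci_low = math.exp(log_or - 1.96 * se_log_or)
    ci_high = math.exp(log_or + 1.96 * se_log_or)
    return odds_ratio, ci_low, ci_high


# Invented allele counts for illustration only (not data from reference [7]).
odds_ratio, ci_low, ci_high = per_allele_odds_ratio(
    case_alt=400, case_ref=600, ctrl_alt=300, ctrl_ref=700
)
print(f"OR = {odds_ratio:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f})")
```

A confidence interval that excludes 1 indicates that the allele frequency differs between cases and controls at roughly the 5% level; a genome-wide study applies a far stricter threshold because of the number of markers tested.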

Figure 1

The complement cascade, factor H (encoded by CFH), and its interactions with N. meningitidis factor H binding protein (fHBP). The milieux shown here are the host endothelial cells, the interstitium, and invading N. meningitidis cells. Factor H normally binds to glycosaminoglycan sugars on host cells. This same region in factor H is bound by N. meningitidis through fHBP. There is significant diversity in fHBP, which is more variable in N. meningitidis serogroup B than in other serogroups. The serum concentration of CFH also varies with CFH genotype. Adapted from Tan et al. [66].

N. meningitidis defends itself from host complement-mediated killing through sequestration of human CFH by factor H binding protein (fHBP), a surface lipoprotein present on all strains of N. meningitidis [40]. Owing to its antigenic potential and ability to induce bactericidal antibodies, fHBP is one of the antigens incorporated in the recombinant vaccine for group B N. meningitidis in vaccine trials. This GWAS observation [7] thereby confirmed an interaction between N. meningitidis and human CFH, mediated in a complex manner through fHBP, which was not previously detected in mutation studies of patients suffering from recurrent meningococcal infections due to complement deficiency (Figure 1). As different allelic variants of CFH itself resulted in easily discernible differences in actual human susceptibility to meningococcal disease, therapeutic designs based on synergistic action targeting both fHBP and CFH remain a possibility. Such an approach could be more effective than manipulation of either molecule alone. The fHBP gene has recently been sequenced in various serotypes, revealing significantly higher sequence diversity in serotype B than in non-serotype B N. meningitidis. This makes serotype B N. meningitidis more challenging to target from the point of view of both the host and the vaccinologist [41]. This could in part explain the much weaker association signal seen in the GWAS and implies a correlation between increased fHBP variability and virulence, which needs to be confirmed by future work.

Plasmodium falciparum

Malaria is a predominantly tropical disease vectored by the Anopheles mosquito, and up to 40% of the worldwide population is at risk of infection. Although infection by four different species of Plasmodium (falciparum, malariae, ovale, and vivax) causes malaria, P. falciparum is the predominant form globally. It also results in higher complication and mortality rates than the other three species [42]. Susceptibility to, and severity of, Plasmodium infection are well-studied examples of how host genetic factors affect the disease process. Natural genetic variation resulting in the sickle cell trait and Duffy blood group antigens are two clear examples whereby the human genome adapts specifically in response to an infectious agent. The parasite also exerts considerable selective pressure on the human genome [43], underscoring yet again the importance of understanding the host-pathogen interface from both perspectives.

Although the immune responses to malaria have been extensively documented in both animal and human models [44, 45], robust evidence on the specific response that is critical for protective immunity remains elusive [46, 47]; often, the induction of a high degree of immunogenicity does not correlate with protective immunity [48]. GWASs have now made it possible to search systematically for strong association between functional variation in a given immune gene and susceptibility to (or severity of) infection to identify the specific immune responses responsible for protective immunity. All genetic studies for human susceptibility to malaria have yielded two consistent results: the sickle hemoglobin trait is associated with a 5- to 10-fold reduced susceptibility to severe malaria, and the ABO blood group is associated with a more modest effect (a per-allele odds ratio of 1.2-fold increased risk for blood groups A, B, and AB compared with blood group O) [49]. Both findings have been confirmed by a subsequent GWAS on severe malaria in West Africa (Table 2), revealing that the major interaction point of interest from the host perspective is the red blood cell. Indeed, it is this interaction between P. falciparum and the red blood cell that gives rise to all the clinical symptoms of malaria, and normally functioning erythrocyte physiology has been shown to be crucial to parasite survival [50, 51]. However, what is surprising is that genetic variation in immune-related genes (such as the HLA, tumor necrosis factor (TNF) and lymphotoxin alpha (LTA) families), the very genes that have been thought to have critical roles in the immune response and clearance of P. falciparum infections [52, 53], does not consistently show association with susceptibility in large collections of malaria patients.

The paucity of findings from the viewpoint of the host beyond the red blood cell suggests that for malaria, genetic variations in P. falciparum itself might account for substantially more disease variance. The genomic approach adopted from the perspective of P. falciparum has been two-pronged: using genome-wide association and detection of natural selection to identify molecules crucial at the host-pathogen interface. It has been observed that the strong selective pressure exerted by P. falciparum on the erythrocyte has led to increased incidence of otherwise rare hematological disorders (such as sickle-cell disease, hemoglobin C, and possibly thalassemia) [54, 55]. A better understanding of the selective pressure that malaria has exerted on the immune system would thus yield considerable insight into the process of malaria pathogenesis and severity. Indeed, surveys of the P. falciparum genome using sequencing and genotyping have revealed it to be highly variable. Genes encoding Plasmodium proteins interacting with the host immune system, such as P. falciparum erythrocyte membrane protein 1 (PfEMP1), are often under balancing selection resulting from pressures from the host immune system on one hand and the need to maintain diversity on the other [8]. The sequencing of P. falciparum virulence genes that interact with the red blood cell (var/PfEMP1) and transporters (such as pfcrt for chloroquine, pfsurfin for dihydroartemisinin and dhfr for cycloguanil) responsible for resistance to anti-malarial drugs revealed greater than average genomic diversity and the presence of positive selection signatures in these Plasmodium genes, indicating the continued evolution of the parasite in response to survival pressures from both the human immune response and medical intervention. Indeed, a GWAS of P. falciparum genetic variation using an array of over 17,500 markers obtained by parasite genome sequencing revealed very strong associations between genetic variations in many drug transporter genes, including the P. falciparum chloroquine transporter gene (PfCRT) and P. falciparum dihydrofolate reductase (PfDHFR), and resistance to anti-malarial chemotherapy [9]. Strong evidence of association with drug resistance and positive selection was also observed at pfsurfin, a gene that had until then not been implicated in drug resistance. Curiously, PfSURFIN was reported to be co-transported with PfEMP1 to the surface of the infected red blood cell and is thought to be part of a protein complex involved in binding or transport of chemical compounds. The extension of this P. falciparum GWAS to severe malaria susceptibility in humans (as opposed to uncomplicated infection) will be extremely informative [56].

Future challenges

The outcome of a specific episode of infection is strongly suspected to depend in part on specific interactions between host and pathogen genotypes [57]. The cumulative effect of the contributions from the host and pathogen genomes is likely to be larger than that of each genome alone, and clinical outcomes of infection are unlikely to be explained by a reductionist approach that studies disparate individual components [58]. Even in less complex organisms than humans, host resistance to pathogens has been shown to vary dramatically across different combinations of host and pathogen genotypes [59], with an even more complicated picture if the host is susceptible to infection by multiple different strains of the same pathogen species. To this end, GWASs and pathogen sequencing have provided fresh and often unexpected insights into host-pathogen interactions by revealing hitherto unsuspected molecules (Table 2). In cases of both dengue and meningococcal disease, GWASs revealed surprising interacting host molecules (the NK cell pathway for dengue and complement inhibition by factor H for meningococcal disease) that are important in disease pathogenesis. In addition, the GWAS on meningococcal disease revealed that pathogen genetic variation (in terms of different infecting strains) also contributes substantially to overall inter-individual susceptibility to disease - and this may also be the case with dengue and other diseases.

Although the GWAS effort from the human perspective has not produced the novel insights for malaria seen for other infectious diseases, it is clear that this is more than compensated for by approaches from the parasite perspective, which have already begun to bear fruit. Var genes identified from analysis of the sequenced Plasmodium genome are already turning up potential antigenic sites of interaction with the human host, with the potential for being vaccine targets [8], and combined selection and association analyses have revealed PfCRT, PfSURFIN, PfMDR and P. falciparum apical membrane antigen 1 (PfAMA-1) as crucial molecules interacting with anti-malarial medications or the host immune system, as well as PFE1445c, which encodes a Plasmodium conserved protein [9]. So, in this case, more detailed work stemming from analysis of P. falciparum resequencing data could reveal crucial interaction points between host and parasite.

The new challenges in defining host-pathogen interactions are starting to become computational: comprehensively analyzing and integrating whole-genome data for both humans and pathogens requires complex bioinformatics toolsets, because the number of human-pathogen genotype-phenotype combinations increases exponentially. The search is further complicated by the well-known observation that pathogen genomes are not as conserved as the human genome; for example, the per-nucleotide mutation rate is 5 × 10^-4 per generation [8] for the P. falciparum genome, compared with about 2.5 × 10^-8 for humans [60].
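The scale of this computational problem can be illustrated with some simple arithmetic. The following is a rough sketch only: the two mutation rates and the 17,500-marker parasite array are taken from the text, whereas the host marker count, the function names, and the genotype model (three genotypes per biallelic host site, two alleles per biallelic site in a haploid pathogen) are assumptions made for illustration.

```python
# Per-nucleotide mutation rates per generation, as quoted in the text.
P_FALCIPARUM_MUTATION_RATE = 5e-4
HUMAN_MUTATION_RATE = 2.5e-8


def fold_difference(pathogen_rate, host_rate):
    """How many times higher the pathogen mutation rate is than the host's."""
    return pathogen_rate / host_rate


def pairwise_tests(n_host_variants, n_pathogen_variants):
    """Number of single host-variant by single pathogen-variant tests."""
    return n_host_variants * n_pathogen_variants


def joint_genotype_classes(n_host_sites, n_pathogen_sites):
    """Distinct joint genotype classes across several sites at once.

    Assumption: each biallelic host site has 3 diploid genotypes and each
    biallelic pathogen site has 2 haploid alleles.
    """
    return 3 ** n_host_sites * 2 ** n_pathogen_sites


N_HOST_VARIANTS = 500_000      # hypothetical human array size
N_PATHOGEN_VARIANTS = 17_500   # parasite marker count quoted in the text

print(round(fold_difference(P_FALCIPARUM_MUTATION_RATE, HUMAN_MUTATION_RATE)))
print(pairwise_tests(N_HOST_VARIANTS, N_PATHOGEN_VARIANTS))
for sites in (1, 5, 10):
    print(sites, joint_genotype_classes(sites, sites))
```

Even the simplest design, one host variant tested against one pathogen variant, already yields billions of tests under these assumptions, and the number of joint genotype classes grows exponentially as more sites are considered together.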

To aid in these challenges, high-throughput analysis pipelines can now be used for alignment and mapping of pathogen genomes generated by next-generation sequencing experiments, thus allowing a comprehensive catalog documenting all pathogen genetic polymorphisms (not unlike the human HapMap project). This effort will enable in-depth characterization of pathogen phylogeny and measurement of correlations between pathogen genetic mutations and virulence, together with key human phenotypes of the infection (such as cytokine profile and disease severity). In most cases, the integration of genomic data obtained from GWASs and resequencing efforts from both the host and pathogen should reveal a complete catalog of the inter-individual susceptibility to infection.
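As one illustration of measuring the correlation between a pathogen mutation and a clinical phenotype, the sketch below applies a two-sided Fisher's exact test to a 2x2 table of mutation status against disease severity. This is not the pipeline used in any of the cited studies; the function name and the counts are hypothetical, and a real analysis would also need to account for pathogen population structure and multiple testing.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows: pathogen mutation present / absent.
    Columns: severe / non-severe infection.
    Returns the probability of all tables (with the same margins) that are
    no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denominator = comb(row1 + row2, col1)

    def table_probability(x):
        # Hypergeometric probability of x severe cases among mutation carriers.
        return comb(row1, x) * comb(row2, col1 - x) / denominator

    observed = table_probability(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    return sum(
        p
        for p in (table_probability(x) for x in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-9)  # tolerance for floating-point ties
    )


# Invented counts for illustration only: 12 of 20 isolates carrying a
# mutation came from severe cases, versus 5 of 25 isolates without it.
p_value = fisher_exact_two_sided(12, 8, 5, 20)
print(f"P = {p_value:.4f}")
```

The exact test is used here because isolate collections stratified by phenotype are often small; for large cohorts a regression model with covariates would be the more usual choice.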