Background

Pneumococcal infection is a major cause of morbidity and mortality worldwide [1]. Invasive pneumococcal disease (IPD) is an infection confirmed by the isolation of Streptococcus pneumoniae from a normally sterile site, while non-invasive pneumococcal disease includes sinusitis, mastoiditis, acute otitis media, and community-acquired pneumonia (CAP). Streptococcus pneumoniae has been identified as the most common cause of CAP in adults [2,3,4]. In 2015, an estimated 515,000 deaths (range 302,000–609,000) were attributed to pneumococcal infection among children less than 5 years of age globally [5]. The incidence of IPD is strongly age-related, with the highest incidence in young children and the elderly, and ranges from 11 to 27 per 100,000 in Europe [6,7,8]. Mortality rates for IPD vary from 12 to 22% in adults in the western world and are substantially higher in low-income countries [7,8,9,10,11].

Pneumonia with empyema and/or bacteraemia, meningitis, and bacteraemia are the commonest manifestations of IPD [12]. Identified risk factors for IPD include splenectomy, cancer, and diabetes mellitus, but in a substantial proportion of patients no risk factor can be identified [7]. Extreme phenotype studies in patients with recurrent or familial IPD first identified genetic risk factors that increase susceptibility [13]. Most of the identified genetic variation was found in genes controlling the host response to microbes [14]. Subsequently, several case–control and cohort studies described genetic variation that increases susceptibility and predicts unfavourable outcome and disease phenotype of pneumococcal disease [6, 9, 15].

In the past 20 years several genetic association studies investigated host genetics in relation to susceptibility and outcome of pneumococcal disease, sometimes showing conflicting results. Here we systematically review these studies, perform a meta-analysis and discuss the potential of these findings for understanding the pathophysiological mechanisms of pneumococcal disease.

Methods

Systematic review

We performed a systematic review and meta-analysis with the objective of summarizing host genetic variation associated with susceptibility, phenotype or outcome in patients with IPD and CAP. The following search terms were used in PubMed: ((Streptococcus pneumoniae) OR (S. pneumoniae) OR pneumococcal OR pneumococcus) AND (polymorphisms OR polymorphism OR (genetic variant) OR (genetic variants) OR (genetic association study) OR (single nucleotide polymorphism) OR (single nucleotide polymorphisms) OR SNP OR SNPs OR genotype OR genotypes) without language restrictions and with publication dates between Jan 1, 1983 and Jul 4, 2018. We identified additional publications by checking the references of the retrieved studies and by communicating with experts in the field. Extreme phenotype studies, review studies, and studies of specific patient groups such as immunocompromised patients were excluded. Studies were eligible for inclusion if the population of interest was reported with at least one of the outcome measures.

Meta-analysis and statistical analyses

Each study was scored for methodological quality on items including study design, definition of the investigated condition, ethnicity of included patients, sample size, selection of the control group, quality control of genotyping, statistical methods and correction for multiple testing. We performed meta-analyses when multiple studies assessed the same genetic polymorphism (or combination of polymorphisms) and genotype data were available in the manuscripts. Different nomenclatures of genetic variants included in the review can be found in Additional file 1: Table S1. Review Manager 5.3 was used to generate forest plots and calculate overall effect sizes with a fixed-effects model, or with a random-effects model if the results between studies were too heterogeneous (Q test for homogeneity p < 0.05) [16]. The funder of the study had no role in study design, data collection, data analysis, data interpretation, or writing of the report. The corresponding author had full access to all the data in the study and had final responsibility for the decision to submit for publication.
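The model choice described above (a fixed-effects model unless the Q test for homogeneity gives p < 0.05, in which case a random-effects model is used) can be illustrated with a minimal Python sketch. This is not the Review Manager implementation: inverse-variance weighting and the DerSimonian-Laird estimate of between-study variance are assumptions made here for illustration, and Review Manager may pool dichotomous data with a different method. The function names and the 2x2 counts are hypothetical.

```python
import math


def log_or_and_se(a, b, c, d):
    """Log odds ratio and its standard error from a 2x2 table.

    a, b: patients with / without the genotype; c, d: controls with / without it.
    """
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return log_or, se


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    y = x / 2.0
    if df % 2 == 0:
        s, q = 1.0, math.exp(-y)
    else:
        s, q = 0.5, math.erfc(math.sqrt(y))
    # Recurrence for the regularized upper incomplete gamma function.
    while s < df / 2.0:
        q += y ** s * math.exp(-y) / math.gamma(s + 1)
        s += 1.0
    return q


def pool_odds_ratios(log_ors, ses, alpha=0.05, z=1.96):
    """Pool log ORs: fixed effects unless Cochran's Q test has p < alpha."""
    if len(log_ors) < 2:
        raise ValueError("at least two studies are needed")
    weights = [1.0 / se ** 2 for se in ses]
    fixed = sum(w * y for w, y in zip(weights, log_ors)) / sum(weights)
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, log_ors))
    df = len(log_ors) - 1
    p_het = chi2_sf(q, df)
    model, tau2 = "fixed", 0.0
    if p_het < alpha:
        # Assumed random-effects method: DerSimonian-Laird.
        model = "random"
        c = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)
        tau2 = max(0.0, (q - df) / c)
        weights = [1.0 / (se ** 2 + tau2) for se in ses]
    est = sum(w * y for w, y in zip(weights, log_ors)) / sum(weights)
    se_est = math.sqrt(1.0 / sum(weights))
    ci = (math.exp(est - z * se_est), math.exp(est + z * se_est))
    return {"model": model, "or": math.exp(est), "ci": ci,
            "q": q, "p_het": p_het, "tau2": tau2}


# Hypothetical counts for two cohorts (not taken from any included study).
tables = [(30, 70, 20, 80), (45, 155, 30, 170)]
effects = [log_or_and_se(*t) for t in tables]
print(pool_odds_ratios([e[0] for e in effects], [e[1] for e in effects]))
```

With only a few cohorts per polymorphism, as in this review, the Q test has limited power to detect heterogeneity, so a non-significant Q does not by itself establish that the studies are homogeneous.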

Results

Systematic review

The date of search was 4 July 2018 and yielded 1219 articles (Fig. 1 - flow diagram) of which 60 articles were eventually included in the review [17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76]. Studies were published from 2000 to 2018 and contained 16,034 patients included in 27 different cohorts from 15 countries. There was a substantial overlap of cohorts and patients between the published articles. Of all studies, 24 (40%) analysed the influence of genetic variation on susceptibility to pneumococcal disease, 8 (13%) on outcome, 2 (3%) on disease phenotype, and 26 (43%) studies assessed multiple categories of which 24 (40%) on susceptibility and outcome (Tables 1 and 2). Eight studies (13%) focused on patients with pneumococcal CAP, 49 studies (82%) on patients with IPD and 3 studies (5%) on IPD and pneumococcal CAP.

Fig. 1 Flow diagram for study selection

Table 1 Genetic-association studies on susceptibility to pneumococcal disease
Table 2 Genetic-association studies on outcome and phenotype of pneumococcal disease

Twenty-eight studies (47%) were performed in adults (8188 patients) and 15 studies (25%) in children (4988 patients), 13 (22%) in all age categories (2675 patients) and 4 studies (7%) did not specify the age range of included patients. The population was limited to white patients in 39 studies (64%), was of mixed ethnicity in 9 studies (15%), and was of African origin in 3 studies (5%); ethnicity was not specified in 9 studies (15%). The sample size was less than 100 patients in 17 studies (28%), 100–500 patients in 40 studies (67%), and more than 500 in 3 studies (5%). The study population was defined by positive cultures of blood, cerebrospinal fluid or joint fluid in 41 studies (68%), and in 2 studies (3%) cultures of sputum or tracheal aspirate were included as well. Other studies used PCR, antigen tests or both (14 studies, 23%) to confirm bacterial presence. The control populations of the 57 susceptibility cohorts varied considerably and included healthy population-based controls, blood donors, participants in vaccine programs, patients from other hospital departments, university personnel or proxies and family members of patients. For 32 cohorts (56%), the studies specified whether controls were matched for ethnicity, age or sex.

Most studies (92%) used a candidate genetic variant approach, examining a selection of single nucleotide polymorphisms (range 1 to 326 polymorphisms; median 4). Five studies had a hypothesis free approach, including 1 genome wide association study, 2 exome wide association studies, and 2 sequencing studies [63, 69,70,71,72]. Most studies (41; 68%) determined genotypes by PCR followed by various methods of allelic discrimination, of which 18 studies confirmed genotypes with sequencing, 3 studies with retesting of samples and 19 studies did not mention if or how genotypes were confirmed. Eleven studies (18%) used real-time PCR (by Taqman® genotyping assays), 1 (2%) PCR with mass spectrometry analysis, and 7 (12%) next-generation sequencing for determination of genotypes. Seven studies (12%) described blinding of laboratory personnel for the clinical information.

The χ² test and/or Fisher’s exact test was used in 48 studies (80%) to compare genotypes of selected groups. Logistic regression with correction for confounders to compare genotype frequencies between selected groups was done in 31 studies (52%). Correction for multiple testing was used in 23 (66%) of the 35 studies that assessed three or more polymorphisms.
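As an illustration of the genotype comparison described above, the following Python sketch computes a Pearson χ² test (without continuity correction) and an odds ratio with a Wald 95% confidence interval for a single 2x2 table of genotype carriers versus non-carriers in patients and controls. The function names and the counts are hypothetical and are not taken from any included study; the individual studies may have used continuity corrections, exact tests or other genetic models.

```python
import math


def chi2_2x2(a, b, c, d):
    """Pearson chi-square statistic and p value (1 df) for a 2x2 table.

    a, b: patients with / without the genotype; c, d: controls with / without it.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom the upper tail equals erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with a Wald confidence interval computed on the log scale."""
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(log_or), math.exp(log_or - z * se), math.exp(log_or + z * se)


# Hypothetical counts: 20 of 30 patients and 10 of 30 controls carry the genotype.
print(chi2_2x2(20, 10, 10, 20))
print(odds_ratio_ci(20, 10, 10, 20))
```

When any expected cell count is small, as in several of the smaller cohorts, Fisher's exact test is the more appropriate choice than the χ² approximation.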

Meta-analysis

Meta-analysis could be done for 16 (combinations of) polymorphisms assessing an association with susceptibility and for 1 combination of polymorphisms assessing an association with outcome of pneumococcal disease. The number of cohorts in the meta-analysis varied between 2 and 10. Significant heterogeneity was found in 8 studies included in the meta-analyses, for which a random-effects model was therefore used. Forest plots were made and overall ORs with 95% CIs were calculated (Additional file 2).

Candidate gene approach

Pathogen recognition receptor signalling pathways

Toll-like receptors (TLRs) and NOD-like receptors (NLRs) are pathogen recognition receptors of the innate immune system that recognize molecular patterns derived from microbes [77]. Fourteen studies assessed the effect of polymorphisms in 11 genes of the TLR and NLR signalling pathways on pneumococcal disease [29, 30, 33, 43, 50, 53, 57, 59, 63, 65, 67, 72, 74, 76]. Six polymorphisms were assessed in multiple studies and could be included in a meta-analysis. Five studies assessed the association between polymorphisms in TLR2 (rs5743708) and TLR4 (rs4986790) and susceptibility [30, 33, 50, 59, 74]. In the meta-analyses neither of the polymorphisms showed any effect. Rs352140 in TLR9 was assessed for an association with susceptibility in two studies, and no association was found in the separate studies or in the meta-analysis [43, 74]. The CD14 CC genotype of rs2569190 was significantly associated with susceptibility in a meta-analysis including two studies (OR 1·77, 95% CI 1·18–2·66) [33, 59]. Two studies including 224 patients and 284 controls studied rs4251513 of IRAK4 and no effect was found on susceptibility in the meta-analysis [63, 65].

Polymorphisms in the Toll interleukin-1 receptor domain-containing adaptor protein (TIRAP) gene were investigated in three studies including five cohorts with in total 1601 white patients and 2826 African patients [29, 67, 76]. In the meta-analysis, the polymorphism rs8177374 was not associated with pneumococcal disease.

Three studies assessed the effect on outcome of polymorphisms in genes involved in pathogen recognition receptor signalling [50, 53, 65]. A Spanish study with 60 patients assessed the effect of 10 polymorphisms in IRAK4, IRAK1, IRAKM and MYD88 on outcome of pneumococcal disease, but after re-calculation of their results the patient groups appeared too small to find significant associations [65]. A study of 66 children with pneumococcal meningitis on the influence of NOD1, NOD2, TLR2, TLR4, TLR9, TRAIL and CASP1 polymorphisms on susceptibility and outcome showed no significant associations after correction for multiple testing [53, 57].

Complement system

Mannose-binding lectin (MBL) is a soluble pattern recognition receptor of the collectin group that activates the lectin complement pathway after binding to a microorganism. Structural mutations in exon 1 of the MBL2 gene resulting in variant allele B, C or D (rs1800450, rs1800451 or rs5030737), have been associated with reduced functional serum MBL levels [78].

The effect of MBL2 variant allele B, C or D on susceptibility to pneumococcal disease was assessed in 9 studies which were included in the meta-analysis [18, 19, 27, 35, 49, 58, 62, 64, 76]. In the meta-analysis, 2504 patients and 4749 controls were included, and homozygosity of any of the variant alleles was significantly associated with susceptibility to pneumococcal disease (OR 1·67, 95% CI 1·04–2·69). A funnel plot with the 10 study cohorts showed the overall effect on susceptibility was likely influenced by publication bias (Fig. 2). Effect on outcome of MBL2 variant allele B, C or D was assessed in 10 studies, but only 3 of these studies could be included in the meta-analysis owing to a lack of detailed genotypic data in the manuscripts [35, 58, 64]. The meta-analysis showed no significant effect on outcome of pneumococcal disease. Rs7096206 in the promoter region of MBL2 was analysed in seven studies and yielded no significant association with susceptibility in the meta-analysis [18, 19, 27, 35, 49, 58, 62].

Fig. 2 Funnel plot with MBL2 studies. Funnel plot with studies assessing the effect of MBL2 variant allele B, C or D (rs1800450, rs1800451 or rs5030737) on pneumococcal disease susceptibility. Each dot represents one study. The vertical blue dashed line corresponds to the mean effect size on susceptibility. The outer dashed lines indicate the triangular region within which 95% of studies are expected to lie. SE: standard error as the measure of study size with a reversed scale (most powerful studies are placed towards the top), OR: odds ratio as the effect size of the studies on a log scale.
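The triangular region described in the funnel plot legend can be sketched as follows, assuming the conventional construction in which the limits are the pooled log OR plus or minus 1.96 times the standard error. The pooled OR of 1.67 is the value reported in the text; the function names, the standard errors and the example study are hypothetical, and the plotting software may draw the limits differently.

```python
import math


def funnel_limits(pooled_log_or, se, z=1.96):
    """Pseudo 95% limits (on the OR scale) of the funnel at a given standard error."""
    return (math.exp(pooled_log_or - z * se), math.exp(pooled_log_or + z * se))


def outside_funnel(log_or, se, pooled_log_or, z=1.96):
    """True if a study lies outside the triangular region."""
    return abs(log_or - pooled_log_or) > z * se


pooled = math.log(1.67)  # pooled OR reported for MBL2 variant allele homozygosity

# The funnel widens as the standard error grows (smaller studies).
for se in (0.1, 0.5, 1.0):  # hypothetical standard errors
    print(se, funnel_limits(pooled, se))

# Hypothetical small study with OR 6.0 and SE 0.5.
print(outside_funnel(math.log(6.0), 0.5, pooled))
```

In the absence of bias, studies scatter symmetrically around the pooled effect; a funnel in which small studies cluster on one side is the asymmetry that suggests publication bias.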

After binding of MBL to a pathogen's surface, a serine protease called MASP (MBL-associated serine protease) is activated, which cleaves complement precursors to activated complement proteins further down the cascade [79]. Associations of polymorphisms in MASP2 with pneumococcal disease were assessed in two studies, but showed no significant effect on outcome and susceptibility [34, 73].

Surfactant proteins A and D (SFTPA, SFTPD) are also collectins and act as a first line of defence against microorganisms in the nasopharynx and respiratory tract by facilitating elimination of microorganisms [80]. A study of 7 SFTPD and SFTPA polymorphisms in 326 pneumococcal CAP patients and 1538 controls showed no association of these genes with susceptibility [44]. Another study of 182 European Americans (EA) and 53 African Americans (AA) with IPD assessed the effect on susceptibility of 24 polymorphisms in SFTPA and SFTPD [46]. Because genotypic data were not provided, these results could not be included in the meta-analysis. The strongest associations in that study were with two SFTPD polymorphisms (rs17886286 and rs12219080; OR 0·45, 95% CI 0·25–0·82 and OR 0·32, 95% CI 0·13–0·78), not corrected for multiple testing [46].

L-Ficolin (encoded by FCN2) is a pattern-recognition molecule that enhances phagocytosis and activates the lectin pathway of complement activation after binding to lipoteichoic acid or Gram-positive bacteria [81]. Five functional polymorphisms in FCN2 were analysed in 290 patients with pneumococcal disease and in 720 controls, yielding no associations with susceptibility [31].

After initiation of the three complement activation pathways the final common pathway is activated, in which C5 is converted into C5a, an important anaphylatoxin and a chemoattractant [82]. A Dutch study with 217 pneumococcal meningitis patients assessed the effect on outcome of 17 polymorphisms in 7 complement components further down the cascade [47]. This yielded 1 significant association of rs17611 in C5 with unfavourable outcome (OR 2·25, 95% CI 1·33–3·81) after correction for multiple testing [47]. Another Dutch study investigated the effect of these complement components on susceptibility in the same population, showing no significant associations after correction for multiple testing [56].

Fcγ receptors

Fc (fragment crystallizable) receptors are found on the surface of immune cells and bind to immunoglobulins (Ig). Of the 6 types of Fcγ receptors, FcγRIIa and FcγRIIIa exist as two allotypic variants with different binding affinity for IgG [83]. The more common F158 allotype of the FCGR3A gene has a lower IgG affinity than the V158 allotype (rs396991) [84]. For the FCGR2A gene the more common H131 allotype has a higher IgG affinity than the R131 allotype (rs1801274) [84]. Seven studies assessed the effect of rs1801274 (FCGR2A) on susceptibility and 3 assessed the effect on outcome of pneumococcal disease [17, 22, 24, 33, 37, 45, 50, 52]. The outcome studies lacked genotypic data for the meta-analysis and one study on susceptibility was excluded because of patient overlap with another study [22, 33]. In the meta-analysis on susceptibility 6 studies with a total of 570 patients and 4972 controls were included and no overall effect was found [17, 24, 33, 37, 45, 52]. One study assessed the effect of rs396991 (FCGR3A) in 85 patients with bacteraemic pneumococcal pneumonia and 1224 healthy controls, showing no effect on susceptibility and outcome [45].

NFκB signalling pathway

NFκB (nuclear factor kappa-light-chain-enhancer of activated B cells) is a transcriptional regulator important for both the adaptive and innate immune response [85]. Six studies investigated the effect of polymorphisms in genes coding for modulators of the NFκB signalling pathway on outcome and susceptibility of pneumococcal disease [32, 38, 40, 55, 67, 75]. Five polymorphisms in genes coding for NFκB inhibitors could be analysed in a meta-analysis. The effect of polymorphisms in NFKBIA and NFKBIE (rs3138053, rs2233406, rs529948) on susceptibility was assessed in two studies, revealing no significant associations in the meta-analyses [32, 67]. Two other polymorphisms in the NFκB inhibitor genes NFKBIZ (rs616597) and NFKBIL2 (rs760477) were assessed in 3 cohorts for an effect on susceptibility and meta-analysis showed no significant associations [38, 40, 67]. A study including 531 adult pneumococcal meningitis patients and 376 controls studied two polymorphisms in CARD8 and NLRP1, both coding for proteins required for activation of NFκB or caspases in the context of inflammation or apoptosis respectively [85]. This study showed an association of rs2043211 in CARD8 with poor outcome (OR 2·10, 95% CI 1·04–4·21) and rs11651270 in NLRP1 with death (OR 2·32, 95% CI 1·12–4·78), but this was not significant after correction for multiple testing [55].

Cytokines

Cytokines are important molecules mediating cell signalling and include small proteins like chemokines, interferons, interleukins (ILs), lymphokines, or tumour necrosis factors (TNFs) [86, 87]. Seven studies assessed the effect of polymorphisms in 11 cytokine genes on susceptibility, disease phenotype and outcome of pneumococcal disease [21, 23, 41, 42, 48, 68, 75]. The polymorphism rs1800795 in IL6 was assessed in two studies, showing no effect on susceptibility in the meta-analysis [23, 48]. One Spanish study with 144 IPD patients and 280 controls assessed the effect on susceptibility of 33 polymorphisms in the genes coding for IL-10, IL-12B, IL-1A, IL-1B, IL-R1 and IL-4 [75]. None were significantly associated after correction for multiple testing [75].

Macrophage migration inhibitory factor (MIF) is a pro-inflammatory cytokine acting at the interface of the immune and endocrine systems [88]. The effect of polymorphisms in MIF on pneumococcal disease was investigated in one phenotype study, showing an effect of the high expression allele (rs5844572) on developing the meningitis phenotype, and in one outcome study, showing an effect of high expression alleles (rs5844572, rs755622) on unfavourable outcome and death [41, 68].

Coagulation and fibrinolysis factors

During severe infection the inflammatory response shifts the haemostatic balance towards a pro-coagulant state, which can lead to disseminated intravascular coagulation and organ damage [89]. Three studies assessed the effect of polymorphisms in coagulation or fibrinolysis genes on susceptibility and outcome of pneumococcal disease [39, 60, 66]. One study investigated the effect of the factor V Leiden (FVL) mutation (rs6025) in 163 patients and 8147 controls on IPD susceptibility and outcome, showing no significant associations [39].

Carboxypeptidase B2 (CPB2), also known as thrombin-activatable fibrinolysis inhibitor (TAFI), plays an anti-fibrinolytic role during fibrin clot degradation and an anti-inflammatory role by inactivating pro-inflammatory mediators, such as complement activation products [90]. A study with 716 pneumococcal meningitis patients studied the effect of polymorphisms in carboxypeptidase B2 (CPB2, rs1926447, rs3742264) on disease susceptibility and outcome [66]. No effect was found on susceptibility, but rs3742264 was associated with developing systemic complications (OR 0·40, 95% CI 0·20–0·79) [66].

Plasminogen activator inhibitor 1 (PAI-1) inhibits the pro-fibrinolytic enzymes urokinase and tissue plasminogen activator and thereby modulates fibrinolysis [91]. The effect of rs1799889 in the gene coding for PAI-1 (SERPINE1) on pneumococcal meningitis outcome was studied in a Dutch study with 400 patients, which found an effect on occurrence of cerebral infarction (OR 2·20, 95% CI 1·19–4·07), unfavourable outcome (OR 1·69, 95% CI 1·03–2·78) and mortality (OR 2·20, 95% CI 1·02–4·86) [60].

Other factors

Eight studies focused on genes that could not be categorized in the other subcategories. Two of these studies assessed the role of polymorphisms in the gene coding for C-reactive protein (CRP) in pneumococcal disease. CRP contains a dinucleotide repeat polymorphism in the intron region (rs3138528) which was assessed in a study with 205 IPD patients and 345 controls, showing that significantly more patients than controls had the 134 base pair allele (OR 1·52, 95% CI 1·18–1·96) [20]. Another study investigated the effect of 3 polymorphisms in CRP (rs1800947, rs2794521, rs1130864) on outcome in 42 patients with pneumococcal bacteraemia and found an association between rs2794521 and mortality (OR 9·6, 95% CI 1·3–72·5), not corrected for multiple testing [28].

Protein tyrosine phosphatases (PTPs) regulate the immune response through influencing the responsiveness of B and T cell receptors [92]. Rs2476601 in the gene coding for PTP non-receptor type 22 (PTPN22) was assessed in two studies with in total 1492 IPD patients and 2050 controls [25, 67]. The meta-analysis showed no effect on susceptibility [25].

Nitric oxide synthase 2 (NOS2) is an enzyme encoded by the NOS2 gene, which is involved in nitric oxide production and apoptosis of macrophages [93]. Nine polymorphisms in NOS2 were investigated in a Malawian study, showing no influence of any of the variants on IPD susceptibility or survival [36].

One study investigated whether rs37972 in the glucocorticoid-induced transcript 1 gene (GLCCI1) influenced disease outcome and the response to glucocorticosteroids in pneumococcal meningitis [54]. The function of GLCCI1 is unknown, but it is expressed in both lung cells and immune cells and may be an early marker of glucocorticoid-induced apoptosis [94]. No association was found between rs37972 and mortality rates per dexamethasone treatment group [54].

Studies have shown that bacteria are able to hijack the β2-adrenoceptor and thereby stabilize their binding to the endothelium, which could enhance crossing of the blood-brain barrier [95]. The effect of 2 functional polymorphisms in the β2-adrenoceptor (ADRB2) gene on susceptibility and outcome of pneumococcal meningitis was studied in 396 patients and 376 controls [51]. Rs1042714 of ADRB2 was associated with susceptibility (OR 1·52, 95% CI 1·12–2·07) but had no influence on outcome of disease [51].

Studies with hypothesis free approach

Five studies had a hypothesis free approach to find (new) genetic variations associated with pneumococcal disease. Two of them were sequencing studies in a selected group of genes [63, 72]. The first study sequenced 3 genes involved in the Toll-like receptor signalling pathway, MYD88, IRAK4 and IKBKG (inhibitor of nuclear factor kappa-B kinase subunit gamma), in 164 IPD patients and 164 controls [63]. After sequencing, 233 variants were identified, of which one (rs4251545 in IRAK4) had a minor allele frequency (MAF) of more than 5%. This variant was associated with susceptibility to IPD (OR 1·50; 95% CI 1·10–2·04; p = 9·96 × 10⁻³) but after correction for multiple testing this polymorphism did not retain statistical significance [63].

The other sequencing study sequenced 46 innate immune genes of 435 patients and 416 controls to assess the influence on outcome and susceptibility to pneumococcal meningitis [72]. They identified 2099 variations of which 80% had a MAF below 1% (1854 variations for susceptibility and 1385 for outcome). Neither the single nucleotide polymorphism (SNP) analysis, the haplotype analysis, nor the analysis for association between a set of rare variants and phenotypes reached the significance level after correction for multiple testing. The strongest associations with susceptibility were in CARD8, rs2008521 (OR 1·82; CI 1·28–2·75; p = 8·2 × 10⁻⁴) and in CXCL1, rs56078309 (OR 1·96; CI 1·34–2·87; p = 8·2 × 10⁻⁴) and with outcome were in IRAK4, rs4251552 (OR 2·86; CI 1·58–5·18; p = 4·8 × 10⁻⁴) and NOD2, rs2067085 (OR 2·16; CI 1·40–3·34; p = 5·1 × 10⁻⁴) [72].

Two of the hypothesis free studies were exome wide association studies performed in the same Dutch cohort of pneumococcal meningitis patients [69, 71]. Genotyping of subjects in these studies was done with an Illumina BeadChip consisting of more than 240,000 markers, with approximately 75% of these markers having a MAF below 5%. The first study assessed susceptibility to pneumococcal meningitis and included 469 patients and 2072 controls, and a total of 100,464 polymorphisms passed quality control thresholds [71]. The strongest associations with susceptibility were rs139064549 in COL11A1 (OR 3·21; 95% CI 2·05–5·02; p = 1·51 × 10⁻⁶) and rs9309464 in EXOC6B (OR 0·66; 95% CI 0·54–0·81; p = 6·01 × 10⁻⁵), neither of which reached the exome wide significance level [71]. The study on outcome included 472 culture proven pneumococcal meningitis patients and their strongest association was in AKT3, rs10157763 (OR 1·88; 95% CI 1·4–2·6; p = 9·9 × 10⁻⁵) but this was not significant after correction for multiple testing [69].
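To illustrate why a p value of the order of 10⁻⁶ can fall short of exome wide significance, the sketch below computes a Bonferroni-corrected threshold for the 100,464 polymorphisms that passed quality control. Bonferroni correction at α = 0.05 is an assumption made here for illustration; the threshold actually applied in the cited study is not stated in this section. The function name is hypothetical.

```python
def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


n_polymorphisms = 100_464   # polymorphisms passing quality control [71]
threshold = bonferroni_threshold(n_polymorphisms)

# p values of the two strongest susceptibility associations reported above.
p_values = {"rs139064549 (COL11A1)": 1.51e-6, "rs9309464 (EXOC6B)": 6.01e-5}

print(f"assumed threshold: {threshold:.2e}")
for variant, p in p_values.items():
    print(variant, "significant" if p < threshold else "not significant")
```

Under this assumed correction the threshold is roughly 5 × 10⁻⁷, so both reported p values lie above it, in line with the statement that neither association reached exome wide significance.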

The fifth hypothesis free study was a genome wide association study on pneumococcal bacteraemia susceptibility in 429 Kenyan children and 2677 controls [70]. In this study samples were genotyped with an Affymetrix® SNP chip, and polymorphisms with a MAF of less than 1%, a Hardy–Weinberg equilibrium (HWE) test p value below 1 × 10⁻²⁰ or a missingness of more than 2% failed quality control and were excluded before imputation. After sample and SNP quality control 787,861 genotyped autosomal SNPs were left for analysis, which were extended to 10,996,499 autosomal SNPs after imputation. The study identified an association which reached the genome wide significance threshold between rs140817150 in a long intergenic non-coding RNA (lincRNA) gene (AC011288.2) and pneumococcal bacteraemia susceptibility and replicated the results in a replication cohort with 113 children and 1136 controls (OR 2·47, 95% CI 1·84–3·31, p-combined = 1·69 × 10⁻⁹) [70].
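The SNP quality control criteria described above (MAF, HWE and missingness) can be sketched as a per-variant filter. The thresholds are those quoted in the text; the function names, the genotype counts and the use of a χ² goodness-of-fit test for HWE are assumptions for illustration, since the cited study may have used an exact test and dedicated software.

```python
import math


def hwe_chi2_p(n_aa, n_ab, n_bb):
    """p value of a 1-df chi-square test for Hardy-Weinberg equilibrium."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    return math.erfc(math.sqrt(chi2 / 2.0))


def passes_qc(n_aa, n_ab, n_bb, n_missing,
              maf_min=0.01, hwe_p_min=1e-20, missing_max=0.02):
    """Apply the MAF, missingness and HWE thresholds quoted in the text."""
    called = n_aa + n_ab + n_bb
    if called == 0:
        return False
    freq_a = (2 * n_aa + n_ab) / (2 * called)
    maf = min(freq_a, 1.0 - freq_a)
    missingness = n_missing / (called + n_missing)
    if maf < maf_min or missingness > missing_max:
        return False
    return hwe_chi2_p(n_aa, n_ab, n_bb) >= hwe_p_min


# Hypothetical genotype counts (AA, AB, BB, missing calls).
print(passes_qc(250, 500, 250, 0))    # common variant in equilibrium
print(passes_qc(990, 10, 0, 0))       # rare variant, MAF 0.5%
print(passes_qc(250, 500, 250, 50))   # too many missing calls
print(passes_qc(500, 0, 500, 0))      # no heterozygotes, extreme HWE deviation
```

A very lenient HWE threshold such as 1 × 10⁻²⁰ removes only gross genotyping failures while retaining variants whose genotype distribution deviates from equilibrium for biological reasons.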

Discussion

We identified 60 studies evaluating host genetic variations in 16,034 patients with pneumococcal disease. Meta-analyses showed that genetic variants in the genes CD14 (rs2569190) and MBL2 (one of the variant alleles rs1800450, rs1800451 or rs5030737) were associated with susceptibility to pneumococcal disease. A hypothesis free approach was applied in few studies resulting in one genome wide significant association in a gene coding for lincRNA (rs140817150) with IPD susceptibility which was replicated in an independent IPD cohort.

Few findings were replicated in independent cohorts. Replication generally led to negative results or, in the case of MBL2, careful analysis suggested considerable publication bias. The role of genetic variation on outcome was evaluated in about half of identified studies, but results were not confirmed because of the lack of detailed clinical metadata and heterogeneity of definitions and outcomes. To ease replication, international collaboration between study groups on genetics in pneumococcal disease is needed to ensure uniform research designs and outcome measures [96, 97]. This should lead to an open source research register for genetic association studies, evaluating host and pathogen genetic data of pneumococcal disease, to facilitate data exchange and prevent publication bias. Such a team-science effort is needed to decrease methodological flaws and contribute to more robust findings on the genetic basis of pneumococcal disease, a disease with enormous impact on global health [1, 97].

The significantly associated polymorphisms in the meta-analysis, in CD14 (rs2569190) and MBL2 (one of the variant alleles of rs1800450, rs1800451 or rs5030737), are known functional polymorphisms. The variant alleles of MBL2 have structural differences which are associated with decreased MBL concentrations and thereby decreased activation of the complement system [98]. Soluble CD14 (sCD14) is a pattern recognition receptor and acts as a co-receptor of TLR-4 to bind microbial components to endothelial and epithelial cells [99]. The risk allele T of rs2569190 for pneumococcal disease susceptibility in our meta-analysis is associated with high sCD14 levels in expression studies [100, 101]. Our findings correspond with other studies showing the T allele is associated with an increased occurrence of sepsis and increased serum sCD14 levels in patients with risk genotypes [102, 103]. Although the associated allele might not be the causal allele owing to linkage disequilibrium, these studies are suggestive of a causal relationship between genetic variation in both MBL2 and CD14 and susceptibility to pneumococcal disease.

The results of our meta-analyses should be interpreted with caution because many of the included studies were methodologically flawed. First of all, sample sizes were often inadequate, so that robust conclusions on the influence of the studied genetic variants could not be drawn. In studies focusing on outcome, small sample sizes result in few unfavourable events per study group and consequently limited study power. Second, in most studies data collection was retrospective, which might have led to missing data. Many studies had a retrospective inclusion design, which poses a risk of selection bias, as reflected by the extremely low mortality rates among included patients. In other studies DNA was not available for a considerable proportion of patients, particularly those with more severe disease who died before DNA collection. Inclusion of patients with less severe disease decreases study power and could underestimate the influence of polymorphisms on severity or mortality of pneumococcal disease. Third, case selection differed between studies. Different phenotypes of pneumococcal disease, ethnicities and age categories were studied, which could limit the meta-analysis. In 30% of the studies ethnicity was mixed or not specified, which could be a major source of bias since frequencies of polymorphic genetic loci vary substantially between ethnic groups. Furthermore, control populations were heterogeneously selected and only 8 cohorts (of 57 cohorts; 14%) matched for both age and sex. Fourth, quality control procedures for DNA extraction and genotyping were rarely specified. Only half of the studies which determined genotypes by PCR followed by allelic discrimination methods (21 of 41 studies) stated they confirmed genotypes by sequencing or retesting of samples. Of the candidate gene studies, only 15 (27%) described the genotyping success rate and 7 (13%) described blinding of laboratory personnel. Four out of the five hypothesis free studies described extensive quality control procedures like genotyping accuracy, calling rates, and rates of missing samples [69,70,71,72]. Finally, statistical analyses differed between studies, leading to different effect sizes or different cut-offs for significant associations. Logistic regression with correction for confounders was done in only half of the studies and about one third of the studies that assessed three or more polymorphisms did not correct for multiple testing.

In recent years, many loci have been identified by genome wide association studies (GWAS), as the cost of genotyping SNPs has decreased and cohort sizes have increased [104]. Despite the success in identifying disease loci, understanding of how polymorphisms predispose individuals to disease remains limited [104]. Besides methodological flaws, it is likely that single genes or genetic variants do not control susceptibility to and outcome of complex traits. Probably most heritability can be explained by effects on genes outside core pathways due to interconnection with genes in regulatory networks expressed in disease-relevant cells [105]. In order to understand the genetics of complex traits, future studies should focus on gene-gene interactions as well [97]. Other future approaches for increasing our understanding of heritability could be targeted or whole-genome sequencing in people with extreme phenotypes, in order to find variants in the lower frequency domain with larger effects [97]. In addition, reference panels of genomic variation should be adequately used to enhance coverage of existing and future GWAS, and methods for detection of copy number variants and other structural variants could be improved [97]. Finally, functional understanding of these variants is needed for better insight into pathogenesis of disease and for drug discovery. For example, the genome wide association study of the Kenyan Bacteraemia Study Group explored the functionality of a polymorphism in a gene coding for lincRNA, using qPCR to quantify levels of RNA expression in leukocyte cell subtypes, and observed elevation only in neutrophils [70]. Some of the studies included in this review investigated a functional role of identified polymorphisms by measuring serum or CSF protein expression [20, 26, 28, 34, 41, 42, 44, 47, 55, 56, 58, 60, 66, 69, 72], but not all were able to demonstrate a functional effect. Moreover, the majority of the studies (70%) did not analyse the functionality of the genetic variants.

Conclusions

Several host genetic polymorphisms have been identified to influence susceptibility and outcome of pneumococcal disease, but most of these studies are hampered by methodological flaws or have not yet been reproduced. Carefully designed whole-genome association and replication studies with detailed clinical metadata are needed to further clarify and confirm the genetic basis of pneumococcal disease. To improve our understanding of the functionality of polymorphisms, the next step is to investigate the downstream molecular effects of polymorphisms in large-scale clinical cohort studies within a specific acute illness such as pneumococcal disease.