Background

The term infectious diseases (or communicable diseases) defines a wide range of health disorders resulting from the invasion of an infectious agent within the host body. The infectious agents comprise all the microorganisms or macro-organisms which are competent to invade the host body using multiple modes of transmission (directly or indirectly) to generate an infectious disease (Barreto et al. 2006; Webster et al. 2017). Infectious diseases, caused by diverse infectious pathogens (viruses, bacteria, fungi, and parasites) are considered to be within the top leading serious health conditions worldwide. In particular, pathogens involved in lower respiratory infections have been shown to be among the leading causes of death worldwide in the year 2015 (WHO 2017; GBD 2015 LRI Collaborators 2017). Effective pathogen invasion to the host body is dependent on multiple factors, which are mostly related to the host features and environmental conditions and partly related to the pathogen genotypic features (Casadevall and Pirofski 2017). Host susceptibility towards a specific infectious pathogen is a complex trait that is highly influenced by age, sex, immune system functionality, host microbiota, environmental health (food hygiene/personal hygiene), climate conditions, and the host genetic makeup (Yi Rang et al. 2018; Nikolich-Žugich 2018; Casadevall and Pirofski 2017; Belkaid and Harrison 2017). Although host susceptibility enables the occurrence of infection, significant variations are observed between individuals relating to the specific pathogen-generating asymptomatic infection in one individual versus lethal response in another, as well as in their response to antimicrobial treatment. Such variations are believed to be attributed to the host genetic components (Verhein et al. 2018; Hollox and Hoh 2014; Calcagno et al. 2017). 
With the emergence of the genetic study era, new tools including advanced sequencing techniques, DNA arrays, GWAS, and whole exome sequencing have enabled more studies to be performed in human populations to target the identification of the genetic architecture underlying host susceptibility or resistance, as well as to study the severity of the host response (Kenney et al. 2017; Tian et al. 2017; Abel et al. 2014; Manry and Quintana-Murci 2013). On the pathogen side, advanced molecular studies were performed targeting the pathogen genetic features determining strain, virulence, infectivity, pathways of pathogen invasion (receptor binding/inhibition), infection pathogenesis, pathogen susceptibility to antimicrobial treatments, and pathogen–host gene interactions (Lanza et al. 2018; Kucharski et al. 2016; Yang et al. 2008). Nonetheless, human genetic studies of complex diseases and traits encountered many constraints, mainly related to obtaining coherent and standardized environmental conditions with proper controls, due to conflicts and contradictions surrounding the ethical code for the protection of patients in human-subject studies (Barrow and Gossman 2017). To surmount these constraints, standard methodology in complex disease research adopts the utilization of recombinant inbred mouse strains and comparative and translational genomics approaches for mapping the genomic locations of the QTLs in linkage with the observed phenotypic variations between individuals (Crow 2007; Williams et al. 2001; Haendel et al. 2015). As expected, cross-species phenotyping, QTL mapping, and translational research introduced considerable advances in identification of the genetic architecture of complex traits but still required enhancement of the mapping resolution. 
Subsequently, a scientific panel on complex trait analysis proposed the development of the Collaborative Cross (CC), a unique mouse model that imitates the level of genetic diversity observed in humans, to enable high-resolution mapping of the genetic basis of human complex diseases (Threadgill et al. 2002; Churchill et al. 2004). Full-reciprocal inter-crosses of eight founder mouse strains, A/J, C57BL/6J, 129S1/SvImJ, NOD/LtJ, NZO/HiLtJ, CAST/Ei, PWK/PhJ, and WSB/EiJ, generated the CC mouse model recombinant inbred lines, which now comprise a set of ~ 200 available CC lines (Iraqi et al. 2012; Roberts et al. 2007). At first, the genotypes of the CC lines were obtained using the high-density mouse diversity array (MDA), containing ~ 620,000 single nucleotide polymorphism (SNP) marker sets that captures the genetic diversity of the CC lines’ founder strains, both classical and wild-derived (Yang et al. 2009). Subsequently, heterozygous SNPs, SNPs with missing genotypes within the eight CC founder strains, and SNPs with genotyping errors were eliminated, resulting in 170,935 SNPs, which were mapped onto build 37 of the mouse genome (Durrant et al. 2011). Advanced generations of the CC lines were re-genotyped using the mouse universal genotype array (MUGA) of ~ 7500 SNPs, and once again re-genotyped five generations later using the Mega-MUGA method of ~ 77,000 SNPs (Iraqi et al. 2012, 2008; Welsh et al. 2012). The particularly high diversity observed in the CC lines is attributed mainly to the inclusion of the three wild-derived strains (CAST/EiJ, PWK/PhJ, and WSB/EiJ) within the CC parental founders, representing three distant mouse progenitors: M.m. castaneus, M.m. musculus, and M.m. domesticus, respectively. Evidently, the wild-derived strains are more distant than are the classical strains from the C57BL/6J reference genome, where PWK and CAST vary at 17 million SNPs compared to four million SNP variations for the classical strains (Keane et al. 2011). 
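The SNP filtering step described above (dropping markers that are heterozygous or missing in any of the eight founders) can be illustrated with a minimal sketch. The genotype encoding (two-letter strings such as "AA", with "NA" for a missing call), the marker names, and the helper names `is_informative` and `filter_founder_snps` are assumptions made for this example only; they are not the data format or code used by Durrant et al. (2011). Detecting genotyping errors needs additional quality information and is not attempted here.

```python
# Founder strain names as listed in the text.
FOUNDERS = ["A/J", "C57BL/6J", "129S1/SvImJ", "NOD/LtJ",
            "NZO/HiLtJ", "CAST/Ei", "PWK/PhJ", "WSB/EiJ"]


def is_informative(calls):
    """Return True if every founder has a homozygous, non-missing call.

    `calls` maps founder name -> genotype string (hypothetical encoding).
    """
    for strain in FOUNDERS:
        call = calls.get(strain, "NA")
        if call == "NA" or len(call) != 2:
            return False  # missing genotype in at least one founder
        if call[0] != call[1]:
            return False  # heterozygous call in at least one founder
    return True


def filter_founder_snps(snps):
    """Keep only SNPs that pass the founder-level filter."""
    return {name: calls for name, calls in snps.items() if is_informative(calls)}


# Invented example markers, for illustration only.
example_snps = {
    "snp1": {s: "AA" for s in FOUNDERS},
    "snp2": {**{s: "GG" for s in FOUNDERS}, "CAST/Ei": "GT"},  # heterozygous
    "snp3": {**{s: "CC" for s in FOUNDERS}, "PWK/PhJ": "NA"},  # missing
}
kept = filter_founder_snps(example_snps)
print(sorted(kept))  # ['snp1']
```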
The contribution of the additional sequence variants, not segregating within the classical strains, was believed to enhance the discovery of novel and high-resolution associations of disease phenotypes with polymorphisms (Valdar et al. 2006). To date, extensive genetic studies in many research areas have been successfully accomplished using the CC mouse model, achieving novel QTL mapping with high resolution (~ 1 Mb), identification of candidate genes underlying the QTL, next-generation RNA-sequencing for gene expression variations, and estimation of founder effect size, merely by phenotyping a minimal number of CC lines (~ 50 CC lines), induced by certain environmental conditions, with no need for assessing the parental strains (Valdar et al. 2006; Kovacs et al. 2011; Durrant et al. 2011; Vered et al. 2014; Levy et al. 2015; Aylor et al. 2011; Philip et al. 2011; Kelada et al. 2011; Xiong et al. 2014; Ram et al. 2014; Gralinski et al. 2015; Rogala et al. 2014; Phillippi et al. 2014; Ferris et al. 2013; Thaisz et al. 2012; Bottomly et al. 2012; Mathes et al. 2011; Gelinas et al. 2011; Zombeck et al. 2011; Abu-Toamih Atamni et al. 2016a, b; Abu-Toamih Atamni et al. 2017; Nashef et al. 2017). Additionally, broad-sense heritability (H2) was assessed as the proportion of phenotypic variation attributable to differences between CC lines, using the formula H2 = Vg/(Vg + Ve), where Vg is the genetic (between-line) variance and Ve is the environmental (within-line) variance, together with the analysis of variance (ANOVA) test of each trait, as detailed in our previous publication (Iraqi et al. 2014). Hence, the CC mouse genetic resource population adheres to the paradigm “genotype once, phenotype many times,” where the genotypic data are available for use in multiple studies of complex trait diseases (Iraqi et al. 2012). Herein, we present the implementation of the powerful CC mouse model in our lab for dissecting the genetic basis of host susceptibility towards various infectious pathogens, including Aspergillus fumigatus (Durrant et al. 2011), Klebsiella pneumoniae (Vered et al. 2014), co-infection with Porphyromonas gingivalis and Fusobacterium nucleatum (Shusterman et al. 2013a, b; Nashef et al. 2018), Pseudomonas aeruginosa (Lorè et al. 2015), and host response towards microbial toxins (lipopolysaccharide (LPS) and lipoteichoic acid (LTA)) (Nashef et al. 2017) for studying sepsis.
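As an illustration of how broad-sense heritability can be estimated from a one-way ANOVA across CC lines, the sketch below uses the standard definition H2 = Vg/(Vg + Ve). The variance-component estimators used here (Vg = (MS_between − MS_within)/n0 and Ve = MS_within, with n0 the effective number of mice per line) are a common textbook choice and are an assumption of this example; the exact procedure of Iraqi et al. (2014) may differ. The function name and the example values are hypothetical.

```python
from statistics import mean


def broad_sense_heritability(groups):
    """Estimate H2 = Vg / (Vg + Ve) from a one-way ANOVA.

    `groups` maps a line name to a list of phenotype values for that line.
    Assumes at least two lines and more observations than lines.
    """
    k = len(groups)
    sizes = [len(v) for v in groups.values()]
    total = sum(sizes)
    grand_mean = sum(sum(v) for v in groups.values()) / total

    ss_between = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in groups.values())
    ss_within = sum((x - mean(v)) ** 2 for v in groups.values() for x in v)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (total - k)

    # Effective group size; equals the common n when the design is balanced.
    n0 = (total - sum(s * s for s in sizes) / total) / (k - 1)

    v_g = max((ms_between - ms_within) / n0, 0.0)  # truncate negative estimates
    v_e = ms_within
    return v_g / (v_g + v_e)


# Invented survival times (days) for three hypothetical lines.
example = {"line_A": [1, 3], "line_B": [5, 7], "line_C": [9, 11]}
h2 = broad_sense_heritability(example)
```

In this toy data set, MS_between is 32 and MS_within is 2, so Vg = 15, Ve = 2, and H2 = 15/17 (about 0.88). The sketch does not guard against the degenerate case in which both variance components are zero.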

Aspergillus fumigatus pathogen

Aspergillus fumigatus (A. fumigatus) is a ubiquitous human pathogenic and opportunistic fungus, leading to acute Aspergillosis in immunocompromised patients, which initially appears as primary pulmonary infection and evolves into severe systemic damage and high mortality rates (Latgé 1999; Soubani and Chandrasekar 2002; Ghazaei 2017). Patients who are immunocompromised as a result of HIV infection, neutropenia, oncological, or organ transplants conditions, are highly prone towards virulent A. fumigatus pulmonary Aspergillosis due to their inadequate innate immunity (Maschmeyer et al. 2007). However, host response towards specific pathogens, in this case variants of A. fumigatus, is a complex trait which varies between patients largely due to host genetic components contributing to susceptibility or resistance response and to severity of disease (van de Veerdonk et al. 2017; Nivoix et al. 2008; Li et al. 2016). These phenotypic variations between the hosts were demonstrated in a successful study in our lab by phenotyping 371 immune-competent mice from 66 CC lines post A. fumigatus infection challenge. Results of this study showed the mapping of eight QTLs, of which five were contributed mainly by the wild-derived strains. Host response was evaluated by survival time post-infection and varied significantly (p < 0.05) between the CC lines (Fig. 1), presenting a wide profile of responses ranging from 4 to 28 days of survival (Durrant et al. 2011). These results confirm the essential role of genetic variability among the CC lines and the extensive advantages of the CC genetic resource in providing high-resolution mapping of QTLs affecting a wide variety of traits, including susceptibility to a spectrum of infectious diseases, in naïve non-immunocompromised mice (Iraqi et al. 2014; Yang et al. 2009; Durrant et al. 2011; Vered et al. 2014; Nashef et al. 2018). To our knowledge, this is the first report of a murine study to assess host response to A. 
fumigatus and to enable successful mapping of three QTLs in naïve non-immunocompromised mice. In fact, since founder effects were contributed mainly by the wild-derived strains, it might not have been possible to dissect such a complex disease without these founders. These findings and conclusions were also confirmed in recently published studies using the CC mouse model for different complex traits, including host response to multiple infectious diseases such as West Nile virus (Green et al. 2017), Influenza A viruses (Elbahesh and Schughart 2016), Influenza H3N2 (Leist et al. 2016), Klebsiella pneumoniae (Vered et al. 2014), Ebola hemorrhagic fever (Rasmussen et al. 2014), and Aspergillus fumigatus (Durrant et al. 2011).

Fig. 1

Mean survival time (days) of different CC lines in response to Aspergillus fumigatus infection (Durrant et al. 2011). The X-axis represents the different CC lines while the Y-axis represents the mean survival time in days (± SEM). Full details of the analysis are presented in Durrant et al. (2011)
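The per-line summary plotted in Fig. 1 (mean survival time ± SEM) can be computed as in the following minimal sketch, where SEM is taken as the sample standard deviation divided by the square root of the number of mice. The function name and the survival values are invented for illustration and are not data from Durrant et al. (2011).

```python
from math import sqrt
from statistics import mean, stdev


def mean_and_sem(values):
    """Return (mean, standard error of the mean) for one CC line."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two observations are needed for an SEM")
    return mean(values), stdev(values) / sqrt(n)


# Hypothetical survival times (days) for one line.
line_mean, line_sem = mean_and_sem([4, 6, 8])
```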

Klebsiella pneumoniae pathogen

Klebsiella pneumoniae (K. pneumoniae) is a gram-negative bacterium and a common multi-drug-resistant opportunistic pathogen that is associated with nosocomial (hospital-acquired) infections including pneumonia, urinary tract infections, gastroenteritis, soft tissue infections, and sepsis (Jarvis et al. 1985; Gerding et al. 1979; Podschun and Ullmann 1998). Moreover, classical K. pneumoniae is a major cause of ventilator-associated pneumonia with high morbidity and mortality rates, particularly in intensive care units (ICUs) (Patil and Patil 2017). Notwithstanding the strong association of K. pneumoniae with nosocomial infections, emerging hypervirulent strains of K. pneumoniae (hvKPs), such as the K1/K2 serotypes, have the capability to cause severe and life-threatening community-acquired infections in healthy young patients (Shon et al. 2013; Lee et al. 2017). Several factors determine K. pneumoniae pathogenicity and virulence levels, of which LPS and capsular polysaccharide (CPS) play highly significant roles in bacterial resistance to host immune response (phagocytosis) and drug resistance (Lee et al. 2017; Podschun and Ullmann 1998). Furthermore, host genetic background plays a significant role in determining infection effectiveness and pathogenicity, so that both host and pathogen contribute to host–pathogen interactions. Accordingly, we have assessed the host genetic basis for susceptibility or resistance to infectious K. pneumoniae (K2 serotype) using 328 CC mice generated from 73 CC lines challenged with an intraperitoneal (IP) infection dose of 10^4 CFU/ml (colony forming units per ml) (Vered et al. 2014). The recorded phenotypes for post-infection response were mouse survival time (days) and body weight changes (g), monitored daily for 15 days post-infection. Study findings revealed significant variations in survival time (Fig. 2a) between CC lines (p < 0.05), but not for body weight changes (data not shown). The observed survival profiles between susceptible CC lines (≤ 2 days survival) and resistant CC lines that survived 15 days post-infection demonstrated the high genetic diversity of the CC lines, which contained a variety of responses. Interestingly, day 7 post-infection appears to be a critical time-window in infection pathogenesis; a host that survived day 7 eventually survived until day 15 and completed the challenge (resistant). Moreover, broad-sense heritability of the survival time trait was high, reaching the value of 0.45, which emphasizes the major role of the host hereditary factors in determining host response towards the K2 serotype of K. pneumoniae. Using the phenotypic and genotypic data of 48 CC lines, we have mapped three significant time-specific QTLs in linkage with survival time (days) of the host following K. pneumoniae infection (Fig. 2b). The three QTLs were named Klebsiella pneumoniae-resistant locus 1, 2, and 3 (Kprl1, Kprl2, and Kprl3), and are located on Chromosomes 4, 8, and 18, respectively. These QTLs were each mapped for a specific time of survival; the Kprl1 QTL mapped to survival time until day 2 post-infection (susceptible), while the Kprl2 and Kprl3 QTLs mapped to day 8, suggesting a time-dynamic genetic host response that activates different genes during the different stages of disease progression. Founder effects estimation determined that the observed variations between the CC lines post-infection were contributed mainly by the wild-derived founder strains. Furthermore, examination of the mapped genomic intervals using the mouse genome database (http://www.informatics.jax.org) revealed several candidate genes that were highly associated with host susceptibility or resistance towards K. pneumoniae. The suggested candidate genes also included well-known protein-coding genes that play significant roles in pathways of pathogen invasion to the host body. 
These include the catenin alpha-like 1 (Ctnnal1) gene of the Kprl1 QTL, which plays central roles in cell adhesion, as well as the actin-like 7a and 7b (Actl7a and Actl7b) candidate genes of the Kprl1 QTL, which are involved in phagocytosis and in eliminating bacterial infection. The mapped QTLs in linkage with the host response phenotype towards K. pneumoniae infection are independent of the A. fumigatus mapped QTLs, suggesting distinct genetic networks in the host in association with specific pathogens, which should be further investigated to identify advanced levels of gene expression variation between controls (naïve) and post-infection individuals. To our knowledge, this is the first report of a murine study to assess host response towards the K. pneumoniae pathogen, enabling successful mapping of three QTLs in naïve, non-immunocompromised mice. This achievement is believed to be due mainly to the presence of the three wild-derived strains (CAST/Ei (M.m. castaneus), PWK/PhJ (M.m. musculus), and WSB/EiJ (M.m. domesticus)) within the eight CC founder strains, which, owing to their genetic distinction from classical laboratory strains, enrich the genetic variation among the resulting mouse strains.

Fig. 2

Genetic dissection of host response to Klebsiella pneumoniae infection. a Shows the phenotypic profile of mean survival time (days) of the different CC lines and four inbred strains after infection with Klebsiella pneumoniae. The X-axis represents the CC lines, while the Y-axis represents the mean survival time in days (± SEM). b and c present the QTL mapping results using the phenotypic and genotypic data of the CC lines. In total, results revealed three significant QTLs associated with survival time after infection with K. pneumoniae: one QTL on day 2 on chromosome 4 named Kprl1 (b) and two QTLs on day 8 named Kprl2 and Kprl3 on chromosomes 8 and 18, respectively (c). The X-axis shows the 19 autosomes plus the X chromosome, while the Y-axis shows the log p value of the linkage analysis. (Vered et al. 2014)

Pseudomonas aeruginosa pathogen

Pseudomonas aeruginosa (P. aeruginosa) is a gram-negative, multi-drug-resistant (MDR), opportunistic and nosocomial pathogen widespread in hospital ICUs and healthcare systems worldwide (10–15% of cases). It is ranked second in the WHO list of critical-priority drug-resistant bacteria on which to focus for the development of new therapeutic strategies (Tacconelli et al. 2017; Majumdar and Padiglione 2012; Vincent 2003). P. aeruginosa infection leads to high morbidity and mortality rates particularly among immunocompromised patients, post-invasive medical procedure patients, and cystic fibrosis (CF) patients (Wieland et al. 2018; Stefani et al. 2017; Winstanley et al. 2016). P. aeruginosa incidence in ICUs constitutes a global major threat due to higher risk for severe ventilator-associated pneumonia and sepsis associated with multi-systemic damage and high mortality (Gellatly and Hancock 2013). However, significant phenotypic variations were reported between individuals’ morbidity and mortality rates in response to P. aeruginosa infection, indicating the important role of host genetic variants in determining the outcome of the host–pathogen interplay. Consequently, various pathogen-oriented genetic studies were performed, targeting the identification of P. aeruginosa genetic features associated with host response phenotypic variations in disease progress and pathogenesis (Ramanathan et al. 2017; Bianconi et al. 2011; Cigana et al. 2009; Bragonzi et al. 2009; Nguyen and Singh 2006). On the other hand, host-oriented genetic studies focused on dissection of the host genetic architecture contributing to the observed phenotypic variations in host response to certain sub-types of P. aeruginosa, assessing both human populations and animal models (Wang et al. 2017; Di Paola et al. 2017; Alhazmi et al. 2018; Weiler and Drumm 2013). Additionally, various studies using different mouse strains infected with certain sub-types of P. 
aeruginosa demonstrated highly significant host phenotypic variations post-infection between the different strains, providing clear support for the role of host genetics in infection pathogenesis (Bragonzi 2010; De Simone et al. 2014). A recent study focusing on mapping host genetic features in linkage with susceptibility or resistance to P. aeruginosa was performed using the Collaborative Cross (CC) mouse population. In this study, 17 CC lines (92 mice: 50 males, 42 females) and a susceptible control group of A/J mice were challenged with intratracheal injection of P. aeruginosa, and were subsequently monitored for 7 days post-infection (Lorè et al. 2015). Host phenotypic responses were measured in terms of changes in survival time (days) and body weight (g) throughout the 7-day post-infection observation period. Survival time varied significantly between CC lines, showing a wide range of survival profiles (Fig. 3a), from highly susceptible CC lines with a lethal response (1.5 days survival) to highly resistant CC lines (7 days survival). Moreover, similar phenotypic variations between the CC lines were recorded for body weight changes post-infection, showing CC lines with severe body weight loss versus CC lines with body weight recovery 5 days post-infection (Fig. 3b). Heritability (H2) estimates for the measured traits were 0.54 for survival time and 0.28 for body weight changes (Table 1). A similar study for mapping the genetic components underlying the host response to P. aeruginosa was previously performed by De Simone et al. (2016), using an F2 intercross population generated by mating A/J (as susceptible) with C3H/HeOuJ (as resistant). This study mapped a significant locus associated with susceptibility towards P. aeruginosa on chromosome 6, designated as Pairl1 for P. aeruginosa infection-resistance locus 1, positioned at genomic location 90.8 Mbp with a genomic interval of 20.7 Mbp (81.5–102.2 Mbp). 
The genomic interval of Pairl1 is relatively large compared to the much narrower intervals (< 3 Mbp) achieved in different studies using the CC mouse model, although not with this specific pathogen. QTL analysis of the current study is ongoing and is expected to be published soon. These findings add evidence for the strong role of host genetic features in determining infection pathogenesis.

Fig. 3

Response to P. aeruginosa airway infection of CC lines (Lorè et al. 2015). a Shows the mean survival time in days (± SEM). b Shows percentage changes in body weight following the infection. The X-axis represents the CC lines, while the Y-axis represents the mean survival time in days (± SEM) (a) and % change in body weight (b).

Table 1 Estimated broad-sense heritability (H2) of a variety of phenotypic host responses to different infectious pathogens assessed using the CC mouse model

Bacterial Sepsis

As stated by the Sepsis-3 task force, the updated definition of sepsis is a life-threatening organ dysfunction caused by a dysregulated host response to infection (Singer et al. 2016). Sepsis is a complex nosocomial disease caused by the host’s immune response triggered by an infectious pathogen, most likely gram-negative bacteria (bacterial sepsis), leading to systemic health complications and high mortality, particularly in elderly patients, preterm infants, or immunocompromised patients (Levy et al. 2003; Munford 2006; Collins et al. 2018). Notwithstanding the ongoing advances in therapeutic strategies and medical technologies, worldwide sepsis-related morbidity and mortality are constantly increasing, and sepsis is estimated to be one of the major leading causes of death in ICUs worldwide, to an even greater extent in low- and middle-income countries (Friedman et al. 1998; Fleischmann et al. 2016; Reinhart et al. 2017). Pathogenesis of sepsis is determined by complex host–pathogen interplay, which is known to be controlled by genetic features of both the pathogen and the host, together generating the multifaceted characteristic features of sepsis in different patients (Goh et al. 2017; Christaki and Giamarellos-Bourboulis 2014; Davenport et al. 2016). Bacterial sepsis is principally caused by a severe systemic host immune response towards the virulence factors of the infectious pathogen, such as the microbial toxins LPS featured in gram-negative bacteria and LTA in gram-positive bacteria (Chandler and Ernst 2017; van der Poll and Opal 2008; Mattsson et al. 1993). Moreover, host septic responses towards bacterial virulence factors, including LTA and LPS, vary between individuals in prevalence, severity, progress, and outcome, indicating the major role of host genetic features associated with bacterial sepsis pathogenesis. 
Furthermore, studies of sex differences in sepsis are contradictory due to the complexity of the disease and the involvement of multiple genetic factors of the host and pathogen; several studies suggest no difference, while others suggest higher risk in females, or higher risk in males, which will require further investigation (Failla and Connelly 2017). Mapping of the host genetic susceptibility components will lead to advances in the direction of personalized medicine for prevention and medical cure of sepsis to halt the increase in sepsis morbidity and mortality. Two ongoing studies in our lab are aimed at mapping the host genetic response towards LPS and LTA, separately, using the CC lines challenged by LPS or LTA injections. The LPS study consists of 296 mice generated from 16 CC lines, injected IP with LPS (15 mg/kg body weight; L2630, Sigma) from Escherichia coli (E. coli O111:B4) dissolved in PBS and monitored for 72 h post-injection to assess the host phenotypic response by measuring survival time (h) and changes in body temperature (°C) and weight (g) (unpublished data). Body weight was measured using a digital balance and rectal body temperature using an EcoScan temp4 thermometer with a special probe designed for small animals, at different time points post-injection (0, 2, 4, 6, 24, 30, 48, 54, 72 h). The CC lines varied significantly in their phenotypic responses at all levels, including survival time (Fig. 4a), body weight (Fig. 4d), and temperature changes (Fig. 4c). Interestingly, sex differences within the CC lines in their host response were significant (p < 0.05) and varied between the CC lines, showing for several CC lines that females are much more resistant towards LPS than males of the same CC line (Fig. 4b). Altogether, these data support the accumulated evidence for the complexity of sepsis, despite the controlled environmental conditions and the use of a specific microbial toxin. 
At the time of writing this report, further CC lines are under assessment for LPS and LTA in order to increase the study population size which should enable QTL mapping of sepsis-related genomic features.

Fig. 4

Septic response towards infection of CC lines with the microbial toxic component Lipopolysaccharide (LPS). a Shows CC lines mean (± SEM) survival time (Hr) profile during the 72 h after the infection, b shows the data split by sex in each CC line due to the presence of a significant sex effect. The white columns represent females and the blue columns represent males. The X-axis represents the CC lines, while the Y-axis represents the mean survival time in hours (± SEM). c and d show multiple patterns of temperature (c) and body weight (d) changes during the 72 h post-infection. Changes in temperature and body weight were calculated as a percentage of the initial value. The X-axis represents time after LPS injection (Hr), the Y-axis represents percentage of the initial body temperature (c)/weight (d) (Nashef et al. 2017)
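The normalization used for Fig. 4c and d (each measurement expressed as a percentage of the initial value) can be sketched as below. The time points are those listed in the text; the function name and the example weights are hypothetical and serve only to show the arithmetic.

```python
# Measurement time points after LPS injection (hours), as listed in the text.
TIME_POINTS_H = [0, 2, 4, 6, 24, 30, 48, 54, 72]


def percent_of_initial(measurements):
    """Express each measurement as a percentage of the first (time 0) value."""
    baseline = measurements[0]
    if baseline == 0:
        raise ValueError("baseline measurement must be non-zero")
    return [100.0 * m / baseline for m in measurements]


# Invented body weights (g) for one mouse at the first three time points.
weights_pct = percent_of_initial([20.0, 19.0, 18.0])
print(weights_pct)  # [100.0, 95.0, 90.0]
```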

Insights into human periodontal disease using the CC population

Periodontitis (dental infection) is a polygenic inflammatory disease that compromises the integrity of the tooth-supporting tissues with an adverse impact on systemic health and a complex etiology at multiple levels (Hajishengallis 2014). The transition from periodontal health state to disease (periodontitis) is associated with a dramatic shift from a symbiotic to dysbiotic microbial community in the oral cavity. However, although independent microbial dysbiosis may not necessarily precipitate periodontitis, it could initiate the disease in conjunction with the co-occurrence of other risk factors such as host genotype, diet, and environmental factors such as smoking (Hajishengallis 2014; Hajishengallis et al. 2012a, b). Within the mentioned risk factors, host genetic background is a major determinant for host susceptibility to periodontitis, inasmuch as alterations in gene expression levels or regulation contribute to the disease development, progress, and severity.

Various GWAS studies in humans were previously carried out to identify biomarkers for periodontal disease susceptibility at the host genotypic level (Divaris et al. 2012; Shimizu et al. 2015; Hashim et al. 2015; Schaefer et al. 2010; Teumer et al. 2013; Ernst et al. 2010). However, the lack of proper control and standardization in human studies, along with the high heterogeneity of periodontitis itself, constitutes a major constraint in the mapping of the genetic basis for periodontitis (Vaithilingam et al. 2014). Hence, we recently assessed the power of the CC population in conjunction with a genetic analysis of the human orthologous chromosomal regions to dissect the genetic basis of periodontitis. Our study was performed by integrating QTLs associated with experimental periodontitis in the CC population with imputed human genotype data from two large case–control clinical sub-types of periodontitis: aggressive periodontitis (AgP) and chronic periodontitis (CP) (Nashef et al. 2018). The initial stage of the study assessed 25 CC lines (286 mice) for host response towards a mixed culture of periodontal pathogens (Porphyromonas gingivalis and Fusobacterium nucleatum) based on the well-established mixed infection model (Baker et al. 1994; Polak et al. 2009). Phenotypic host response was estimated by measuring the alveolar bone loss phenotype, as a major hallmark of periodontitis development and progress, using the Compact fan-Beam-type Computerized Tomography system (µCT) (Wilensky et al. 2005). The CC lines showed significant and wide variations in their phenotypic response to the mixed infection challenge, ranging between moderate and high levels of bone loss for the different CC lines, as well as post-infection bone formation for some CC lines, similar to condensing osteitis (Fig. 5). Subsequently, linkage analysis was conducted for QTLs associated with percentage alveolar bone loss using the phenotypic and genotypic data of the different CC lines. 
The QTL analysis revealed two significant QTLs (Table 2) in linkage with percentage alveolar bone loss, located on chromosome 1: 180–181.5 Mbp (1.5 Mb) and chromosome 14: 93.5–96.5 Mbp (3 Mb) and designated as Perio3 (Periodontitis) and Perio4 in continuation of the previously reported QTLs (Shusterman et al. 2013b). In addition, eight suggestive QTLs were mapped and designated as Perio3 to Perio10 (Table 3). Interestingly, the Perio3 QTL overlaps with the previously mapped Perio3 QTL in a similar study by our group using the progenies of the A/J × BALB/cJ intercross F2 population (Shusterman et al. 2013b). Using 408 F2 mice, Shusterman et al. (2013b) mapped two significant QTLs in linkage with periodontitis located on chromosomes 5 (Perio1 QTL) and 3 (Perio2 QTL), and a suggestive QTL located on chromosome 1 (Perio3 QTL) with log p = 2.47, which exceeds the 50% genome-wide significance threshold (log p = 2.3). As expected, performing the study using the CC mouse model enhanced the significance level of Perio3 to make it highly significant (at 95% genome-wide significance) with higher resolution (Fig. 6a). Moreover, due to the contribution of the wild-derived strains among the CC founder strains, a new QTL was mapped when using this model (Perio4) (Fig. 6a). A mouse genome database (http://www.informatics.jax.org) search for candidate genes located within the genomic intervals of the Perio3 and Perio4 QTLs revealed a total of 80 candidate genes. Next, we used the merge analysis approach (Durrant et al. 2011) to identify variants that give rise to the significant QTLs Perio3 and Perio4. The merge analysis revealed six causal-effect variants and two linked-effect variants at both Perio3 and Perio4. Estimation of founder haplotype effects showed that the most significant loci on chr1 and chr14 were less affected by WSB/EiJ (wild-derived strain) than by the rest of the parental strains, which seem to have quite similar effects on the trait (Fig. 6b). 
Subsequently, candidate genes identified by merge analysis (Fig. 6c, d), significant QTLs, and suggestive QTLs were tested for their association with periodontal diseases in the GWAS-Catalog (Welter et al. 2014), as well as in the available data of the human case–control samples. Briefly, orthologous human chromosomal regions were analyzed using available imputed genotype data (OmniExpress BeadChip arrays) derived from case–control samples of aggressive periodontitis (AgP; 896 cases, 7104 controls) and chronic periodontitis (CP; 2746 cases, 1864 controls) of northwest European and European American descent, respectively.

Fig. 5

a The means of the alveolar bone volumes (+ SEM) of 25 different CC lines. The X-axis represents the different CC lines while the Y-axis represents the mean of the alveolar bone volume in mm³ evaluated by micro-CT. The black bars represent the mean of the control alveolar bone volume (CBV) while the gray bars represent the mean of residual bone volume after mixed infection (RBV). Asterisks represent the significant differences between the two groups (p value < 0.05). b Percent alveolar bone volume loss (PBL) due to the mixed infection (+ SEM) of 25 different CC lines. The X-axis represents the different CC lines while the Y-axis represents the mean percent of alveolar bone volume loss among the different lines. (Nashef et al. 2018)

Table 2 The locations of the mouse QTLs associated with host susceptibility to periodontitis, their genomic intervals, and their corresponding human orthologues are listed in the table, according to significance threshold of QTL. (Nashef et al. 2018)
Table 3 Candidate genes that were revealed through integration of mouse QTL associated with periodontitis and human GWAS
Fig. 6

Reproduced with permission from (Nashef et al. 2018)

a Genome scans of susceptibility to alveolar bone volume loss percentage in 25 different CC lines. The X-axis represents genome location; the Y-axis represents the minus log p value of the test of association between locus and percentage of bone loss. Two QTLs associated with percentage of bone loss after infection mapped on chromosome 1 with genome coordinate (181.5–182.5 Mbp) and chromosome 14 with genome coordinate (93.5–96.5 Mbp). b Estimated founder haplotype effects at the Perio3 and Perio4 QTLs for alveolar bone loss after mixed infection with P. gingivalis and F. nucleatum. Effects are shown as deviations relative to WSB/EiJ, which is arbitrarily assigned the trait effect of 0. The X-axis of each plot shows the founder strains; the Y-axis shows the estimated haplotype effects of the CC founders. c Merge analysis of sequence variants at Perio3 and d Perio4. The X-axis is genome location; the Y-axis is the minus log p value of the test of association between locus and alveolar bone loss phenotype. The continuous black lines are sections of the genome scans in (a), while gray dots are the results of ANOVA tests of sequence variants.

Three of the seven candidate genes selected based on merge analysis, and 14 of the 31 human orthologous genes of the significant QTLs, had previously shown gene-centric associations with periodontal sub-phenotypes (Rhodin et al. 2014). In addition, we found that one of the suggested orthologous genes, CCDC121, within the Perio3 QTL, is located 5 kb upstream of a previously reported risk variant of chronic periodontitis (p value = 8.0 × 10⁻⁶, OR = 3.46) (Teumer et al. 2013). However, we could not replicate these associations in our available AgP and CP samples. Further analysis was performed using the human genes corresponding to an additional 1229 mouse genes located within the genomic intervals of the suggestive QTLs. Data analysis revealed six candidate genes (NRG3, ZNF579, FIZ1, ZNF524, PARK2, and PACRG) showing nominally significant associations with either AgP or CP. Regional association plots of three genes (NRG3, PARK2, and PACRG) that showed associations with both AgP and CP in our human data are shown in Fig. 7. In total, our study revealed seven candidate genes based on the integration of mouse QTLs and human GWAS (Table 3).

Fig. 7

Reproduced with permission from (Nashef et al. 2018)

Association plots of imputed genotypes at the NRG3, PARK2, and PACRG loci. Regional association plots of imputed genotypes for aggressive periodontitis (a), severe chronic periodontitis (b), and moderate chronic periodontitis (c) for the chromosomal region spanning NRG3. Regional association plots of imputed genotypes for aggressive periodontitis (d), severe chronic periodontitis (e), and moderate chronic periodontitis (f) for the chromosomal region spanning PARK2 and PACRG. The minus log p values of the analyzed SNPs were plotted as a function of the genomic SNP position. SNP annotation was provided by the LocusZoom databases.

Overall, our findings confirm that the CC mouse population is a powerful resource for mapping susceptibility to alveolar bone loss, even with as few as 25 CC lines. The high genetic diversity of the CC mouse model enabled successful mapping of two significant QTLs to particularly narrow regions of ~1.5 to ~3 Mb, compared with the ~35 Mb intervals obtained previously using an F2 approach (Shusterman et al. 2013b). Furthermore, our observation of several candidate genes (from the suggestive QTLs) that were replicated in AgP, moderate CP, and severe periodontitis emphasizes the potential of the CC mouse model for dissecting the genetic basis of human complex diseases. Phenotype–genotype and gene expression data from larger human and CC mouse cohorts will be required for enhanced identification of true positive associations related to the complex etiology of periodontitis.

Conclusions

Herein, we have described a series of genetic studies from our lab aimed at identifying the genetic basis of host susceptibility or resistance towards various infectious pathogens using the Collaborative Cross (CC) mouse population. The CC is a next-generation mouse genetic reference population designed for the genetic study of complex traits relevant to human disease and livestock agriculture, identification of candidate genes associated with host phenotypic variation, and characterization of the gene networks involved in disease pathogenesis. Our studies emphasize the leading role of host genetic background in determining infection potency, severity, and pathogenesis. Given the ability to map QTLs associated with any given trait at high resolution in the CC population, it will now be possible to identify the genes within the mapped loci that are responsible for phenotypes of interest by using a specific gene knockout approach. Furthermore, it will now be possible to translate CC mouse results to humans by using human GWAS analysis to identify human genes orthologous to those mapped in the CC population, a successful approach described by Nashef et al. (2018).