Background

In ruminant livestock and some wildlife species, Mycobacterium avium ssp. paratuberculosis (MAP) causes Johne's disease, a chronic inflammatory bowel disorder (IBD) that parallels human Crohn's disease in many respects. Since MAP is a slow-growing intracellular pathogen, infected cattle typically remain asymptomatic for 2 to 10 years, making it difficult to control Johne's disease in dairy herds [1]. During this asymptomatic period, the pathogen can be horizontally transmitted to other herd members via contaminated feces, and vertically transmitted to calves via contaminated milk and colostrum [1].

Although this remains debated, the presence of MAP in milk poses a potential zoonotic risk to humans [2]. This may be particularly relevant for individuals who are genetically predisposed to IBD, since MAP has been implicated as one of several potential pathogens associated with Crohn's disease [3]. For example, a meta-analysis of studies examining the presence of MAP in patients with Crohn's disease or ulcerative colitis showed a greater likelihood of detecting MAP in diseased than in healthy individuals [4]. Additionally, clinical studies have shown that anti-mycobacterial treatment of some patients with Crohn's disease can lead to pathological remission [5].

Variability in the susceptibility of cattle to MAP infection is evident. For example, in a typical commercial dairy herd with a consistent prevalence of MAP infection, it is common to find animals that appear resistant to infection, even after several years of exposure. Additionally, there is evidence that susceptibility to MAP infection and the development of clinical symptoms associated with Johne's disease are heritable; heritability estimates in dairy cattle range from 0.010 to 0.183, depending on the criteria used to diagnose MAP infection or Johne's disease [6-8]. Given this, it may be possible to use selective breeding strategies to enhance resistance to MAP infection, thereby reducing the incidence of Johne's disease in dairy cattle and the risk of human exposure to MAP.

Since resistance to MAP infection is likely polygenic in nature, it is essential that multiple genes be investigated for their contribution to disease resistance. Therefore, the focus of this study was to identify single nucleotide polymorphisms (SNPs) in several immune-related genes and investigate their association with MAP infection status in dairy cattle. Interleukin-10 (IL10) and its receptor (subunits IL10RA and IL10RB), transforming growth factor beta 1 (TGFB1) and two of its receptors (TGFBR1 and TGFBR2), and natural resistance-associated macrophage protein 1 (SLC11A1) were investigated in this study based on their previous associations with various types of human IBD [9-12]. Interleukin-10 and TGFB1 collectively act to control the host inflammatory response to microbial antigens; IL10 primarily operates as a feedback inhibitor of T cell responses, and TGFB1's major function is to maintain T cell tolerance to self and commensal antigens by influencing the differentiation and homeostasis of effector and regulatory T cells [13]. Natural resistance-associated macrophage protein 1, also known as solute carrier family 11 member 1, is an iron transporter that exhibits pleiotropic effects on the early innate macrophage response to intracellular bacteria [14]. Of the 13 SNPs identified, four in IL10RA (984G > A, 1098C > T, 1269T > C, and 1302A > G) were tightly linked, and showed a strong additive and dominance relationship with MAP infection status.

Methods

Cohort population

Six commercial Holstein operations in Southwestern and Eastern Ontario were selected for sample collection based on a previous history of a high prevalence of MAP infection. Blood was collected between July and September 2007 via the coccygeal (tail) vein from dry and lactating cows that varied in age, breed, stage of lactation, infection status, and history of MAP screening. The protocol for collection was approved by the University of Guelph animal care committee. Current infection status was determined by identifying the presence of MAP-specific plasma antibodies using the commercially available HerdChek M. pt. Antibody ELISA Test Kit (IDEXX Laboratories, Westbrook, ME, USA) according to the manufacturer's instructions. A total of 380 cows were sampled, on average 63 ± 31 from each herd. Of these, at least 4% from each herd tested positive (S/P > 0.25) or suspect (0.10 < S/P < 0.25) for MAP infection as diagnosed by serum ELISA. Suspect animals were not included in either cohort. Infection-free animals making up the healthy (negative) control cohort (n = 242) included animals that were older than 4.5 years of age and had tested negative for MAP infection in previous years (n = 197), and those that were older than 5.5 years of age without previous screening (n = 45). The mean age of this cohort was 6.4 years (range, 4.5 to 12.7 years). The infected (positive) cohort (n = 204) was made up of animals that were considered to be infected based on the presence of MAP-specific plasma antibodies (n = 16), and a second group of animals considered to be infected based on milk MAP-specific antibody screening carried out by Canwest DHI (Guelph, ON, CAN) (n = 188); these milk samples were generously provided between July 2006 and November 2007, and due to client anonymity, information such as age, pedigree and herd location was not available.
Genomic DNA was extracted from the buffy coat of blood samples using the DNeasy blood and tissue kit (Qiagen, Santa Clara, CA, USA), and from milk according to methods previously described [15].
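The ELISA cut-offs described above can be expressed as a small classification rule. The following Python sketch is illustrative only: the function name is hypothetical, and the handling of values exactly equal to a cut-off, as well as the labelling of ratios at or below 0.10 as negative, are assumptions that the text does not spell out.

```python
def classify_sp(sp_ratio, positive_cutoff=0.25, suspect_cutoff=0.10):
    """Classify a sample-to-positive (S/P) ELISA ratio.

    Cut-offs follow the text: positive if S/P > 0.25, suspect if
    0.10 < S/P < 0.25. Assumptions (not stated in the text): a ratio
    exactly equal to 0.25 is treated as suspect, and any ratio at or
    below 0.10 is treated as negative.
    """
    if sp_ratio > positive_cutoff:
        return "positive"
    if sp_ratio > suspect_cutoff:
        return "suspect"
    return "negative"


def drop_suspects(sp_by_cow):
    """Return {cow_id: status}, excluding suspect animals from both cohorts."""
    statuses = {cow: classify_sp(sp) for cow, sp in sp_by_cow.items()}
    return {cow: s for cow, s in statuses.items() if s != "suspect"}


# Hypothetical S/P ratios for three cows (not data from the study).
example = {"cow_1": 0.31, "cow_2": 0.17, "cow_3": 0.04}
cohorts = drop_suspects(example)
```

In this hypothetical example, `cow_2` falls in the suspect range and is therefore left out of both cohorts.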

SNP discovery

All SNPs were identified by sequencing PCR amplicons from each candidate gene using a DNA pool constructed with DNA from 40 Holstein bulls according to methods described in previous studies [16, 17]. Briefly, for each bull, genomic DNA was extracted from semen and adjusted to a concentration of 5 ng/μl after several rounds of quantification using the Quant-iT PicoGreen dsDNA reagent (Invitrogen, Carlsbad, CA, USA) followed by dilution. The resultant DNA pool was amplified using the Repli-g Ultrafast mini kit (Qiagen, Santa Clara, CA, USA), and was then used as a template for PCR amplification of the 5' region and coding exons of each candidate gene. Primers were designed using Primer3 [18]. PCR amplicons were sequenced in both 5' and 3' orientation using an ABI Prism 3730 DNA sequencer (Applied Biosystems, Foster City, CA, USA), and SNPs were identified by visual inspection of the electropherograms. Seven genes were selected for SNP discovery: IL10 [GeneId#, 281246], IL10RA [513478] and IL10RB [767864], TGFB1 [282089], TGFBR1 [282382] and TGFBR2 [535376], and SLC11A1 [282470]. Sequences were compared against GLEAN models using the Apollo Genome Annotation and Curation Tool (Version 1.6.5) to confirm correct gene structure [19]. In the event of a disagreement between respective GLEAN and NCBI gene models, as was the case for IL10RA, the GLEAN model was chosen.

Genotyping and haplotype reconstruction

SNP genotyping was conducted using the iPLEX MassARRAY system (Sequenom Inc., San Diego, CA, USA). Two of the thirteen SNPs, IL10 -30A > C and SLC11A1 650C > T, were not genotyped using this assay due to failed primer design or inadequate quality of results (Table 1). Two groups of SNPs, IL10RA 984G > A, 1098C > T, 1269T > C and 1302A > G, and IL10RB 503C > T and 569A > G, appeared to be tightly linked due to nearly matching genotype records (Pearson's r ≥ 0.99); thus, all but one SNP from each group were removed from the analysis. For haplotype analysis, only the SNPs in IL10RA were included since no other genes contained multiple informative SNPs or resided on the same chromosome. The haplotypes were reconstructed in both cohorts using PHASE (version 2.1) [20].
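The screening for nearly matching genotype records can be sketched as a pairwise Pearson correlation between genotype vectors. This is a minimal illustration under stated assumptions, not the authors' procedure: genotypes are assumed to be coded as allele counts (0, 1, 2) with no missing calls, the absolute value of r is used because the choice of counted allele is arbitrary, and the SNP names and genotype values below are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a monomorphic SNP has no defined correlation")
    return sxy / sqrt(sxx * syy)


def tightly_linked_pairs(genotypes, threshold=0.99):
    """Return SNP pairs whose genotype records nearly match (|r| >= threshold)."""
    names = sorted(genotypes)
    pairs = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if abs(pearson_r(genotypes[first], genotypes[second])) >= threshold:
                pairs.append((first, second))
    return pairs


# Hypothetical genotype records for six cows, coded as allele counts.
records = {
    "snp_a": [0, 1, 2, 1, 0, 2],
    "snp_b": [2, 1, 0, 1, 2, 0],  # same information as snp_a, other allele counted
    "snp_c": [0, 0, 1, 2, 1, 2],
}
linked = tightly_linked_pairs(records)
```

Only one SNP from each flagged pair would then be carried forward, since the others add no independent information to the regression.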

Table 1 Characteristics of SNPs discovered in IL10, IL10RA/B, TGFB1, and SLC11A1.

Statistical analysis

Tests for significance of pair-wise linkage disequilibrium were performed as described in Krawetz and Womble [21]:

where: η = number of cows genotyped; ρAB = frequency of haplotype AB; ρA, ρa = frequency of alleles A and a, respectively; ρB, ρb = frequency of alleles B and b, respectively.
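A chi-square test built from exactly the symbols defined above can be sketched as follows. The form used here, χ² = η·D² / (ρA ρa ρB ρb) with D = ρAB − ρA ρB and one degree of freedom, is a common textbook statistic and is assumed for illustration; it is not confirmed to be the exact expression used in [21], and some formulations multiply by the number of chromosomes (2η) rather than the number of animals. The function name and example values are hypothetical.

```python
from math import erfc, sqrt


def ld_chi_square(n_cows, p_ab, p_a, p_b):
    """Pairwise LD statistics for two biallelic loci.

    n_cows : number of cows genotyped (eta)
    p_ab   : frequency of haplotype AB (rho_AB)
    p_a    : frequency of allele A (rho_A); allele a has frequency 1 - p_a
    p_b    : frequency of allele B (rho_B); allele b has frequency 1 - p_b

    Returns (D, r2, chi2, p_value). The multiplier n_cows is an assumption.
    """
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom <= 0:
        raise ValueError("both loci must be polymorphic")
    d = p_ab - p_a * p_b          # disequilibrium coefficient D
    r2 = d * d / denom            # squared correlation between loci
    chi2 = n_cows * r2            # assumed test statistic, 1 df
    # Upper-tail probability of a chi-square with 1 df: erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(chi2 / 2))
    return d, r2, chi2, p_value


# Hypothetical inputs: complete equilibrium, then complete disequilibrium.
equilibrium = ld_chi_square(100, 0.25, 0.5, 0.5)
complete_ld = ld_chi_square(100, 0.50, 0.5, 0.5)
```

Under equilibrium the statistic is zero and the p-value is 1, whereas complete disequilibrium gives r² = 1 and a statistic equal to the assumed sample-size multiplier.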

SNP associations were determined using a logistic regression approach (PROC LOGISTIC) in SAS (version 9.1, SAS Institute Inc., NC, USA) using the following model:

where: y_ij = MAP infection status (1 = infected, 0 = healthy) for the j-th cow; μ = overall mean; s = number of SNPs on the particular chromosome considered; a_i = additive effect for the i-th SNP; w_i = genotype of the i-th SNP recoded as number of alleles (0, 1 and 2); d_i = dominance effect for the i-th SNP; v_i = genotype of the i-th SNP recoded as homozygote or heterozygote (0 and 1); a_k = additive effect for the k-th null SNP; m_k = genotype of the k-th null SNP recoded as number of alleles (0, 1 and 2); and e_ij = random residual effect for the j-th cow. Four null SNPs were included as covariates in each logistic regression model as a means to correct for bias that may arise due to stratification between the negative and positive cohorts [22]. These SNPs are unlinked with respect to one another and to all candidate SNPs considered, are not expected to be associated with any traits related to MAP infection, and are polymorphic within the Canadian Holstein population (minor allele frequency (MAF) > 35%) (unpublished results).
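The recoding of genotypes into additive and dominance covariates can be illustrated with a short Python sketch. It assumes a model of the form logit P(y = 1) = μ + Σ (a_i w_i + d_i v_i) + Σ a_k m_k, which is the structure implied by the definitions above rather than a confirmed transcription of the authors' equation. Genotypes are assumed to be two-character strings such as "GA", and all function names and numeric values are hypothetical.

```python
from math import exp


def additive_code(genotype, counted_allele):
    """w_i: number of copies of the counted allele (0, 1 or 2)."""
    return sum(1 for allele in genotype if allele == counted_allele)


def dominance_code(genotype):
    """v_i: 1 for a heterozygote, 0 for a homozygote."""
    return 1 if genotype[0] != genotype[1] else 0


def infection_probability(mu, snp_terms, null_terms):
    """Predicted probability of infection under the assumed logistic model.

    snp_terms  : list of (a_i, w_i, d_i, v_i) for the candidate SNPs
    null_terms : list of (a_k, m_k) for the null SNP covariates
    """
    linear = mu
    linear += sum(a * w + d * v for a, w, d, v in snp_terms)
    linear += sum(a * m for a, m in null_terms)
    return 1.0 / (1.0 + exp(-linear))


# Hypothetical cow: heterozygous at one candidate SNP, one null SNP covariate.
w = additive_code("GA", counted_allele="A")   # one copy of A
v = dominance_code("GA")                      # heterozygote
p = infection_probability(mu=-0.2, snp_terms=[(-0.6, w, 0.8, v)],
                          null_terms=[(0.0, 2)])
```

Coding heterozygotes separately lets the model distinguish a purely additive allele effect from a departure of the heterozygote from the midpoint of the two homozygotes.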

Haplotype analysis was performed in SAS using a similar model, except that the SNP terms, Σ (a_i w_i + d_i v_i) summed over the s SNPs, are replaced by the haplotype terms, Σ β_i Hap_ij summed over i = 1, ..., h; where: β_i = linear regression coefficient (haplotype effect) for the i-th haplotype; Hap_ij = the probability of the i-th haplotype for the j-th cow; and h = the number of observed haplotypes.

Experimental-wise significance levels for all tests were determined by Bonferroni correction. Regression results are presented as odds ratios (OR) and their respective 95% confidence intervals (CI).
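The conversion from a logistic regression coefficient to an odds ratio with a 95% confidence interval, and the Bonferroni adjustment, can be sketched as follows. A Wald-type interval, exp(β ± 1.96·SE), is assumed here; the text does not state which interval type was requested in SAS. The function names, the coefficient, its standard error, and the number of tests are hypothetical.

```python
from math import exp


def odds_ratio_ci(beta, se, z=1.959964):
    """Odds ratio and Wald-type confidence limits from a log-odds estimate.

    beta : regression coefficient on the log-odds scale
    se   : standard error of beta
    z    : normal quantile (1.959964 gives a 95% interval)
    """
    return exp(beta), exp(beta - z * se), exp(beta + z * se)


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the experiment-wise level at alpha."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


# Hypothetical values, not estimates from this study.
or_est, lower, upper = odds_ratio_ci(beta=-0.67, se=0.21)
per_test_alpha = bonferroni_threshold(alpha=0.05, n_tests=5)
significant = 0.002 < per_test_alpha
```

An odds ratio below 1 with an interval that excludes 1 indicates a lower odds of infection per unit increase in the covariate; an interval that includes 1 is not significant at the corresponding level.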

To assess the level of multicollinearity among the SNPs included in the analysis, principal component analysis (PCA) was performed using PROC PRINCOMP in SAS, followed by calculation of the condition index [23]:

where: λmax, λmin = the largest and smallest eigenvalue for the variables considered, respectively. Akaike's information criterion (AIC) was used to select the final regression model when multicollinearity was shown to be unacceptable.
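A condition index built from the largest and smallest eigenvalues can be sketched as below. Two conventions are in common use: the square root of λmax/λmin, and the plain ratio λmax/λmin. Which one the authors used is not stated here, so the sketch exposes both through a hypothetical `take_sqrt` flag. For brevity, the example uses the closed-form eigenvalues of a two-variable correlation matrix, 1 + r and 1 − r, rather than a full eigendecomposition of the three SNPs considered in the study.

```python
from math import sqrt


def condition_index(eigenvalues, take_sqrt=True):
    """Condition index from the eigenvalues of a correlation matrix.

    take_sqrt=True  -> sqrt(lambda_max / lambda_min)
    take_sqrt=False -> lambda_max / lambda_min
    Which convention matches the source is an open assumption.
    """
    lam_max, lam_min = max(eigenvalues), min(eigenvalues)
    if lam_min <= 0:
        raise ValueError("smallest eigenvalue must be positive")
    ratio = lam_max / lam_min
    return sqrt(ratio) if take_sqrt else ratio


def two_variable_eigenvalues(r):
    """Eigenvalues of the 2x2 correlation matrix [[1, r], [r, 1]]."""
    return [1 + r, 1 - r]


# Hypothetical correlations: the index grows sharply as r approaches 1.
mild = condition_index(two_variable_eigenvalues(0.5))
severe = condition_index(two_variable_eigenvalues(0.999))
```

The sharp rise of the index as the smallest eigenvalue approaches zero is what makes it a convenient flag for near-redundant covariates.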

Results

In total, thirteen SNPs were identified: two in IL10 [-285T > C (dbSNP ssID# ss104807640) and -30A > C (ss104807641)]; six in IL10RA [633C > A (ss104807642), 984G > A (ss104807643), 1098C > T (ss104807644), 1185C > T (ss104807645), 1269T > C (ss104807646), and 1302A > G (ss104807647)]; two in IL10RB [503C > T (ss104807648) and 569A > G (ss104807649)]; one in TGFB1 [258C > T (ss104807650)]; and two in SLC11A1 [650C > T (ss104807654) and 1066C > G (ss104807655)]. All SNPs were submitted to NCBI dbSNP (Build 130).

Due to nearly matching genotype distributions (Pearson's r ≥ 0.99), it was assumed that four SNPs in IL10RA (984G > A, 1098C > T, 1269T > C and 1302A > G), and both SNPs in IL10RB (503C > T and 569A > G), were tightly linked. Hence, all but one SNP from each of these groups were dropped from the analysis in order to minimize redundancy. Pair-wise linkage disequilibrium (r²) estimates for the remaining SNPs in IL10RA were 0.07 and 0.41, between SNPs IL10RA 633C > A and 1185C > T, and 984G > A and 1185C > T, respectively, and were significant at p < 0.001; similar results were previously reported in Canadian Holstein bulls [24]. As such, it was a concern that there would be a high degree of correlation (multicollinearity) between them in the present dataset, thereby inflating the standard errors of parameter estimates and thus reducing the significance of resultant associations [25]. Principal component analysis (PCA), followed by calculation of the condition index, suggested that these three SNPs were in a state of strong multicollinearity (K > 140), whereas the removal of any one SNP returned the condition index to an acceptable range (7.7 < K < 10.5) [26]. Model selection based on AIC subsequently determined that IL10RA 1185C > T was the most appropriate SNP to remove from the final multiple regression model.

Logistic regression analysis revealed that only the SNPs in IL10RA were associated with MAP infection. The group of tightly linked SNPs, IL10RA 984G > A, 1098C > T, 1269T > C and 1302A > G, was found to have a strong additive and dominance relationship with MAP infection status (OR, 0.51 (0.34-0.78), p = 0.002, and 2.27 (1.40-3.67), p = 0.001, respectively), both of which were retained at an experimental-wise significance of 5% (Table 2). The results suggest that the allele GCTA is dominant over the ATCG allele, and is associated with a higher probability of being positive for MAP. The SNP IL10RA 633C > A also showed a modest additive relationship with MAP infection status (OR, 0.54 (0.29-1.00), p = 0.049), in which the C allele is associated with a higher probability of being positive for MAP.

Table 2 Genotypic frequencies and associations of SNPs in IL10, IL10RA/B, TGFB1, and SLC11A1 with MAP-infection status.

Haplotype reconstruction of the three informative SNPs in IL10RA (633C > A, 984G > A and 1185C > T) identified four combinations: AGC, AAT, CAC and AAC. Haplotype AAC was found in less than 1% of the sample population, whereas haplotypes AGC, AAT and CAC represented 56%, 24% and 19% of the entire sample population, respectively (Table 3). Individual tests for haplotype association with MAP infection revealed that haplotype AGC was associated with a higher probability of being positive for MAP (p = 0.018) and haplotype AAT with a higher probability of being negative for MAP (p = 0.030) (Table 3). Haplotype contrasts against the most frequent haplotype, AGC, identified a significant effect for haplotype AAT (p = 0.013) (Table 3), which was retained at an experimental-wise level of 5%.

Table 3 Haplotype frequencies in the 3' coding region of IL10RA and their association with MAP-infection status.

Discussion

In the present cohort study, two significant associations with MAP infection status were observed for the IL10RA gene. First, a strong association between the tightly linked bovine SNPs IL10RA 984G > A, 1098C > T, 1269T > C and 1302A > G and MAP infection status was detected. For these SNPs, the allele GCTA is dominant over the ATCG allele, and is associated with a higher probability of being positive for MAP. Second, when haplotype analysis was performed on SNPs IL10RA 633C > A, 984G > A and 1185C > T, equally strong, inverse associations for the haplotypes AGC and AAT with MAP infection status were observed. Considering the strong individual relationship of IL10RA 984G, 1098C, 1269T and 1302A with MAP infection status, it is reasonable to assume that these SNPs are the primary contributors to these associations. Furthermore, removing 633C > A from the haplotype reconstruction resulted in haplotype frequencies and associations that were negligibly different from those reported above (unpublished results). Contrasts indicated a strong, significant effect in reducing the proportion of infected animals when replacing the most frequent haplotype, AGC, with AAT. This suggests that it may be possible to lower the overall frequency of MAP infection as diagnosed by serum ELISA by increasing the frequency of the AAT haplotype through selective breeding. However, since the impact of such a breeding decision on the prevalence of other diseases is currently unknown, such a strategy should be approached with caution.

Interleukin-10 has emerged as an essential immunoregulatory cytokine during bacterial infections. In the context of Mycobacterium spp., for example, IL10 helps to control excessive T helper 1 and CD8+ T cell responses that contribute to the immunopathology associated with infection; it also prevents the overproduction of IL4, IL5, and IL13, which can lead to severe fibrosis during the T helper 2 response [27]. This may be particularly relevant at mucosal surfaces, since human studies have implicated functional SNPs in the IL10 gene as risk factors for IBD [10] and tuberculosis [28]. In cattle, IL10 is up-regulated during subclinical and clinical MAP infections [29, 30], and its neutralization has been shown to promote the activation of MAP-infected bovine macrophages and subsequent killing of the organism [31]. Similar findings have also been demonstrated with human infection studies performed in vitro using Mycobacterium tuberculosis [32, 33].

Although the present study found no association between variants in the bovine IL10 gene and MAP infection, it did provide evidence that variants in the IL10RA gene, which encodes the ligand-binding subunit of the IL10R and is a major determinant of IL10 responsiveness [34, 35], may contribute to susceptibility to MAP infection. We are unaware of previous studies indicating that variants in the IL10R gene influence the susceptibility to mycobacterial infection. However, previous work has shown that antibody-mediated neutralization of the IL10R complex reduces the susceptibility of mice to Mycobacterium avium and increases the efficacy of anti-mycobacterial therapy [36]. In support of this, associations have been reported between SNPs in the human IL10RA and IL10RB genes and the level of IL10 expression in mucosal tissues [37]. Our group has also recently identified associations in Holstein dairy bulls between the presently studied IL10RA haplotype and estimated breeding values for somatic cell score, a trait highly correlated with the incidence of the inflammatory disease mastitis [24]. Furthermore, based on alignment with the murine homologue, all of the SNPs identified within IL10RA, with the exception of 633C > A, appear to code for a region of the cytoplasmic domain that defines cellular responsiveness to IL10 and mediates cellular proliferation [38]. It is important to note, however, that before any definite conclusions can be made concerning the magnitude of IL10RA's contribution to susceptibility/resistance to MAP infection, it will be necessary to functionally characterize the IL10RA haplotype, since the SNPs it represents are all synonymous mutations that may be linked to a nearby gene on BTA15 that confers susceptibility/resistance to MAP infection.
Traditionally, synonymous SNPs are viewed as "silent" and thus may not warrant functional validation; however, several studies addressing the role of codon usage bias, as well as mRNA folding, have reported otherwise [39-41].

Two limitations related to the design of the present study warrant that the reported associations be approached with caution. First, the cows in the negative cohort were categorized as such using the serum ELISA, and in some cases, the milk ELISA as well. These tests, while known to be very specific, have relatively low sensitivities: 25-94% and 29-61% for the serum and milk ELISA, respectively [42]. Therefore, it is likely that a proportion of negative animals are in fact MAP infected, but have yet to produce an antibody response strong enough for a positive diagnosis. A major implication is that such bias could skew allelic frequencies for certain SNPs, leading to an under- or over-estimation of the true effect. In the present study, the control cohort consisted of cows older than 4.5 years of age (mean = 6.4 years) that had been diagnosed negative by serum ELISA, the majority of which had previous screening by milk ELISA. Given that it is generally accepted that cattle become MAP-infected as calves [43], we believe that a cohort consisting of older cows will likely contain a sufficiently low number of false negatives, thereby minimizing any related bias. This is supported by a number of studies reporting a positive correlation between seroprevalence and age [44-46].

The second limitation of the present study was the absence of information about the family structure of a substantial number of the genotyped cows; this prevented us from classifying population stratification and substructure, which can bias genetic association studies [47]. To address this issue, we included null markers as covariates in the logistic regression SNP association analysis. This approach has been shown to be very effective for controlling bias due to stratification in candidate gene association studies [22] and whole genome association approaches [48]. While these SNPs are suitable for inclusion as null markers since they are unlinked to one another and to the candidate SNPs considered, and are polymorphic within the Canadian Holstein population (MAF > 35%), they have not been validated for their ability to identify stratification within said population. Thus, it is essential that the results presented in this study be validated in a larger and more diverse cohort study in order to ensure that the associations drawn from the present study are not skewed by biasing effects such as family structure.

Conclusions

In summary, several SNPs were identified in the bovine genes encoding IL10, IL10RA, IL10RB, TGFB1, and SLC11A1. A strong association between a group of tightly linked synonymous SNPs in the 3' coding region of IL10RA, 984G > A, 1098C > T, 1269T > C and 1302A > G, and MAP infection status in Canadian dairy cattle was established. Haplotype reconstruction of the SNPs in IL10RA also revealed a strong association with MAP infection status. These results provide initial evidence that variants in IL10RA may contribute to susceptibility to MAP infection in dairy cattle.