Human Genetics, Volume 133, Issue 7, pp 883–893

CUBN and NEBL common variants in the chromosome 10p13 linkage region are associated with multibacillary leprosy in Vietnam

Authors

  • Audrey V. Grant
    • Laboratoire de Génétique Humaine des Maladies Infectieuses, Branche Necker, Institut National de la Santé et de la Recherche Médicale, U980
    • Université Paris Descartes, Sorbonne Paris Cité, Institut Imagine
  • Aurelie Cobat
    • McGill International TB Centre, The Research Institute of the McGill University Health Centre
    • Departments of Medicine and Human Genetics, McGill University
  • Nguyen Van Thuc
    • Hospital for Dermato-Venerology
  • Marianna Orlova
    • McGill International TB Centre, The Research Institute of the McGill University Health Centre
  • Nguyen Thu Huong
    • Hospital for Dermato-Venerology
  • Jean Gaschignard
    • Laboratoire de Génétique Humaine des Maladies Infectieuses, Branche Necker, Institut National de la Santé et de la Recherche Médicale, U980
    • Université Paris Descartes, Sorbonne Paris Cité, Institut Imagine
  • Andrea Alter
    • McGill International TB Centre, The Research Institute of the McGill University Health Centre
    • Departments of Medicine and Human Genetics, McGill University
  • Nguyen Ngoc Ba
    • Hospital for Dermato-Venerology
  • Vu Hong Thai
    • Hospital for Dermato-Venerology
  • Laurent Abel
    • Laboratoire de Génétique Humaine des Maladies Infectieuses, Branche Necker, Institut National de la Santé et de la Recherche Médicale, U980
    • Université Paris Descartes, Sorbonne Paris Cité, Institut Imagine
    • St Giles Laboratory of Human Genetics of Infectious Diseases, Rockefeller Branch, The Rockefeller University
  • Alexandre Alcaïs
    • Laboratoire de Génétique Humaine des Maladies Infectieuses, Branche Necker, Institut National de la Santé et de la Recherche Médicale, U980
    • Université Paris Descartes, Sorbonne Paris Cité, Institut Imagine
    • URC, CIC, Necker, and Cochin Hospitals
    • McGill International TB Centre, The Research Institute of the McGill University Health Centre
    • Departments of Medicine and Human Genetics, McGill University
    • Montreal General Hospital Research Institute
Original Investigation

DOI: 10.1007/s00439-014-1430-8

Cite this article as:
Grant, A.V., Cobat, A., Van Thuc, N. et al. Hum Genet (2014) 133: 883. doi:10.1007/s00439-014-1430-8

Abstract

Leprosy is caused by infection with Mycobacterium leprae and is classified clinically into paucibacillary (PB) or multibacillary (MB) subtypes based on the number of skin lesions and the bacillary index detected in skin smears. We previously identified a major PB susceptibility locus on chromosome region 10p13 in Vietnamese families by linkage analysis. In the current study, we conducted high-density association mapping of the 9.5 Mb linkage peak on chromosome region 10p13 covering 39 genes. Using leprosy per se and leprosy subtypes as phenotypes, we employed 294 nuclear families (303 leprosy cases, 63 % MB, 37 % PB) as a discovery sample and 192 nuclear families (192 cases, 55 % MB, 45 % PB) as a replication sample. Replicated significant association signals were revealed in the genes for cubilin (CUBN) and nebulette (NEBL). In the combined sample, the C allele (frequency 0.26) at CUBN SNP rs10904831 showed association [p = 1 × 10−5; OR 0.52 (0.38–0.7)] with MB leprosy only. Likewise, allele T (frequency 0.42) at NEBL SNP rs11012461 showed association [p = 4.2 × 10−5; OR 2.51 (1.6–4)] with MB leprosy only. These associations remained valid for the CUBN signal when taking into account the effective number of tests performed (type I error significance threshold = 2.4 × 10−5). We used the results of our analyses to propose a new model for the genetic control of polarization of clinical leprosy.

Introduction

Leprosy, caused by Mycobacterium leprae, is one of the world’s oldest documented diseases. In patients, the bacteria are preferentially found in Schwann cells in peripheral nerves and in mature tissue macrophages in the skin. Leprosy is classified into two subtypes, paucibacillary (PB) and multibacillary (MB), with MB patients manifesting a greater number of lesions, more extensive nerve involvement and a higher bacillary load. Since the introduction of multidrug therapy in 1981, the global burden of leprosy has steadily declined. Yet, there remain 16 endemic countries where the combined number of new cases reported has stabilized at approximately 220,000 per year, with a widely varying proportion of MB cases. The processes of transmission of and infection with M. leprae are not well understood, but are thought to occur via the respiratory tract. Household contact studies have shown that MB cases display a higher transmission potential than PB cases (Sales et al. 2011; Sarno et al. 2012). This suggests that early recognition and treatment of MB cases are critical for transmission-blocking interventions. Environmental factors hypothesized to influence leprosy clinical presentation (and therefore subtype classification) include those influencing adaptive immunity. The most common potential source is the BCG vaccine, but a meta-analysis has shown that the protective effect of BCG vaccine against the development of leprosy is independent of subtype (Merle et al. 2010).

Considerable evidence supports a genetic basis underlying human susceptibility to leprosy including familial aggregation studies and segregation analyses (Abel and Demenais 1988; Casanova and Abel 2002; Alter et al. 2011a). A two-stage model explaining genetic susceptibility to leprosy has been proposed where a first set of genetic factors control susceptibility to pre-clinical disease (leprosy per se), and a second set of genes control the polarization of clinical disease into PB or MB leprosy (Alter et al. 2011a). So far, the majority of susceptibility genes identified impact on leprosy per se. Fine mapping of leprosy per se linkage peaks (Mira et al. 2003) identified variants in the regulatory region for PARK2 (Mira et al. 2004; Alter et al. 2013) and a functional SNP in the LTA promoter as strong risk factors for leprosy per se (Alcaïs et al. 2007). Additionally, two independent sets of correlated SNPs were discovered as leprosy per se risk factors, with one set showing evidence for being an eQTL for the class I HLA-C gene and the other tagging the HLA-C*15:05 allele (Alter et al. 2011b). Associations in other genes using the candidate gene approach have been explored, although evidence for replication has not been consistent, including genes coding for innate immune receptors, cytokines or adaptors involved in host immunity (Misch et al. 2010). Nonetheless, a number of associations, particularly with HLA Class II alleles, have shown differential association by subtype: HLA-DR2 alleles were found to be associated with both PB and MB individuals in different ethnicities, whereas HLA-DR3 has been found to be associated with PB individuals [reviewed in (Gorodezky et al. 2004)]. More recently, a genome-wide association study in a Chinese population revealed a striking genetic overlap of leprosy susceptibility with risk of Crohn’s disease (Zhang et al. 2009, 2011; Schurr and Gros 2009). 
This finding was partly replicated in Vietnamese, Indian and African leprosy populations and suggested that knowledge of human genetic susceptibility factors underlying leprosy may contribute to a better understanding of granulomatous diseases in general (Grant et al. 2012; Wong et al. 2010).

A locus involved in clinical polarization of leprosy is situated at chromosome region 10p13. An initial genome-wide linkage scan conducted on 245 leprosy sib pairs in south India (including 232 PB-concordant sib pairs) identified a single linkage peak on chromosome region 10p13 (p < 2 × 10−5) (Siddiqui et al. 2001). A subsequent genome-wide linkage scan on 205 siblings affected with leprosy from southern Vietnam detected a weak signal on 10p13 (p < 0.08). Evidence for linkage became stronger when considering the 17 families consisting of PB cases only (p < 0.003) with significant evidence for heterogeneity by subtype (p = 0.028) (Mira et al. 2003). Based on these results, we conducted an association and functional study of the MRC1 innate immunity candidate gene located in the linkage interval (Alter et al. 2010). MRC1 encodes the human mannose receptor that recognizes pathogen-associated molecular patterns. The G396 allele in exon 7 conferred greater risk of MB disease (Alter et al. 2010). The association observed from the MRC1 targeted study clearly could not explain the PB-specific linkage results. However, the greater evidence found for MB leprosy demonstrated that the 10p13 chromosomal region harbors genetic variants that impact on leprosy polarization. Here, we report the results of a gene-centered high-resolution association scan of the 10p13 linkage interval, focusing on MB and PB leprosy subtypes, in two independent familial Vietnamese leprosy samples.

Methods

Study population

Patients with leprosy who had parents willing to provide blood for DNA extraction were recruited into the study based on records at the Dermato-Venerology Hospital in Ho Chi Minh City, Vietnam between 1998 and 2008. The diagnosis of leprosy patients and their classification according to subtype were based on clinical and histological criteria (Ridley and Jopling 1966). Lepromin was not used in the classification of leprosy lesions. The diagnosis was always established by two physicians independently. In cases of initial discordance the patient was re-evaluated and classification was resolved. Bacillary index was used when possible and when collected reliably. Information on the number of skin and nerve lesions was available for 425 out of the 495 affected offspring. Subjects were classified as multibacillary if at least two nerve or five skin lesions were recorded. According to the Ridley–Jopling scale, lepromatous lepromatous (LL), lepromatous borderline (BL) and borderline borderline (BB) patients were considered MB. Those classified as borderline tuberculoid (BT) or tuberculoid tuberculoid (TT) were considered PB. The discovery sample comprised 303 leprosy-affected offspring and their parents, including 10 multiplex nuclear families with 2 affected offspring and 284 simplex nuclear families. Of the 303 affected offspring, 190 were classified as MB and 113 as PB. The replication sample comprised 192 simplex nuclear families including 105 offspring classified as MB and 87 classified as PB (Table S1). Mean age and proportion of males were calculated for the discovery and replication samples and by subtype using SAS v. 9.2 (SAS Institute, Cary, NC, USA).

Informed consent was obtained from all study participants. The study was approved by institutional review boards and health authorities in Ho Chi Minh City, Vietnam, The Research Institute of the McGill University Health Centre, Montreal, Canada and The Rockefeller University, New York.

SNP selection

The borders for fine mapping were defined based on the results of two linkage studies in Vietnamese (Mira et al. 2003) and Indian PB leprosy patients (Siddiqui et al. 2001). The 95 % confidence interval (CI) for the linkage peak in the Vietnamese sample extended from microsatellite markers D10S1653 (15,677,861–15,678,063) to D10S1660 (23,394,094–23,394,266). We expanded these borders by 500 kb to include all genes within the boundaries in their entirety, defining a 9.7 Mb interval from 15,138,615 to 24,837,691 (Genome reference assembly GRCh37.p11) on chromosome 10 (RPP38 → KIAA1217). This region also contains the 95 % CI for the linkage peak for the Indian study. Genome Reference annotation of the region contains a correction (HG544 PATCH) indicating that a 246,930 bp sequence starting at position 17,881,746 on chromosome 10 was erroneously thought to be a duplicate of the sequence immediately following it. In the corrected sequence, MRC1 and TMEM236 were retained and MRC1L1 and FAM23B were removed. Considering the consequences of HG544 PATCH, the region includes 39 protein-coding genes (see gene locations on Fig. 1), defined as merged Ensembl/Havana protein-coding genes (accessed from http://www.ensembl.org on September 19, 2013). Coverage of variation in the region was estimated based on variation catalogued in the International HapMap Project for 45 unrelated individuals from the Han Chinese population from Beijing, China (CHB) (release 23, 90 individuals, 2.2 million filtered SNPs) including intragenic intervals ±3 kb from the start and stop codons. SNP selection was performed using the Tagger algorithm (de Bakker et al. 2005) considering SNPs with a minor allele frequency (MAF) >10 % in the CHB sample. An R2 cutoff of 80 % was set to define bins of correlated SNPs. Assay availability and quality for the Illumina and Sequenom panels were also taken into consideration in the selection of tag SNPs to represent the region. 
A total of 1,522 intragenic SNPs were selected for genotyping in the discovery sample in the chromosome 10p13 linkage region.
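The idea of binning correlated SNPs at an R2 cutoff and keeping one representative per bin can be illustrated with a small greedy sketch. This is not the Tagger implementation used in the study; the function name, the SNP labels and the pairwise R2 values below are hypothetical, and assay availability is ignored.

```python
def greedy_tag_snps(snps, r2, threshold=0.8):
    """Greedy pairwise tagging sketch (hypothetical helper, not Tagger itself).

    snps: list of SNP identifiers.
    r2:   dict mapping (snp_a, snp_b) pairs to their pairwise R2.
    Returns tag SNPs such that every SNP is either a tag or has
    R2 >= threshold with at least one tag.
    """
    # Each SNP captures itself plus every SNP it is strongly correlated with.
    captures = {s: {s} for s in snps}
    for (a, b), value in r2.items():
        if value >= threshold:
            captures[a].add(b)
            captures[b].add(a)

    untagged = set(snps)
    tags = []
    while untagged:
        # Pick the SNP that captures the most still-untagged SNPs
        # (ties are resolved in favour of the earliest SNP in the input list).
        best = max(snps, key=lambda s: len(captures[s] & untagged))
        tags.append(best)
        untagged -= captures[best]
    return tags


# Hypothetical pairwise R2 values for four SNPs.
example_r2 = {
    ("snpA", "snpB"): 0.95,
    ("snpB", "snpC"): 0.85,
    ("snpA", "snpC"): 0.60,
    ("snpC", "snpD"): 0.30,
}
print(greedy_tag_snps(["snpA", "snpB", "snpC", "snpD"], example_r2))
```

In this toy input, snpB captures both snpA and snpC at the 0.8 cutoff, while snpD is correlated with nothing and has to be genotyped on its own.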
Fig. 1

Fine-mapping association results in the discovery sample of 294 nuclear families with at least one leprosy offspring according to leprosy phenotype: a paucibacillary leprosy, b multibacillary leprosy and c leprosy per se. The minimum p value obtained from family-based tests of association under the additive, recessive and dominant models against chromosome 10 position is shown for each of 1,335 SNPs. The p = 0.01 significance threshold is shown as a horizontal line. The positions of protein-coding genes and gene symbols are shown below according to build GRCh37.p11 and corrected for HG544 PATCH

Genotyping methods

SNPs in the discovery sample were genotyped using either the ultrahigh-throughput Illumina platform, which uses the GoldenGate assay to resolve individual SNP genotypes, or the high-throughput SEQUENOM MassARRAY platform, which uses the iPLEX assay to incorporate mass-modified terminal nucleotides in the single base pair extension step which are then detected by MALDI-TOF MS. SNPs in the replication sample were genotyped using SEQUENOM custom panels.

Quality control of SNPs

Following genotyping, we calculated the genotyping success rate for each SNP and sample, and removed SNPs with a success rate under 90 %. All genotyped individuals achieved a success rate >90 %. Among the founders, we tested each marker for departure from Hardy–Weinberg equilibrium (HWE) proportions by comparing observed with expected genotype counts, and calculated its MAF. SNPs displaying an HWE p < 0.0001 and/or an MAF <0.02 were removed from further analysis. SNPs resulting in ≥4 Mendelian errors were also removed. These quality-control filters were implemented using PLINK software (http://pngu.mgh.harvard.edu/~purcell/plink/). Out of the 1,522 genotyped SNPs in the discovery sample, 1,335 passed these quality-control filters. All 70 SNPs genotyped in the replication sample passed these quality-control filters.
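The per-SNP filters described above can be written out as a short sketch. The thresholds are those stated in the text; everything else is an assumption for illustration: the function names are hypothetical, and the HWE test is a simple 1-degree-of-freedom chi-square on founder genotype counts, which may differ from the test PLINK applied in the study.

```python
import math


def hwe_chi2_p(n_aa, n_ab, n_bb):
    """Chi-square (1 df) p value for departure from Hardy-Weinberg proportions.

    Assumed stand-in for the HWE test run in PLINK; needs at least one genotype.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))


def passes_qc(call_rate, founder_counts, mendel_errors):
    """Apply the filters stated in the text to one SNP.

    call_rate:      genotyping success rate of the SNP (0..1)
    founder_counts: (n_AA, n_AB, n_BB) genotype counts among founders
    mendel_errors:  number of Mendelian errors observed for the SNP
    """
    n_aa, n_ab, n_bb = founder_counts
    n = n_aa + n_ab + n_bb
    freq_a = (2 * n_aa + n_ab) / (2 * n)
    maf = min(freq_a, 1 - freq_a)
    return (
        call_rate >= 0.90                      # success rate under 90 % removed
        and maf >= 0.02                        # MAF < 0.02 removed
        and hwe_chi2_p(n_aa, n_ab, n_bb) >= 1e-4  # HWE p < 0.0001 removed
        and mendel_errors < 4                  # >= 4 Mendelian errors removed
    )
```

A SNP is retained only if it clears all four checks, mirroring the "and/or" removal rule in the text.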

Statistical methods

Allele frequencies of all SNPs were calculated among parents using PLINK. Pairwise linkage disequilibrium (D′ and R2) was calculated across the region using Haploview software (Barrett et al. 2005) (http://www.broadinstitute.org/scientific-community/science/programs/medical-and-population-genetics/haploview/). Tagging bins were defined within Haploview using an R2 cutoff of 0.8 among associated SNPs to identify independent signals. We performed classical tests of transmission disequilibrium included in the family-based association tests (FBATs) implemented in the FBAT v2.0.3 software (Horvath et al. 2001). For each marker, additive, recessive and dominant tests were conducted and the test under the model giving the most significant p value was retained in each case. This approach of selecting the best model based on p values should allow for retention of the most likely correct model in the hypothesis-testing framework. A two-sided type I error rate of 0.01 was set in the discovery sample. Risk estimates for each test were obtained by conditional logistic regression after recoding genotype data for each affected child and up to three unaffected pseudosiblings who received possible untransmitted genotypes (Schaid and Rowland 1998), using the PHREG procedure as implemented in SAS v. 9.2. The conditional logistic regression framework made it possible to perform multivariate stepwise regression tests. These were conducted using both forward and backward selection, and a p value of 0.05 was used as the threshold for inclusion or retention, respectively, of SNPs in the model. In the replication population, FBATs were similarly performed, with a one-sided type I error rate set at 0.01 assuming the same risk allele as for the discovery population. Stratified analyses were conducted according to PB/MB leprosy subtype in both the discovery and replication samples. 
We tested for heterogeneity between the strata using the chi-square test for heterogeneity (Cochran’s Q test) (Cochran 1954), which has been used in meta-analyses of GWAS, as implemented in GWAMA (Higgins et al. 2003; Mägi and Morris 2010). Cochran’s heterogeneity statistic (Q) and associated p value (Qp value) were computed. Cochran’s Q is calculated as the weighted sum of squared differences between individual study effects and the pooled effect across studies. The Q statistic follows a χ2 distribution with 1 degree of freedom for the comparison of the MB and PB groups.
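The calculation of Q for two strata can be sketched as follows. This is an illustration under stated assumptions rather than the GWAMA implementation: inverse-variance weights on the log odds ratio scale are assumed, the function names are hypothetical, and standard errors are recovered approximately from the rounded two-sided 95 % CIs reported for rs1801241 in the combined sample (Table 1).

```python
import math


def log_or_and_se(odds_ratio, ci_low, ci_high, z=1.959964):
    """Log OR and its standard error recovered from a two-sided 95 % CI."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
    return math.log(odds_ratio), se


def cochran_q(effects, standard_errors):
    """Weighted sum of squared deviations from the pooled effect.

    Assumes inverse-variance weights, w_i = 1 / se_i**2.
    """
    weights = [1 / se ** 2 for se in standard_errors]
    pooled = sum(w * b for w, b in zip(weights, effects)) / sum(weights)
    return sum(w * (b - pooled) ** 2 for w, b in zip(weights, effects))


def chi2_sf_1df(q):
    """Upper-tail p value of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(q / 2))


# rs1801241, combined sample, values as printed in Table 1.
mb_effect, mb_se = log_or_and_se(0.52, 0.39, 0.70)  # MB stratum
pb_effect, pb_se = log_or_and_se(0.96, 0.66, 1.40)  # PB stratum

q = cochran_q([mb_effect, pb_effect], [mb_se, pb_se])
print(q, chi2_sf_1df(q))
```

Because the inputs are rounded published values, the result should be read as an approximation to, not a reproduction of, the Q and Qp values reported for the combined sample in Table 3.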

To correct for multiple testing, we estimated the significance threshold that would provide a 5 % overall false-positive rate, i.e., family-wise error rate (FWER) for tests performed in the combined discovery and replication sample, given that 1,335 SNPs were tested for association. The model displaying the minimum p value under the additive, recessive or dominant models was retained for the PB and MB subtypes as well as for leprosy per se (3 phenotype definitions). Given some degree of correlation among the SNPs, and among the test statistics for the models and phenotypes tested, we used spectral decomposition techniques based on the method of Nyholt (2004) for the three phenotype definitions, and the modified method of Li and Ji that is more accurate and less conservative for large correlation matrices for the SNP set (Li and Ji 2005), as previously described (Patin et al. 2012) to estimate the total effective number of tests performed. The total effective number of tests performed was calculated in two steps: (1) obtaining an effective number of tests for each phenotype to account for correlation from the LD among SNPs and (2) obtaining an effective number of tests to account for correlation among phenotypes across all SNPs. Considering LD among SNPs, testing 1,335 SNPs in each of the three phenotypes was equivalent to 819.3 independent tests (the average number obtained across the three phenotypes). The test statistic corresponding to the minimum p value obtained from the additive, recessive and dominant models across the three phenotype definitions, across all SNPs, was used as the basis for a 3 × 3 matrix of Spearman rank correlation coefficients. Testing for the three phenotype definitions in this way was equivalent to 2.5 tests. Thus in total, 819.3 × 2.5 = 2,048 effective tests were performed yielding a significance threshold of 2.44 × 10−5.
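The arithmetic behind the threshold can be checked with a short sketch. The Nyholt (2004) estimate is Meff = 1 + (M − 1)(1 − Var(λ)/M), where λ are the eigenvalues of the M × M correlation matrix. The function names and the 3 × 3 matrix below are hypothetical (the authors' Spearman matrix is not given in the text), and the Li and Ji (2005) step applied to the SNP set is not reproduced here; the values 819.3 and 2.5 are taken directly from the text.

```python
def nyholt_meff(corr):
    """Effective number of tests after Nyholt (2004) for a correlation matrix.

    For a correlation matrix the eigenvalues sum to M and their squares sum
    to the sum of all squared entries, so the sample variance of the
    eigenvalues can be obtained without an eigen-solver.
    """
    m = len(corr)
    sum_sq = sum(r * r for row in corr for r in row)
    var_lambda = (sum_sq - m) / (m - 1)
    return 1 + (m - 1) * (1 - var_lambda / m)


def family_wise_threshold(alpha, *effective_tests):
    """Bonferroni-type threshold: alpha divided by the product of Meff values."""
    total = 1.0
    for meff in effective_tests:
        total *= meff
    return alpha / total


# Hypothetical correlation matrix among three phenotype definitions.
hypothetical_corr = [
    [1.0, 0.4, 0.3],
    [0.4, 1.0, 0.2],
    [0.3, 0.2, 1.0],
]
print(nyholt_meff(hypothetical_corr))

# Values reported in the text: 819.3 effective SNP tests, 2.5 effective phenotype tests.
threshold = family_wise_threshold(0.05, 819.3, 2.5)
print(threshold)
```

With the reported inputs the product is about 2,048 effective tests, and dividing 0.05 by it gives a threshold that rounds to the 2.44 × 10−5 stated above.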

We determined whether we had sufficient power based on sample size for the PB subtype in the discovery sample to detect associations of the same magnitude (in terms of OR and the minor allele frequency) as obtained for the MB subtype in the discovery sample using QUANTO (Gauderman 2002). We assumed a two-sided type I error rate of 0.01 as used for screening the discovery sample results.

Results

The mean age and proportion of males for the discovery and replication samples were very similar comparing values across leprosy subtypes (see Table S1). A 9.5 Mb interval on chromosome region 10p13 (15.14–24.59 Mb correcting for HG544 PATCH on GRCh37.p11; RPP38 → KIAA1217) containing 39 genes and corresponding to the linkage region of a PB susceptibility locus was targeted for fine mapping in 294 Vietnamese nuclear families including 303 offspring affected with leprosy and diagnosed with a specific subtype. The 1,522 SNPs selected for genotyping tagged 86 % of SNPs comprising intragenic variation catalogued in the CHB International HapMap population with an R2 ≥ 0.8. Tests of association were performed and p values for the most significant model from the additive, recessive or dominant models were plotted against SNP position for the 1,335 SNPs that passed quality-control filters. Results stratified by leprosy subtype are illustrated for 113 PB patients in Fig. 1a and 190 MB patients in Fig. 1b. Results for leprosy per se are shown in Fig. 1c. These analyses identified 70 SNPs with a p ≤ 0.01 for genotyping in the replication sample (Table S2). In the replication sample of 192 Vietnamese nuclear families, tests of association were performed on each SNP for the same clinical phenotype, the same genetic model and the same risk allele that reached the significance threshold in the discovery sample. Three SNPs displaying a one-sided p value ≤0.01 were considered to be replicated and estimates for odds ratios (OR) and their 95 % confidence intervals (CI) were obtained by conditional logistic regression for each of the three SNPs (Table 1). The SNPs identified were rs1801241 and rs10904831 in cubilin (CUBN) and rs11012461 in nebulette (NEBL).
Table 1

Family-based association test results for the three replicated SNPs in the discovery, replication and combined Vietnamese study populations according to the clinical subtype (MB, PB and leprosy per se)

| SNP | Gene | mᵃ | Mᵇ | MAFᶜ | Modelᵈ | Discovery OR (95 % CI)ᵉ | Discovery pᶠ | Replication OR (95 % CI)ᵍ | Replication pʰ | Combined OR (95 % CI)ᵉ | Combined pᶠ |
|---|---|---|---|---|---|---|---|---|---|---|---|
| MB | | | | | | | | | | | |
| rs1801241 | CUBN | G | A | 0.26 | ADD | 0.49 (0.34–0.73) | 2 × 10−4 | 0.57 (0.57–0.85) | 9.2 × 10−3 | 0.52 (0.39–0.7) | 1.3 × 10−5 ⁱ |
| rs10904831 | CUBN | T | C | 0.26 | ADD | 0.49 (0.33–0.72) | 2 × 10−4 | 0.57 (0.57–0.85) | 9.2 × 10−3 | 0.52 (0.38–0.7) | 1 × 10−5 ⁱ |
| rs11012461 | NEBL | T | C | 0.42 | REC | 2.36 (1.42–3.92) | 7 × 10−4 | 3.2 (3.2–7.50) | 9.2 × 10−3 | 2.51 (1.6–4) | 4.2 × 10−5 |
| PB | | | | | | | | | | | |
| rs1801241 | CUBN | G | A | 0.26 | ADD | 0.88 (0.54–1.44) | 0.617 | 1.09 (1.09–1.78) | 0.384 | 0.96 (0.66–1.4) | 0.849 |
| rs10904831 | CUBN | T | C | 0.26 | ADD | 0.91 (0.56–1.48) | 0.71 | 1.09 (1.09–1.80) | 0.382 | 0.98 (0.67–1.43) | 0.924 |
| rs11012461 | NEBL | T | C | 0.42 | REC | 1.37 (0.75–2.51) | 0.302 | 1.00 (1.00–2.59) | 0.369 | 1.28 (0.75–2.18) | 0.366 |
| Per se | | | | | | | | | | | |
| rs1801241 | CUBN | G | A | 0.26 | ADD | 0.61 (0.45–0.82) | 1.1 × 10−3 | 0.74 (0.74–1.00) | 0.049 | 0.66 (0.52–0.83) | 3 × 10−4 |
| rs10904831 | CUBN | T | C | 0.26 | ADD | 0.61 (0.46–0.83) | 1.2 × 10−3 | 0.73 (0.73–1.00) | 0.047 | 0.66 (0.52–0.83) | 3 × 10−4 |
| rs11012461 | NEBL | T | C | 0.42 | REC | 1.9 (1.27–2.73) | 1.4 × 10−3 | 1.91 (1.91–3.50) | 0.037 | 1.87 (1.33–2.62) | 3 × 10−4 |

ᵃ Minor allele

ᵇ Major allele

ᶜ Minor allele frequency among all founders in the combined Vietnamese study population

ᵈ Genetic model (ADD additive, DOM dominant, REC recessive)

ᵉ Odds ratio (95 % confidence interval) for the two-sided family-based association test in reference to the minor allele

ᶠ p value for the two-sided family-based association test

ᵍ OR and one-sided 95 % confidence interval derived from the one-sided family-based association test in reference to the minor allele

ʰ p value for the one-sided family-based association test

ⁱ SNP meets multiple-testing criterion for significance of 2.44 × 10−5

Minor alleles G of rs1801241 and T of rs10904831 in CUBN with an allelic frequency of 0.26 displayed a protective effect for MB leprosy under the additive model in the discovery (p = 2 × 10−4; OR 0.49; 95 % CI 0.34–0.73) and replication samples (p = 9.2 × 10−3; OR 0.57; one-sided 95 % CI 0.57–0.85). The two SNPs are highly correlated (R2 = 0.98) and therefore capture the same association signal (see Figure S1). In addition, the minor T allele (allelic frequency 0.42) of rs11012461 in NEBL was associated with MB leprosy under a recessive model in the discovery (p = 7 × 10−4; OR 2.36; 95 % CI 1.42–3.92) and replication samples (p = 9.2 × 10−3; OR 3.2; 95 % CI 3.2–7.5). The physical distance between the CUBN (represented by rs10904831) and NEBL signals is 4.3 Mb, indicating their mutual independence (see Figure S1). All three SNPs displayed stronger ORs when analyses were restricted to the MB subtype compared to the full leprosy per se study population (Table 1), suggesting the involvement of these variants in the polarization of clinical leprosy disease.

When we combined the results from the discovery and replication samples, no signals were replicated for the PB subtype (Fig. 2a). In contrast, evidence for association with MB leprosy of the respective alleles at SNPs rs11012461 (NEBL), rs1801241 (CUBN) and rs10904831 (CUBN) was substantially strengthened compared to the discovery sample (Fig. 2b; Table 1). Increased risk of MB leprosy was found under an additive model at rs10904831 in CUBN with an OR of 1.92 (95 % CI 1.43–2.56) for each copy of the C allele (p = 1.0 × 10−5). Increased risk of MB leprosy was found under a recessive model for the T allele at rs11012461 in NEBL with an OR of 2.51 for individuals carrying genotype TT vs. CT/CC (95 % CI 1.6–4; p = 4.2 × 10−5). It is only for the MB subtype that the formal significance threshold adjusted for the effective number of independent multiple tests performed of 2.44 × 10−5 was reached, and only the two SNPs comprising the CUBN signal reached the threshold (Table 1).
Fig. 2

Association results for 70 SNPs selected for replication in the combined discovery and replication sample. a Paucibacillary leprosy subtype comprising 200 PB-affected offspring. b Multibacillary leprosy subtype including 295 MB-affected offspring. The minimum p value obtained from family-based tests of association under the additive, recessive and dominant models against chromosome 10 position is shown for each SNP. The three replicated SNPs are indicated by arrows. The p = 0.01 significance level is shown as a horizontal line. The positions of 39 protein-coding genes and gene symbols are shown below according to Build GRCh37.p11 and corrected for HG544 PATCH

Table 2

Multivariate analyses of replicated SNPs in the combined population of 495 nuclear families with offspring affected with leprosy including 295 with multibacillary leprosy

| | rs1801241 (CUBN) | rs10904831 (CUBN) | rs11012461 (NEBL) |
|---|---|---|---|
| MB | | | |
| Univariate pᵃ | 1.8 × 10−5 | 1.4 × 10−5 | 1.8 × 10−4 |
| Multivariate pᵇ | ᶜ | 2.9 × 10−5 | 3.4 × 10−4 |
| OR (95 % CI)ᵈ | ᶜ | 0.52 (0.38–0.71) | 2.51 (1.52–4.14) |
| Per se | | | |
| Univariate pᵃ | 4 × 10−4 | 4 × 10−4 | 1.3 × 10−3 |
| Multivariate pᵇ | ᶜ | 4 × 10−4 | 1.2 × 10−3 |
| OR (95 % CI)ᵈ | ᶜ | 0.66 (0.52–0.83) | 1.82 (1.27–2.61) |

ᵃ p value obtained from univariate analyses using the conditional logistic regression framework

ᵇ p value obtained from multivariate analyses using the conditional logistic regression framework and forward and backward selection (PHREG procedure of SAS software (SAS, Cary, NC, USA))

ᶜ SNP not included in the final forward or backward multivariate regression procedure

ᵈ Odds ratio (95 % confidence interval) for the two-sided family-based association test in reference to the minor allele

Multivariate analyses were conducted on the three replicated SNPs in the combined population for leprosy per se and MB leprosy to further determine the independence of these signals. For both leprosy per se and the MB subtype, one of the two correlated CUBN SNPs, rs10904831, was included in the final model. The NEBL SNP was also included in the final model for both phenotype definitions, indicating that the two signals contributed independently (Table 2). Finally, we also tested the three replicated SNPs for evidence of MB vs. PB subtype heterogeneity. We detected statistically significant heterogeneity by PB/MB subtype in the combined sample for the two CUBN SNPs (Qp = 0.011 for rs1801241) and borderline significance for the NEBL SNP rs11012461 (Qp = 0.059) (Table 3).
Table 3

Test of homogeneity by PB/MB subtype across three associated SNPs in the discovery, replication and combined Vietnamese study populations

| SNP | Discovery Qᵃ | Discovery Qp valueᵇ | Replication Qᵃ | Replication Qp valueᵇ | Combined Qᵃ | Combined Qp valueᵇ |
|---|---|---|---|---|---|---|
| rs1801241 | 3.5 | 0.06 | 2.8 | 0.094 | 6.5 | 0.011 |
| rs10904831 | 3.8 | 0.053 | 2.8 | 0.094 | 6.3 | 0.012 |
| rs11012461 | 1.8 | 0.176 | 2.2 | 0.135 | 3.6 | 0.059 |

ᵃ Cochran’s Q test statistic of heterogeneity

ᵇ p value obtained for Cochran’s Q from the χ2 distribution with 1 degree of freedom

Given that the PB subset was smaller (corresponding to N = 113 trios) compared to the MB subset (corresponding to N = 190 trios) in the discovery sample, we performed power calculations on the PB subset using ORs from the MB subset under a nominal α = 0.01. Statistical power between 0.66 and 0.69 was reached for the three replicated SNPs for α = 0.01. Thus, lower power for the PB subtype compared to MB subtype may have prevented the detection of PB-specific associations (Table 4).
Table 4

Power to detect associations in the PB stratum (equivalent to 113 parent–offspring trios) as identified in the MB stratum (equivalent to 190 parent–offspring trios) in the discovery Vietnamese study population

| SNP | MAFᵃ | Modelᵇ | ORᶜ | Power (α = 0.01) |
|---|---|---|---|---|
| rs1801241 | 0.26 | ADD | 0.49 | 0.66 |
| rs10904831 | 0.26 | ADD | 0.49 | 0.66 |
| rs11012461 | 0.42 | REC | 2.36 | 0.69 |

ᵃ Minor allele frequency among all founders in the combined Vietnamese study population

ᵇ Genetic model (ADD additive, DOM dominant, REC recessive)

ᶜ Odds ratio in reference to the minor allele

Next, we identified SNPs from the initial panel of 1,335 SNPs that were in LD (R2 > 0.4) with the three replicated SNPs to verify whether the heterogeneous effect observed between PB and MB subtypes was also observed. For all three replicated SNPs, five to seven SNPs in high LD with these were present in the initial panel (Figure S1 and Table S3). Tests of associations for leprosy per se and for the PB and MB subtypes using the same genetic model and corresponding minor allele (Table S3) showed that most SNPs in LD with CUBN and NEBL signals were significant at p ≤ 0.05 for leprosy per se and MB leprosy. Estimates of ORs were very similar to those of the corresponding replicated SNP, and for all significant results the OR in the MB stratum was higher than that for leprosy per se. These associations provided additional support for the MB-specific associations of common variants in the linkage interval.

To investigate the possible existence of causal variants that were not genotyped in this study and to facilitate future study in other populations, we searched for proxy SNPs based on LD patterns for the three associated SNPs in public databases. We scanned the Chinese and Japanese populations (CHB/JPT/CHD) of the HapMap Project and the 1000 Genomes Project databases for SNPs with R2 > 0.4 with the three associated SNPs using the web-based application SNAP (http://www.broadinstitute.org/mpg/snap/) (see Table S4). For CUBN, 65 common proxy SNPs were identified (MAF 0.19–0.5) and for NEBL, 86 common proxy SNPs were identified (MAF 0.33–0.47) (Table S4); almost all of these were located in introns. To investigate the possible involvement of rare variants in the associations reported herein, we counted the SNPs in the 1000 Genomes Project with a MAF <0.01 in high LD (D′ = 1) with the replicated SNPs (see Table S5). We identified 113 such rare variants for CUBN and 101 for NEBL (Table S5).

Discussion

We conducted a high-density intragenic fine-mapping study of chromosome region 10p13 among 294 Vietnamese nuclear families with leprosy in an initial discovery phase followed by a replication sample of 192 Vietnamese nuclear families with leprosy. The leprosy subtype distribution was 37 % PB and 63 % MB in the discovery sample, and 45 % PB and 55 % MB in the replication sample, reflecting the known proportion of the leprosy subtypes in Vietnam (World Health Organization, Western Pacific Region 2011). We identified two signals in CUBN and NEBL, and multivariate analyses demonstrated their independence. Odds ratio (OR) estimates were greater for MB leprosy compared to leprosy per se for each of these signals in the combined study population (Table 2). Signals in CUBN and NEBL displayed p values <10−4, although formally only the signal in CUBN reached the significance threshold of 2.44 × 10−5 that took the number of effective tests performed into consideration. The test of heterogeneity by subtype reached statistical significance for the CUBN signal and borderline significance for the NEBL signal. In both CUBN and NEBL, SNPs in high LD with the associated SNPs in the discovery sample showed associations supporting the original finding of a stronger association for the MB subtype. Thus, while the findings in both CUBN and NEBL require validation in other ethnicities, the MB-specific associations in CUBN are backed by stronger evidence than that for NEBL.

Given the PB-specific linkage peak at chromosome region 10p13, the MB-specific associations were unexpected. We failed to detect the reported association of rs692527 in MRC1 with PB leprosy in our sample (Wang et al. 2012). However, overall a higher proportion of MB-specific vs. PB-specific associations has been supported in the literature, and thus a greater proportion of common variation appears to underlie MB leprosy compared to PB leprosy. Notably, in the only published leprosy GWAS, the proportion of PB vs. MB subjects was balanced and an equal distribution of MB- and PB-specific variants would be expected under the two-stage model of leprosy pathogenesis (in which first-stage genes underlie infection and second-stage genes are subtype specific and underlie the clinical manifestation of disease). However, out of six independent loci identified across the genome, variants at three loci demonstrated a stronger association among MB subjects, while an additional variant at a fourth locus reached significance only among MB subjects. The mean difference in orders of magnitude of significance (−log p values) between the MB-specific and PB-specific results was 19. Thus, the Chinese GWAS provided strong evidence in favor of an asymmetry in the distribution of common variants in the MB- and PB-specific groups (Zhang et al. 2009).

Although it appears clear that MB leprosy is preferentially impacted by common genetic variants (compared to PB leprosy), the genetic architecture underlying this observation remains incompletely understood. In a speculative scenario, we suggest that PB-specific variants may have been under-detected to date because they are rare and do not feature in studies focusing on common variants like the present fine-mapping study. Given that the MB-associated SNPs each tag a number of rare variants based on the 1000 Genomes Project, it is possible that the number of rare variants is correlated with severity of leprosy going from a lower (PB) to a higher (MB) bacterial load. Hence, while a greater number of rare variants in a leprosy patient would lead preferentially to MB rather than PB leprosy, rare variants would individually have a stronger penetrance for PB leprosy than for MB leprosy. The two-stage model (Alter et al. 2011a) may thus merit replacement by a model reflecting less successful immunological control (with MB leprosy as less successful compared to PB leprosy). The model pertains to the part that host genetics plays in leprosy polarization as a complex disease phenotype that is also affected by non-genetic factors, including socioeconomic status and history of exposure to immunologically active substances. Our proposed genetic model precludes the progression from one leprosy subtype to the other. In the present scenario, the severity would be correlated with the number of rare susceptibility variants in a given individual affected with leprosy. This hypothesis can be tested by sequencing the genomes of leprosy patients.

CUBN and NEBL are not straightforward candidates for leprosy pathogenesis, but close consideration of the literature on these genes suggests possible mechanisms for their involvement. Indeed, finding the unexpected is one of the strengths of unbiased genome-wide approaches, as recently demonstrated for the leprosy susceptibility gene PARK2 (Manzanillo et al. 2013; Behr and Schurr 2013). CUBN encodes cubilin, an extracellular receptor originally identified as a vitamin B-12 receptor in ileal mucosa (Seetharam et al. 1988, 1997) but also known to bind the following ligands: vitamin D binding protein, albumin, transferrin, apolipoprotein A1 and Ig light chains (Christensen et al. 2012). Cubilin functions in the absorption of proteins and is interdependent with megalin (LRP2) in the kidney proximal tubule and with amnionless (AMN) in the intestine. A recent study in Mycobacterium tuberculosis has shown that the ABC protein Rv1819c mediates uptake of vitamin B-12 during infection (Gopinath et al. 2013). The M. leprae homolog of Rv1819c, ML2084, is functional and possibly has a similar role. It is tempting to speculate that host and bacterial transporters might compete for the same substrate and that such competition is more stringent in the presence of increasing numbers of M. leprae bacilli, i.e., in MB leprosy. In addition, a deletion mutation in CUBN has been associated with proteinuria (Ovunc et al. 2011), and rs10904850, located 787 kb from the CUBN SNPs in our study, has been associated with serum iron levels in African Americans (McLaren et al. 2012). Clearly, changes in iron stores could impact the growth of M. leprae (Scollard et al. 2006), and increased expression of the iron-regulatory hormone hepcidin has been reported in MB lesions (de Souza et al. 2012).

NEBL gives rise to 14 splice variants with an array of functions; at least several of the products, including the largest (nebulette, 1,014 amino acids), are members of the Nebulin family of actin-binding proteins (http://www.ensembl.org). For example, one splice variant results in LIM-nebulette, also called Lasp-2 (Katoh and Katoh 2003). The MB-associated NEBL SNP rs11012461 is located within intron 3 of the Lasp-2 portion of NEBL. Lasp-2 is most strongly expressed in the brain and crosslinks to actin filaments (Deng et al. 2008). Since Lasp-2 has multiple assembly patterns, it probably has diverse cellular roles, possibly acting as a molecular scaffold important for the stabilization and organization of actin filaments and focal adhesions (Pappas et al. 2011; Midroni and Bilbao 1995). In its actin filament stabilization role, Lasp-2 may be important in the recruitment of fibroblasts or innate immune cells to leprosy infection sites. The connection of actin-based motility with the ubiquitination pathway in both leprosy and Crohn’s disease pathogenesis (Schurr and Gros 2009) is an active area of investigation (Mostowy and Cossart 2011).

Our findings in the Vietnamese population require validation in other ethnicities. GWA studies on the leprosy subtypes should also provide an overview of the contribution of common variation to these subtypes and results are expected to differ from the pattern of subtype heterogeneity observed in leprosy per se studies. Future studies of chromosome 10p13 capturing rare variants in leprosy using deep sequencing should help clarify the genetic architecture underlying PB and MB leprosy in the region.

Acknowledgments

We thank all the family members who participated in this study. This study was supported by grants from the Canadian Institutes of Health Research (CIHR) to ES, MALTALEP from l’Ordre de Malte to AA and ES, the Agence Nationale de la Recherche (ANR) to AA and the Heiser Fondation to LA. AVG was supported by the Fondation Pour la Recherche Médicale (FRM). AC was the recipient of a Banting postdoctoral fellowship from the Government of Canada.

Conflict of interest

The authors declare that they have no conflict of interest.

Supplementary material

Supplementary material 1 (DOCX 437 kb)

Copyright information

© Springer-Verlag Berlin Heidelberg 2014