Whole-exome sequencing reveals genetic variants associated with chronic kidney disease characterized by tubulointerstitial damages in North Central Region, Sri Lanka

The familial clustering observed in chronic kidney disease of uncertain etiology (CKDu), characterized by tubulointerstitial damage, in the North Central Region of Sri Lanka strongly suggests the involvement of genetic factors in its pathogenesis. The objective of the present study was to use whole-exome sequencing to identify genetic variants associated with CKDu. Whole-exome sequencing of eight CKDu cases and eight controls was performed, followed by direct sequencing of candidate loci in 301 CKDu cases and 276 controls. The association study identified rs34970857 (c.658G>A/p.V220M), located in the KCNA10 gene encoding a voltage-gated K channel, as the most promising SNP, with the highest odds ratio of 1.74. Four rare variants were identified in the gene encoding laminin beta 2 (LAMB2), mutations in which are known to cause congenital nephrotic syndrome. Three of the four LAMB2 variants were novel and were found exclusively in cases. These genetic investigations provide strong evidence of genetic susceptibility to CKDu. They also suggest that several rare variants associated with CKDu may be present in this population.


Introduction
Chronic kidney disease (CKD) is a global public health issue owing to its increasing prevalence and the economic burden it incurs [1][2][3]. This has been attributed to the increasing incidence of conventional risk factors such as diabetes and hypertension in populations around the world. However, the involvement of genetic factors in the pathogenesis of CKD has recently been highlighted, and a number of genetic polymorphisms have been identified [3].
An increasing incidence of CKD over the past two decades has been reported in the North Central Province and the neighboring provinces of Sri Lanka (hereafter referred to as the North Central Region, NCR) [4]. Aturaliya and coworkers reported that the major risk factors for CKD could not be identified in 87 % of these CKD patients through the diagnostic tests available at the time. This entity, characterized by tubulointerstitial damage, has since been termed CKD of uncertain etiology (CKDu) [5].
Over the years, several hypotheses have been proposed on the etiology and pathogenesis of CKDu, mostly concerning the possible involvement of environmental factors. Involvement of heavy metals such as cadmium and arsenic has been suggested, but the results have been inconsistent [6][7][8][9]. Fluoride was also among the earliest etiological factors suggested for CKDu [10]. In addition, the involvement of herbicides, mycotoxins, pesticides, and algal toxins in CKDu has been proposed and debated in the recent past [9,11]. Despite continuous efforts, investigations to date have been unable to provide adequate scientific evidence to identify the exact cause or causes of CKDu, and a complex involvement of both genetic and environmental factors is strongly suspected [4].
One of the earlier epidemiological studies postulated a family history of renal dysfunction as a risk factor for CKDu. This has been largely supported by the familial clustering of the disease [4,12]. The inheritance pattern in CKDu did not support a single-gene disease but matched that of multifactorial diseases such as diabetes, coronary heart disease and cancer [4]. Thus, it could be surmised that the disease develops in genetically susceptible individuals when they are exposed to one or more environmental toxins. With this working hypothesis, we first screened a population for genetic polymorphisms that may predispose to CKDu using a genome-wide association study (GWAS). This identified a genome-wide significant single-nucleotide polymorphism in the SLC13A3 gene, which encodes sodium dicarboxylate cotransporter 3 located in renal tubular cells, with a population attributable risk of 50 % [13].
GWAS of complex traits have proven successful in identifying common variant associations. However, such methods often overlook the large contributions of ethnicity-specific minor alleles, which can also confer a substantial risk of complex trait diseases [14,15]. As the second step, in the present study, whole-exome sequencing was performed to identify probable minor genetic variants associated with increased genetic susceptibility to CKDu in this cohort.

Ethics statement
Ethical approval for the study was obtained from the ethical committees of Kyoto University, Japan, and Faculty of Medicine, University of Peradeniya, Sri Lanka. All specimens, clinical reports, and questionnaire data were obtained following informed written consent.

Study population
A cohort of 311 male CKDu cases and 286 male controls from two highly affected areas (Medawachchiya and Girandurukotte) was established in 2011 for the GWAS (the recruitment process and sample collection are described in detail in our latest publication) [13]. CKDu cases were defined by the absence of known risk factors for CKD and the presence of the typical pathological lesion of tubulointerstitial fibrosis. Hypertensive nephropathy was excluded by the absence of a history of chronic or uncontrolled hypertension before the diagnosis of CKDu. Diabetic nephropathy was excluded by the absence of a history of diabetes preceding the diagnosis of CKDu and a glycosylated hemoglobin (HbA1c) level <6.5 % at the time of recruitment. Other known risk factors such as autoimmune diseases were also excluded by clinical history, examination findings, antibody assays and immunohistochemistry. Controls were unrelated, apparently healthy males from the same region without diabetes (no history of diabetes and HbA1c <6.5 % at the time of recruitment), without hypertension (no history of hypertension and blood pressure <140/90 mmHg at the time of recruitment) and without evidence of renal function impairment (no history of renal diseases, serum creatinine <1.2 mg/dL and alpha 1 microglobulin <15.5 mg/L at the time of recruitment). For the whole-exome sequencing, eight individuals with CKDu and eight controls were randomly selected to equally represent the two endemic areas.
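The control criteria above can be expressed as a single eligibility check. The following is a minimal sketch, not the study's actual screening procedure: the record type and field names are hypothetical, and the thresholds are read as strict upper bounds.

```python
from dataclasses import dataclass


@dataclass
class Participant:
    """Hypothetical screening record; field names are illustrative only."""
    diabetes_history: bool
    hba1c_percent: float
    hypertension_history: bool
    systolic_mmhg: float
    diastolic_mmhg: float
    renal_disease_history: bool
    serum_creatinine_mg_dl: float
    alpha1_microglobulin_mg_l: float


def is_eligible_control(p: Participant) -> bool:
    """Apply the control criteria from the text (all limits assumed strict)."""
    no_diabetes = not p.diabetes_history and p.hba1c_percent < 6.5
    no_hypertension = (
        not p.hypertension_history
        and p.systolic_mmhg < 140
        and p.diastolic_mmhg < 90
    )
    normal_renal_function = (
        not p.renal_disease_history
        and p.serum_creatinine_mg_dl < 1.2
        and p.alpha1_microglobulin_mg_l < 15.5
    )
    return no_diabetes and no_hypertension and normal_renal_function
```

Sex, region of residence and relatedness are not covered by this check and would need to be verified separately.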
Following informed written consent, blood samples (10 mL) were collected from peripheral veins into K-EDTA tubes. Serum was separated immediately by centrifugation at 3000 rpm for 10 min. Blood samples were immediately stored at 4°C and then transferred to -30°C within 6 h of collection. The samples were shipped to Japan at -20°C and stored at -80°C until analysis at the human specimen bank in Kyoto University, Japan. DNA was extracted from the whole blood samples using a commercially available extraction kit (QIAamp DNA Blood Mini Kit, Qiagen, Chatsworth, CA, USA) as per the manufacturer's instructions.
Exome sequencing was performed on the HiSeq 2000 platform. The target regions were captured using the SureSelect Human All Exon 50 M Kit (Agilent Technologies, Santa Clara, CA, USA) according to the manufacturer's protocols. Briefly, genomic DNA extracted from peripheral blood was randomly fragmented by acoustic fragmentation (Covaris) and purified using a QIAquick PCR Purification kit (Qiagen). Adapters were then ligated to both ends of the fragments. The resulting DNA library was purified using a QIAquick PCR Purification kit, amplified by ligation-mediated PCR and captured by hybridization to the SureSelect biotinylated RNA library "baits" (Agilent) for enrichment. Captured ligation-mediated PCR products were analyzed on an Agilent 2100 Bioanalyzer to estimate the magnitude of enrichment. Each captured library was then loaded on a HiSeq 2000 platform (Illumina), and paired-end sequencing was performed [16,17]. Single-nucleotide variants and small insertions/deletions were detected using the Genome Analysis Toolkit (GATK) software (http://www.broadinstitute.org/gsa/wiki/index.php/The_Genome_Analysis_Toolkit) [18]. Bioinformatics analyses were mainly performed using SVS version 7.6.10 (Golden Helix, Inc., Bozeman, MT, USA). Figure 1 illustrates the stepwise filtering process used to identify the candidate variants.
The obtained sequence reads (SNPs and indel variants) were aligned to the human reference genome GRCh37. Non-coding/splicing variants and variants with poor read depths (<25) were excluded from further analysis. Synonymous variants were also filtered out. Coding/splicing, non-synonymous variants were isolated, and the sorting intolerant from tolerant (SIFT) tool [http://sift.bii.astar.edu.sg] was used to predict SNPs with deleterious amino acid changes; all SNPs with SIFT scores of >0.05 were eliminated [19].
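The first filtering steps can be sketched as a predicate over annotated variants. This is an illustration only, not the SVS workflow used in the study: the Variant record and its field names are hypothetical, and dropping variants that lack a SIFT score is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional

MIN_READ_DEPTH = 25          # variants with depth below 25 are excluded
SIFT_DELETERIOUS_MAX = 0.05  # SIFT scores above 0.05 are eliminated


@dataclass
class Variant:
    """Hypothetical annotated variant; field names are illustrative only."""
    variant_id: str
    region: str                 # "coding", "splicing" or "non-coding"
    read_depth: int
    is_synonymous: bool
    sift_score: Optional[float]


def passes_initial_filters(v: Variant) -> bool:
    """Keep coding/splicing, well-covered, non-synonymous, SIFT-deleterious variants."""
    if v.region not in ("coding", "splicing"):
        return False
    if v.read_depth < MIN_READ_DEPTH:
        return False
    if v.is_synonymous:
        return False
    if v.sift_score is None:
        # Assumption: variants without a SIFT prediction are dropped.
        return False
    return v.sift_score <= SIFT_DELETERIOUS_MAX


def filter_variants(variants: List[Variant]) -> List[Variant]:
    return [v for v in variants if passes_initial_filters(v)]
```

Keeping the thresholds as named constants makes it easy to see how sensitive the candidate list is to each cut-off.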
Common variants with MAFs of ≥0.2 and indels with MAFs of ≥0.01 that were present in the NHLBI Exome Sequencing Project (ESP) database (data release December 2011, https://esp.gs.washington.edu) were eliminated. The finalized SNPs and indels were used to test gene associations with phenotypes using the kernel-based adaptive cluster (KBAC) method with permutation testing (1,000 permutations) [20]. All SNPs and indels with values of P < 0.2 were selected, and the Endeavour software [http://homes.esat.kuleuven.be/~bioiuser/endeavour/index.php] was used to identify the genes related to renal function. Seventeen known genes associated with kidney diseases (FGF23 [21], UMOD [22,23]
The 301 CKDu cases and 276 controls from the GWAS cohort (10 CKDu cases and 10 controls had been eliminated during the quality control process) were screened for all the SNPs shortlisted by whole-exome sequencing, using TaqMan probes or the restriction fragment length polymorphism method.
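The idea of permutation testing in the gene-level association step can be illustrated with a much simpler statistic. The sketch below is not KBAC: it substitutes a plain difference in mean rare-allele burden between cases and controls, and the function name, inputs and one-sided test are all assumptions made for illustration.

```python
import random
from typing import Sequence


def burden_permutation_p(
    case_burden: Sequence[int],
    control_burden: Sequence[int],
    n_permutations: int = 1000,
    seed: int = 0,
) -> float:
    """One-sided permutation P value for excess rare-allele burden in cases.

    Each input holds, per individual, the number of rare alleles carried in
    one gene. This is a simple burden test, used here as a stand-in for KBAC.
    """
    rng = random.Random(seed)
    n_cases = len(case_burden)
    n_controls = len(control_burden)
    observed = sum(case_burden) / n_cases - sum(control_burden) / n_controls

    pooled = list(case_burden) + list(control_burden)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign case/control labels at random
        stat = sum(pooled[:n_cases]) / n_cases - sum(pooled[n_cases:]) / n_controls
        if stat >= observed:
            at_least_as_extreme += 1
    # Add-one correction so the estimate is never exactly zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)
```

With eight cases and eight controls there are only 12,870 distinct label assignments, so 1,000 permutations sample a reasonable share of them, and the smallest P value this estimator can return is 1/1001.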

Results
After removing the non-coding/splicing variants and variants with poor read depths (<25), 48,176 coding variants were left for further analysis. Among these there were 22,919 synonymous variants, which were also removed. Of the 4,532 deleterious variants identified by the SIFT tool, only 3,940 had MAF <0.2 for SNPs and <0.01 for indels. These 3,940 variants were from 3,041 genes. The KBAC method identified 120 genes with P < 0.2 for association, of which 7 were identified as related to kidney function by the Endeavour software. The analysis process is illustrated in Fig. 1. The seven kidney-related genes (containing 11 deleterious SNPs) filtered through the analysis process were: PRCP (prolylcarboxypeptidase, angiotensinase C; Chr 11); LAMB2 (laminin, beta 2 (laminin S); Chr 3); KNG1 (kininogen 1; Chr 3); BANK1 (B cell scaffold protein with ankyrin repeats; Chr 4); KCNA10 (potassium voltage-gated channel; Chr 1); SLC7A13 (sodium-independent aspartate/glutamate transporter; Chr 8); and TJP1 (tight junction protein 1; Chr 15) (Table 1). Of the 11 SNPs, three were novel rare variants in the LAMB2 gene found exclusively in CKDu cases.
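The counts reported above can be laid out as a filtering funnel. The short sketch below only restates the numbers in the text; the one derived figure is the number of non-synonymous variants (48,176 minus 22,919), and the stage labels are paraphrases rather than the wording of Fig. 1.

```python
coding_variants = 48_176
synonymous_variants = 22_919
non_synonymous_variants = coding_variants - synonymous_variants  # derived

variant_stages = [
    ("coding variants, read depth >= 25", coding_variants),
    ("non-synonymous variants", non_synonymous_variants),
    ("SIFT-deleterious variants", 4_532),
    ("deleterious variants passing MAF filter", 3_940),
]
gene_stages = [
    ("genes carrying those variants", 3_041),
    ("genes with KBAC P < 0.2", 120),
    ("kidney-related genes (Endeavour)", 7),
]

for label, count in variant_stages + gene_stages:
    print(f"{label:<42}{count:>8,}")
```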
An association study was then conducted for these 11 SNPs. The MAFs observed through direct sequencing are shown in Table 1. rs34970857 (c.658G>A/p.V220M), located in the KCNA10 gene encoding a voltage-gated K channel, was the most promising SNP, with an odds ratio of 1.74 (P = 0.005 for QTL analysis; P = 0.06 for dichotomous analysis) (Table 1). Bioinformatics analysis with SIFT (http://sift.jcvi.org/) and PolyPhen (2.2.2) (http://genetics.bwh.harvard.edu/pph2/) predicted that p.V220M of KCNA10 is possibly damaging and that the missense mutations in the LAMB2 gene are probably damaging.
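For readers unfamiliar with how an allelic odds ratio is obtained, the calculation can be sketched from a 2x2 table of allele counts. The counts in the usage note are hypothetical and are not the study's data (those are in Table 1). The Woolf (log-normal) confidence interval is used here as one common choice and is not necessarily the method the authors applied.

```python
import math
from typing import Tuple


def allelic_odds_ratio(
    case_minor: int,
    case_major: int,
    control_minor: int,
    control_major: int,
    z: float = 1.96,
) -> Tuple[float, float, float]:
    """Return (odds ratio, lower, upper) with a Woolf confidence interval.

    All four allele counts must be positive; z = 1.96 gives a ~95 % interval.
    """
    odds_ratio = (case_minor * control_major) / (case_major * control_minor)
    se_log_or = math.sqrt(
        1 / case_minor + 1 / case_major + 1 / control_minor + 1 / control_major
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper
```

For example, with hypothetical counts of 20 minor and 80 major alleles in cases against 10 minor and 90 major alleles in controls, the odds ratio is (20 x 90) / (80 x 10) = 2.25. An interval that includes 1 indicates that the association is not significant at the chosen level.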

Discussion
Studies have consistently demonstrated the importance of genetic contributions to renal diseases [3,37,38]. In our previous GWAS, we reported a single common genetic variant in the SLC13A3 gene associated with CKDu [13]. Rare and generally deleterious genetic variants are also known to have a strong impact on disease risk in individual patients [15]. Rapid advances in next-generation sequencing techniques such as whole-exome sequencing have facilitated the identification of these rare variants. In the present study, we identified one genetic variant in the KCNA10 gene and four rare variants in the LAMB2 gene that were not identified in our previous GWAS. KCNA10 encodes a recently discovered voltage-gated potassium (K) channel located in the heart, aorta, glomerular endothelium, and apical membrane of renal proximal tubular cells [39,40]. Its presence in endothelial and vascular smooth muscle cells supports the notion that it may also regulate vascular tone [40]. Dysfunction of these K channels is known to be related to cardiovascular diseases such as hypertension [41]. The previously reported SLC13A3 gene has also been reported to have a suggestive association with blood pressure [42]. Thus, both SLC13A3 and KCNA10 may play roles in blood pressure regulation and the pathogenesis of hypertension. In our previous study of risk factors for disease progression and mortality in CKDu, hypertension was identified as a major risk factor for disease progression [43]. This evidence suggests that, although clinically evident hypertension is not known to precede CKDu, hypertension may have an unidentified, albeit important, role in the pathophysiology and progression of CKDu.
LAMB2 encodes laminin β2, a component of laminin-521, which is an important constituent of the glomerular basement membrane [30]. Detrimental missense mutations in the LAMB2 gene cause Pierson syndrome, a severe congenital nephrotic syndrome with ocular and neurologic defects [29,30,44,45]. In the present study, we did not find extra-renal manifestations such as ocular and neurologic defects associated with the mutations in the LAMB2 gene. Family history has been reported to be a significant predictor for CKDu, and our investigations so far support the view that genetic susceptibility plays an important role in the pathogenesis of CKDu [12,13]. In the present study, we identified four different rare variants by exome sequencing of only a limited number of cases and controls. This suggests that more rare variants may exist in this population. Previous evaluations of the pedigrees of CKDu patients have provided evidence in favor of a multifactorial disease rather than a single-gene disease [4]. The past and present genetic analyses also show that the pathogenesis of CKDu cannot be completely explained by genetic factors, but that genetic susceptibility confers a discernible risk of disease occurrence.