Clinical exome sequencing is a powerful tool in the diagnostic flow of monogenic kidney diseases: an Italian experience

Background A considerable minority of patients on waiting lists for kidney transplantation either have no diagnosis (and fall into the subset of undiagnosed cases) because kidney biopsy was not performed or histological findings were non-specific, or do not fall into any well-defined clinical category. Some of these patients might be affected by a previously unrecognised monogenic disease.
Methods Through a multidisciplinary cooperative effort, we built an analytical pipeline to identify patients with chronic kidney disease (CKD) with a clinical suspicion of a monogenic condition or without a well-defined diagnosis. Following the stringent phenotypical and clinical characterization required by the flowchart, candidates meeting these criteria were further investigated by clinical exome sequencing followed by in silico analysis of 225 kidney-disease-related genes.
Results By using an ad hoc web-based platform, we enrolled 160 patients from 13 different Nephrology and Genetics Units located across the Piedmont region over 15 months. A preliminary "remote" evaluation based on well-defined inclusion criteria allowed us to define eligibility for NGS analysis. Among the 138 recruited patients, 52 (37.7%) were children and 86 (62.3%) were adults. Up to 48% of them had a positive family history for kidney disease. Overall, applying this workflow led to the identification of genetic variants potentially explaining the phenotype in 78 (56.5%) cases.
Conclusions These results underline the importance of clinical exome sequencing as a versatile and highly useful, non-invasive tool for genetic diagnosis of kidney diseases. Identifying patients who can benefit from targeted therapies, and improving the management of organ transplantation are further expected applications.
Electronic supplementary material The online version of this article (10.1007/s40620-020-00898-8) contains supplementary material, which is available to authorized users.


Introduction
The importance of genetic contributions in the development of chronic kidney disease (CKD) is underlined by several observations: (1) inherited CKD (IKD) represents a high percentage of all CKDs [1][2][3], (2) the presence of a first-degree relative with end-stage kidney disease (ESKD) confers a sevenfold increased risk of developing kidney failure [4], and (3) approximately 20-30% of patients report a positive family history of CKD in either a first- or second-degree relative [5,6]. Thus, IKD represents one of the leading causes of CKD in children and adults, resulting in an increased risk of mortality, the need for organ transplantation, and high health care costs.
In the paediatric and young adult subset, monogenic diseases account for up to 20% of patients who develop CKD before 25 years of age, with a diagnostic yield that varies across the different CKD categories [7][8][9].
Changes in DNA sequence are usually single nucleotide variants (SNVs) and small insertions or deletions (indels), but larger deletions or insertions called copy number variants (CNVs) may also occur, particularly in syndromic children.
There are several monogenic inherited diseases that cause CKD, including developmental disorders, cystic and noncystic ciliopathies, and glomerular and tubulo-interstitial diseases [10].
Establishing a genetic diagnosis strongly impacts patient management and prognosis [8,11,12], both by influencing treatment choices, as is the case for focal segmental glomerulosclerosis (FSGS), and by providing access to specific drugs, as is the case of vasopressin 2 antagonists for patients with autosomal dominant polycystic kidney disease (ADPKD). For these reasons, genetic testing is increasingly utilized in clinical nephrology due to accessibility of next generation sequencing (NGS) technologies [13,14], which are non-invasive and cost-effective, and are becoming part of the diagnostic flow for several diseases, due to their decreasing costs, high throughput abilities and reduced sequencing times [15]. In this context, NGS technology enables us to simultaneously investigate hundreds of genes, thus opening up the possibility to rapidly identify genetic factors that underlie IKDs.
This study reports on the set-up of an easy-to-use and accessible genetic testing platform which can be used to characterize undiagnosed cases of CKD eligible for NGS testing. Specifically, through this analytical pipeline, we aimed at (1) confirming diagnoses, particularly for patients in whom a monogenic condition was suspected, (2) finding the genetic cause of previously undiagnosed diseases, (3) identifying patients who could benefit from targeted therapies and (4) improving the management of organ transplantation, particularly in the living donor setting. The study included 160 patients, recruited in 13 nephrology or genetic counselling services across the Piedmont Region (north-west Italy, with a population of approximately 4,356,000), and coordinated by the IGTS between September 2018 and December 2019. The IGTS performed genetic testing, while recruiting centres are reported in Table 1. Overall, these centres follow > 3100 patients on dialysis and approximately 2500 transplanted patients (detailed data are available at www.trapiantipiemonte.it).

Patients
All patients included in the study provided written informed consent.

Set-up of the platform for genetic diagnosis of kidney diseases potentially leading to organ failure
We set up a web-based genetic service to provide initial genetic counselling to support regional nephrology centres in Piedmont that requested genetic evaluation (Fig. 1). Whenever possible, the referral centre provided IGTS with the patient's medical records including a detailed family history, clinical data from routine diagnostic procedures, parameters of kidney function, imaging data and biopsy results (https://www.cse.crtpiemonte.it/auth/CRT%20LoginGENnew.html).
The platform allowed remote multidisciplinary consultation in order to decide whether patients were eligible for this type of genetic test and to allocate patients with CKD into one of the following categories: (a) Patients with a positive family history for CKD; (b) Patients for whom genetic confirmation of the clinical diagnosis was required; (c) Patients with CKD with no clinical diagnosis of a definite disease.

Genetic testing
DNA was extracted from blood samples, evaluated for integrity, and then processed for NGS analysis. Sequencing data were analysed by bio-informatics tools to identify, annotate and prioritize variants in order to generate a technical report. Variants were included in the genetic report that was then shared with the referring physician. Sanger sequencing on a second independent DNA extraction was performed to confirm NGS results. When possible, family segregation studies were performed. The outcome of the genetic test was shared with the clinical team to plan the following steps (Fig. 1). Patients were referred to the closest genetic counselling centre.

Diagnostic cohort
The validity of the platform and of the analytical pipeline was tested in a training cohort of 29 blindly tested patients for whom clinical and genetic diagnoses were already available. In each case previous genetic diagnosis was confirmed, suggesting that the adopted workflow is effective in the identification of monogenic kidney disease causative variants.
Of the 160 patients for whom genetic analysis was requested, 22 were excluded after re-evaluation (due to older age or confounding co-morbidities), while 138 were eligible for NGS analysis. Among them, 52 were children (< 18 years old, 37.7%), while 86 were adults (62.3%). Seventy-eight of 138 (56.5%) were male [24 (46.2%) in the paediatric and 54 (62.8%) in the adult cohort, Table 2]. Sixty-seven out of 138 patients (48.5% in total; 34.6% in the paediatric and 57.0% in the adult subset) had a positive family history for kidney disease (Table 2).

Clinical exome sequencing and raw data processing
Libraries were prepared using the TruSight One Expanded Sequencing Kit (Illumina, San Diego, CA, USA) following the manufacturer's instructions. Raw data were processed as reported in the Supplemental Methods (Online Resource). The choice of the clinical exome approach was dictated by the experimental need to perform a single sequencing run, followed by flexible in silico analysis of organ-specific gene panels (e.g. kidney or liver), further tailored on the basis of the clinical suspicion, if available. The list of causative genes associated with kidney disease is updated twice a year, with possible re-analysis of patients with negative genetic reports every 24 months, without the need for DNA re-sequencing.

Design of an ad hoc pipeline of analysis to identify causative genes
To perform variant calling and identify causative variants, we designed an ad hoc analysis pipeline based on sequential inclusion/exclusion steps. Reads were aligned to the GRCh37 reference genome using BWA, Isaac Aligner and GATK tools from Illumina; variants were then processed using Variant Interpreter software, retaining those supported by a phenotype-to-genotype correlation. Inheritance mode was considered next. Specifically, if heterozygous variants were found in genes associated with autosomal recessive (AR) diseases, they were carefully re-analysed to check for variants in genes known to be responsible for the clinical phenotype in association with other genes (digenic diseases).
Filtered variants were then annotated (1) on the basis of the main public databases reporting associations between variants and disease and (2) by considering the impact on protein structure or function by in silico prediction tools. Variants classified as "pathogenic C5" and "likely pathogenic C4" were always included in the genetic report, as were "variants of unknown significance (VUS) C3" in genes associated with diseases with autosomal dominant (AD) or X-linked recessive (in males) mode of inheritance, while C3 variants in genes associated with diseases having AR mode of inheritance were reported only if they were in line with the clinical phenotype (Fig. 2). Confirmation by Sanger sequencing and family segregation studies were performed whenever possible.
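The reporting rules described above can be summarised in a short Python sketch. This is an illustration only, not the pipeline used in the study: the `Variant` class, its field names and the function `include_in_report` are hypothetical. The text does not state how C3 variants in X-linked recessive genes were handled in females; the sketch assumes they were not reported, and that assumption is marked in the code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """Hypothetical record for one annotated variant (illustrative fields)."""
    gene: str                     # placeholder gene label
    acmg_class: str               # "C5", "C4" or "C3"
    inheritance: str              # "AD", "AR" or "XLR" (X-linked recessive)
    patient_is_male: bool = False
    fits_phenotype: bool = False  # variant in line with the clinical phenotype


def include_in_report(v: Variant) -> bool:
    """Apply the reporting rules as described in the text."""
    # Pathogenic (C5) and likely pathogenic (C4) variants are always reported.
    if v.acmg_class in {"C5", "C4"}:
        return True
    if v.acmg_class == "C3":
        # VUS in AD genes are reported.
        if v.inheritance == "AD":
            return True
        # VUS in X-linked recessive genes are reported in males.
        # ASSUMPTION: in females they are left out (not stated in the text).
        if v.inheritance == "XLR":
            return v.patient_is_male
        # VUS in AR genes are reported only if in line with the phenotype.
        if v.inheritance == "AR":
            return v.fits_phenotype
    return False


if __name__ == "__main__":
    examples = [
        Variant("GENE_A", "C5", "AR"),
        Variant("GENE_B", "C3", "AD"),
        Variant("GENE_C", "C3", "XLR", patient_is_male=True),
        Variant("GENE_D", "C3", "AR", fits_phenotype=False),
    ]
    for v in examples:
        print(v.gene, v.acmg_class, v.inheritance, include_in_report(v))
```

Keeping the rules in one pure function makes it straightforward to re-apply them when a variant is re-classified during periodic re-analysis.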
Classification of the identified variants and their description in the genetic report were in line with The American College of Medical Genetics and Genomics (ACMG) policy statement on clinical sequencing (https://www.acmg.net/) and with the Italian Society of Human Genetics (SIGU) [16].

Costs related to the NGS approach for the diagnosis of genetic kidney diseases
Overall, the cost of analysis per sample was differentiated on the basis of clinical suspicion: if a specific disease with < 3 causative genes was suspected the cost charged to the national health system was 1062 euros. For all other cases the cost charged to the national health system was 2262 euros.
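The two-tier charge can be expressed as a minimal Python sketch. The function name `cost_per_sample` and its parameter are hypothetical; the two amounts and the "< 3 causative genes" threshold are taken from the text, while treating a missing or zero gene count as "all other cases" is an assumption.

```python
from typing import Optional

TARGETED_COST_EUR = 1062  # specific disease suspected, < 3 causative genes
STANDARD_COST_EUR = 2262  # all other cases


def cost_per_sample(n_candidate_genes: Optional[int]) -> int:
    """Return the cost (in euros) charged to the national health system.

    n_candidate_genes is the number of causative genes for the suspected
    disease, or None when no specific disease is suspected (assumption).
    """
    if n_candidate_genes is not None and 0 < n_candidate_genes < 3:
        return TARGETED_COST_EUR
    return STANDARD_COST_EUR


if __name__ == "__main__":
    for n in (1, 2, 3, None):
        print(n, cost_per_sample(n))
```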

Overall genetic findings
Overall, by adopting the reported bio-informatics analysis pipeline, we detected 129 variants in 65 genes, with 28 patients carrying more than one variant. Interestingly, of all these variants, only 3 were recurrently present in more than one patient, while all the others were uniquely carried by individual patients.
Genetic variants were classified according to ACMG guidelines. In 78/138 (56.5%) patients, at least one variant was compatible with the clinical phenotype, as indicated in Table 3. In the remaining (60/138; 43.5%) patients, variants were either not present, or heterozygous in autosomal recessive genes or they were not in line with the clinical phenotype (not shown). Among patients for whom we identified variants compatible with the phenotype, 43 (55.1%) presented heterozygous variants in genes associated with autosomal dominant diseases, 16 (20.5%) were homozygous or compound heterozygous with variants in genes associated with autosomal recessive disease (among which 1 was a copy number loss) and 11 (14.1%) were characterized by variants in genes mapping on chromosome X (among which 2 were copy number losses). Lastly, 8 patients (10.3%) presented with variants in genes with both an autosomal dominant and autosomal recessive mode of inheritance (Table 3; Fig. 3a).
Furthermore, when classifying all the variants identified by clinical exome sequencing according to ACMG guidelines to describe mutations in genes that cause Mendelian disorders, we found 27 variants defined as "pathogenic C5" (21.0%), 35 as "likely pathogenic C4" (27.1%) and 67 as "variants of unknown significance C3" (51.9%) (Table 3; Fig. 3c), considering that 28 patients were characterized by the presence of more than one variant with different classification.
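As a quick consistency check, the proportions reported in this section can be recomputed from the raw counts. The sketch below uses only counts quoted in the text; the helper name `shares` and the dictionary labels are illustrative. Note that 27/129 rounds to 20.9% with one decimal, marginally below the 21.0% quoted, presumably a rounding adjustment.

```python
def shares(counts: dict) -> dict:
    """Return each count as a percentage of the total, to one decimal."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


# Patients with phenotype-compatible variants, by inheritance mode (n = 78).
inheritance = {"AD": 43, "AR": 16, "X-linked": 11, "AD and AR": 8}

# All detected variants, by ACMG class (n = 129).
acmg = {"C5": 27, "C4": 35, "C3": 67}

if __name__ == "__main__":
    print(sum(inheritance.values()), shares(inheritance))
    print(sum(acmg.values()), shares(acmg))
```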

Association between clinical and molecular diagnosis
The diagnosed cases, defined as patients for whom genetic variants in line with the clinical phenotype were identified, were differentially distributed across the clinical suspicion categories (Table 3; Fig. 4). A high detection rate was obtained in glomerular diseases (14/21 cases; 66.7%), especially Alport disease, and in ciliopathies (22/32 cases; 68.8%), particularly ADPKD, while for tubular diseases and HUS, causative variants were identified in 4 out of 11, and 1 out of 4 cases, respectively. In the nephrolithiasis and nephrocalcinosis subset, one patient presented with a potentially causative variant in a relevant gene. With regard to the remaining categories, phenotype-related variants were detected in 50% of cases (4 out of 8). Moreover, our NGS approach identified the genetic culprit in a substantial proportion of cases presenting with organ failure of unknown origin (32/60 cases; 53.3%).
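The per-category figures above can be tabulated and cross-checked against the cohort totals with a small Python sketch. The category labels and the `Category` type are illustrative. One value is an inference rather than a quoted figure: the size of the nephrolithiasis and nephrocalcinosis subset is taken as 2, which follows from the 50% detection rate cited in the Discussion and from the cohort total of 138.

```python
from typing import NamedTuple


class Category(NamedTuple):
    solved: int  # patients with variants in line with the clinical phenotype
    total: int   # patients referred with this clinical suspicion


cohort = {
    "glomerular diseases": Category(14, 21),
    "ciliopathies": Category(22, 32),
    "tubular diseases": Category(4, 11),
    "HUS": Category(1, 4),
    # ASSUMPTION: denominator of 2 inferred from the 50% rate and n = 138.
    "nephrolithiasis/nephrocalcinosis": Category(1, 2),
    "remaining categories": Category(4, 8),
    "unknown origin": Category(32, 60),
}


def detection_rate(c: Category) -> float:
    """Percentage of solved cases in a category, to one decimal."""
    return round(100 * c.solved / c.total, 1)


if __name__ == "__main__":
    for name, c in cohort.items():
        print(f"{name}: {c.solved}/{c.total} ({detection_rate(c)}%)")
    solved = sum(c.solved for c in cohort.values())
    total = sum(c.total for c in cohort.values())
    print(f"overall: {solved}/{total} ({round(100 * solved / total, 1)}%)")
```

Under that assumption the categories sum to 78 solved cases out of 138, matching the overall 56.5% yield.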
Among the cohort of patients with variants identified by NGS, all cases were validated by Sanger sequencing performed on a second independent aliquot of DNA. When possible, specifically in 23/52 paediatric patients, variant(s) were validated in the proband and in the trio. This analysis confirmed the segregation of variants in the family and helped clarify the clinical significance of "C3 VUS".

Discussion
In this study we describe an ad hoc-designed web-based platform built to connect regional Nephrology and Genetics centres to a centralized facility that provides genetic testing for patients with CKD. Herein we share our preliminary 15-month experience in applying this targeted sequencing to achieve a genetic diagnosis for undiagnosed patients with CKD in a well-selected cohort of patients from north-west Italy.
Our study presents two points that are worthy of interest. First, the feasibility of a centralized platform to support multidisciplinary consultation in patients with a high clinical suspicion of a monogenic condition. Second, an improvement in the diagnostic rate of patients with CKD and no previous definite diagnosis.
Our approach is based on a web-based platform as an extension of the existing regional transplant network. By using this platform, we attempted to optimize multidisciplinary consultations for patients for whom a monogenic condition was suspected. In order to explain the philosophy [...] Patients' samples and informed consent were obtained through the Nephrology or Genetic Counselling Services, thus overcoming the need for the patient and his/her family to travel. The connection of the IGTS to the various nephrology Units throughout the Region was made possible by the widespread network of the Regional Centre for Transplantation. To speed up the connection between "the edge" and "the centre" of the hub, an IT platform was set up allowing clinicians and geneticists to share clinical data and genetic reports. Recruited patients were initially evaluated by geneticists for their eligibility for NGS based on several criteria including family history and clinical data. Analyses of sequencing data and identification of the causative variants were performed based on an ad hoc pipeline.
Close to 300 patients were recruited between September, 2018 and March, 2020, and a final genetic report was available for 160 of them after a median time for genetic analysis of 6 months. The remaining 140 patients were in different steps of the diagnostic process at the time of this interim analysis.
In this study, we performed clinical exome sequencing followed by an in silico analysis focused on selected genes in a cohort of 138/160 recruited patients affected by CKD. In 56.5% of cases NGS analysis was able to determine the molecular genetic cause of the disease, revealing 129 variants in 65 genes. These results are on average higher than those reported in the literature, likely due to patient preselection on the basis of positive family history and clinical suspicion [9].
With regard to the need to provide genetic confirmation of a previous clinical diagnosis, NGS analysis was able to confirm 68.8% of ciliopathies, a percentage that is in line with previous publications [17]. The detection rate was higher in glomerular diseases (66.7% vs. 14% reported in the literature) and nephrolithiasis (50% vs. 15-30%) [9]. This high percentage is due to selection of patients with a suspicion of Alport disease, at least based on biopsy results. In contrast, the percentage of solved cases presenting with Congenital Anomalies of the Kidneys and of the Urinary Tract (CAKUT) and haemolytic uraemic syndrome was quite low, with a considerable number of cases remaining undiagnosed. A reason for these results could be related to either the genetic heterogeneity of the disease, with many causative genes still to be identified, or to non-genetic causes [8].
In a considerable subset of the recruited cohort, patients were referred to genetic analysis because of a kidney disease of unknown origin. As expected, based on previous experience from other centres, this approach proved to be efficient in revealing causative variants: in a significant number of these cases, we were able to identify genetic variants that were in line with the clinical phenotype, thus helping clinicians in the management of these patients. Surprisingly, in our cohort, the percentage of patients for whom a genetic variant in line with the clinical phenotype was identified was not so different when considering paediatric (57.7%) and adult (55.8%) subgroups. One explanation is that our adult cohort was carefully selected for patients with a strong suspicion of an underlying genetic condition. In line with the selection of the cohort is the limited number of cases that were re-classified. Of note, 18 out of 60 patients lacking a definitive diagnosis were children. NGS application to this subgroup appeared to be a useful tool as it resulted in the detection of variants in an appreciable number of cases (10 out of 18; 55.5%), and provided a genetic explanation for their clinical condition.
Establishing a precise genetic diagnosis, especially for childhood-onset CKD, allows for pre-emptive screening for extra-renal manifestations. In some cases, the kidneys are not the only affected organs and variants in specific genes may cause syndromic diseases. In other cases, the phenotype is the result of hypomorphic mutations leading to variable expressivity and thus resulting in varying clinical manifestations. Moreover, it must be kept in mind that some disease-causing variants arise de novo, in the absence of a family history. Finally, because of the high phenotypic heterogeneity, several forms of IKD may become evident only later in life, when patients reach ESKD. Establishing an early and accurate diagnosis will result in better patient management, improving quality of life, and avoiding useless treatments. Furthermore, it allows early screening of at-risk family members.
This technical approach has some known drawbacks. In exome sequencing, variants occurring in the intronic and promoter regions cannot be identified, and not all genomic regions are equally covered. Moreover, regions with high guanine-cytosine content or with high sequence homology with pseudogenes may be missed. Even detection of copy number variations or structural variants can be difficult, and such findings need to be further validated by alternative approaches. An additional limitation of this type of sequencing is represented by the detection of pathogenic variants in the MUC-1 gene, represented by duplicated C or inserted A nucleotides within the coding variable-number tandem repeats (VNTRs), which cannot be identified by exome or genome sequencing, but can only be identified by targeted analysis [18]. Finally, we have to underline that some genes known to be associated with specific CKD phenotypes are not included in this clinical exome panel, and thus variants occurring in these genes cannot be investigated. It is also worth pointing out that the list of genes involved in CKD is progressively expanding [9]; therefore, applying the updated list of genes in the reanalysis of previously sequenced patients who received a non-conclusive or negative genetic diagnosis may result in the identification of causative genes.
Likewise, variants of unknown significance identified by NGS can be re-classified over time, benefiting from periodic updates. These latter observations also justify the choice of the experimental approach adopted in this study based on clinical exome sequencing instead of limited and fixed targeted sequencing panels.
Fig. 4 Clinical and genetic diagnosis in the Piedmontese CKD cohort. Patient cohort is divided on the basis of the clinical suspicion (inner pie). Number and percentage of patients for each macrocategory are indicated outside the outer pie, which instead represents the percentage of patients with identified causative variants (variants in line with the clinical phenotype) and patients with no causative variants identified or variants incompatible with the clinical phenotype for each disease category. Specific percentages of these cases are reported on the right with a colour-code legend
In conclusion, this study shows that clinical exome sequencing is a non-invasive, highly effective tool for genetic diagnosis if the program is supported by careful candidate selection. It can be useful in identifying patients who would benefit from targeted therapies, such as vasopressin 2 antagonists in the case of ADPKD. Furthermore, it may influence therapy choices, particularly in the case of FSGS, and the selection of the ideal family member as a kidney donor. This approach is especially applicable in geographic areas where the interaction between a robust nephrological network and genetic facilities is long-standing. Lastly, it can be cost-effective, especially if it is applied early in the diagnostic flow of the patient as it may (1) provide an early diagnosis and (2) avoid unnecessary treatment, while guiding the nephrologist towards the best management of the patient. For all these reasons, this approach could become, in well-characterized cases, an essential step of the diagnostic path.