Introduction

Hereditary anemias (HA) are a genetically and phenotypically diverse group of disorders associated with mutations in more than 70 genes [1]. The underlying genetic defects affect the production of red blood cells in the bone marrow or cause red cell destruction through defects in structural proteins of the cell, synthesis of the globin chains, or the expression of intracellular enzymes. Clinicians rely on morphologic/biochemical tests and clinical characteristics; however, these approaches fail to detect a proportion of cases, and classification through these methods is not accurate [2]. The traditional laboratory workflow for HA diagnosis encompasses several lines of tests, adding cost and delaying results. Genetic analysis would therefore provide more informative clues for differential diagnosis, patient stratification, and genetic counseling.

Molecular diagnosis using routine sequencing approaches is impractical because of the large number and size of the genes involved. In this context, targeted sequencing analysis using NGS is an advantageous tool, since it enables simultaneous testing of several genes and has proven useful for accurate, rapid, and cost-effective diagnosis of several diseases, including hereditary anemias [3,4,5,6,7].

We developed a comprehensive NGS panel interrogating 35 genes related to red blood cell disorders, excluding hemoglobinopathies. The panel covered the coding regions, splice site junctions, and some regulatory regions, providing a high-throughput assay.

Materials and methods

Sample collection

A total of 36 samples from unrelated patients under clinical suspicion of hereditary anemia were included in this study. The available clinical characteristics, along with the results of osmotic resistance assays non-incubated or incubated with various salt solutions, are presented in Supplementary Tables 1–3. All the patients with hemolytic anemia had splenomegaly and icterus. When available, first-degree relatives of the probands were also recruited to validate the pathogenicity of potentially causative variants identified by the panel. Family members of patients with SPTA1 null mutations (patients 24, 35, 27, and 18) were also evaluated; clinical and molecular findings are shown in Supplementary Figs. 1–4, respectively. All patients gave written informed consent to the study; all procedures were approved by the University Ethics Committee of Campinas University.

A reference sample (NA12878) from the Coriell Institute for Medical Research Repository (Coriell Institute, Camden, NJ, USA) was used for comparing sequencing data. This sample was evaluated in triplicate, along with the 36 patients selected for the study. The libraries of each of the replicates were individually prepared using HaloPlex amplification followed by NGS.

The suspected clinical diagnosis before genetic testing was obtained by blood cell assays such as peripheral blood or bone marrow morphology, osmotic fragility assays, and enzyme activity assays.

HaloPlex capture probe design

Our custom-made targeted NGS panel covers 35 genes known to be mutated in red blood cell diseases, including 13 genes causing membranopathies, 14 genes related to red blood cell enzymopathies, and 8 genes known to cause congenital dyserythropoietic anemia (CDA), as detailed in Table 1. All exons and the promoter regions of ANK1 and PKLR genes were included. Coverage of the target regions was 99.63%.

Table 1 Genes included in our panel and related phenotypes

Targeted gene capture and library construction for NGS was performed using HaloPlex as described by the manufacturer (Agilent Technologies, Santa Clara, CA).

Next-generation sequencing

Samples were pooled in equimolar amounts and sequenced using 2 × 150 paired-end sequencing on the Illumina HiSeq 2500 instrument according to the manufacturer’s protocol.

Data analysis and filtering methods

Sequencing reads were aligned against the reference genome (UCSC hg19), and variants were called and annotated using the SureCall software (v.3.5.1.46; Agilent Technologies) and CLC Genomics Workbench (Qiagen).

Variants with a read depth (coverage) below 20 or a quality score below 20, and variants resulting in synonymous amino acid changes, were excluded.
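The exclusion criteria above can be sketched as a simple filter. This is a minimal illustration, not the SureCall or CLC implementation: the `Variant` record, its field names, and the `passes_hard_filters` helper are hypothetical, and the example records are invented solely to exercise the thresholds.

```python
from dataclasses import dataclass

MIN_DEPTH = 20    # read depth (coverage) threshold stated in the text
MIN_QUALITY = 20  # quality score threshold stated in the text


@dataclass(frozen=True)
class Variant:
    """Hypothetical variant record; field names are assumptions."""
    gene: str
    depth: int
    quality: float
    consequence: str  # e.g. "missense", "synonymous", "frameshift"


def passes_hard_filters(v: Variant) -> bool:
    """Keep a variant only if depth >= 20, quality >= 20, and it is not synonymous."""
    return (
        v.depth >= MIN_DEPTH
        and v.quality >= MIN_QUALITY
        and v.consequence != "synonymous"
    )


# Invented records, for illustration only (not data from the study).
calls = [
    Variant("SPTB", depth=150, quality=60.0, consequence="frameshift"),
    Variant("ANK1", depth=12, quality=55.0, consequence="missense"),      # low depth
    Variant("PKLR", depth=90, quality=15.0, consequence="missense"),      # low quality
    Variant("SPTA1", depth=200, quality=70.0, consequence="synonymous"),  # synonymous
]
kept = [v for v in calls if passes_hard_filters(v)]
```

Because the text excludes values "less than 20", a variant sitting exactly at a depth or quality of 20 is retained in this sketch.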

Variants previously classified as pathogenic in databases such as ClinVar, HGMD, and dbSNP, deleterious variants expected to produce a truncated or abnormal protein, and splice site variants were considered causative.

Novel missense variants classified as pathogenic by at least three in silico algorithms, such as SIFT (http://sift.jcvi.org/), Mutation Taster (http://www.mutationtaster.org/), Mutation Assessor (https://omictools.com/mutationassessor-tool), and Polyphen-2 (http://genetics.bwh.harvard.edu/pph2/), among others, were also retained as causative.
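The "at least three algorithms" rule amounts to a vote count over the prediction tools. The sketch below assumes each tool's output has already been reduced to a boolean (damaging or not); that reduction, the dictionary layout, and the function name are assumptions made for illustration, since the text does not specify per-tool cutoffs.

```python
MIN_PATHOGENIC_CALLS = 3  # "at least three in silico algorithms", per the text


def is_candidate_missense(predictions: dict[str, bool]) -> bool:
    """Return True when enough tools predict a damaging effect.

    `predictions` maps a tool name to True if that tool calls the
    variant damaging. How each raw score is converted to a boolean is
    left to the caller and is not specified in the source text.
    """
    return sum(predictions.values()) >= MIN_PATHOGENIC_CALLS


# Invented prediction set, for illustration only.
example = {
    "SIFT": True,
    "MutationTaster": True,
    "MutationAssessor": False,
    "PolyPhen-2": True,
}
retained = is_candidate_missense(example)
```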

The degree of evolutionary conservation of the encoded amino acid was estimated with Clustal Omega (https://www.ebi.ac.uk/Tools/msa/clustalo/). We also used the InterVar software [8] to predict the pathogenicity of the variants according to ACMG Guidelines [9]; benign variants were excluded.

Sanger sequencing

To validate the potentially pathogenic variants, bidirectional Sanger sequencing was performed using the BigDye Terminator Cycle Sequencing kit (Applied Biosystems, Life Technologies Corporation, Carlsbad, CA, USA) on an ABI Prism 3130xl Genetic Analyzer (Applied Biosystems, Foster City, CA, USA). Primer sequences are available upon request. Sequences were compared to reference sequences using CLC Sequence Viewer 7.8.1 (Qiagen, www.qiagenbioinformatics.com). Chromatograms were visualized with CHROMAS v.2.6.4 (www.technelysium.com.au).

Results

Hereditary spherocytosis (HS) was presumed in 26 patients, hereditary elliptocytosis (HE) in 1 patient, enzymatic deficiency (ED) in 3 patients, and congenital dyserythropoietic anemia (CDA) in 4 patients. Additionally, 2 patients had an inconclusive diagnosis due to the absence of clinical and biochemical tests. Of the 26 patients with HS, three patients displayed a moderately severe phenotype, three patients displayed a moderate phenotype, and 14 presented mild HS (detailed clinical characteristics are shown in Supplementary Tables 1–3). Criteria for disease severity stratification have been previously defined [10, 11].

Comparison of the variants detected in our sequencing of the reference sample NA12878 with exome sequencing databases for that sample showed a sensitivity of 98.7% and a specificity of 99.99% for the developed panel (Supplementary Table 4).
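Sensitivity and specificity of this kind are usually derived by comparing the panel's calls with a truth set over the assessed positions. The sketch below uses the standard definitions, sensitivity = TP / (TP + FN) and specificity = TN / (TN + FP). The function name, the tuple representation of a variant, and the toy inputs are assumptions; they do not reproduce the counts behind Supplementary Table 4.

```python
def sensitivity_specificity(
    called: set, truth: set, n_positions: int
) -> tuple[float, float]:
    """Compare called variants with a truth set over n_positions assessed sites.

    Every assessed position is counted once: as a true positive, false
    positive, false negative, or (by subtraction) a true negative.
    """
    tp = len(called & truth)   # called and present in the truth set
    fp = len(called - truth)   # called but absent from the truth set
    fn = len(truth - called)   # in the truth set but missed
    tn = n_positions - tp - fp - fn
    return tp / (tp + fn), tn / (tn + fp)


# Toy data, invented for illustration: (chromosome, position, ref, alt).
truth = {("chr1", 100, "A", "G"), ("chr1", 250, "C", "T"),
         ("chr8", 40, "G", "A"), ("chr14", 75, "T", "C")}
called = {("chr1", 100, "A", "G"), ("chr1", 250, "C", "T"),
          ("chr8", 40, "G", "A"), ("chr1", 900, "G", "C")}

sens, spec = sensitivity_specificity(called, truth, n_positions=1000)
```

Because true negatives vastly outnumber variant sites in any targeted region, specificity values close to 100% are expected even with a few false positives, which is why sensitivity is usually the more informative of the two figures.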

Potentially causative variants were identified in 26/36 (72%) of the subjects (Tables 2–4). In silico pathogenicity prediction by combined annotation-dependent depletion (CADD) score [12] was greater than 21.7 for all identified potentially pathogenic variants. The amino acid position conservation was 100% for all new missense variants. Deleterious variants with strong impact on the protein sequence including frameshift, nonsense, and splice site accounted for 64% (18/28) of the variants. Remarkably, 71% (20/28) had not been previously described and appear to be novel findings.
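The percentages in this paragraph follow directly from the stated counts, with patients as the denominator for the diagnostic yield and variants as the denominator for the other two figures. A small check, using only numbers given in the text (the helper name `percent` is ours):

```python
def percent(numerator: int, denominator: int) -> int:
    """Percentage rounded to the nearest whole number."""
    return round(100 * numerator / denominator)


diagnostic_yield = percent(26, 36)  # patients with a potentially causative variant
deleterious_share = percent(18, 28) # frameshift, nonsense, and splice site variants
novel_share = percent(20, 28)       # variants not previously described
```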

Table 2 Genetic variants, phenotype, and family history of 21 patients with membrane disorders
Table 3 Genetic variants, phenotype and family history of 3 patients with enzyme deficiency
Table 4 Genetic variants, phenotype, and family history of 2 patients with congenital dyserythropoietic anemia (CDA)

Genetic variants were detected in the SPTB gene (β-spectrin) in nine cases, in the ANK1 gene (ankyrin) in six cases, in the SPTA1 gene (α-spectrin) in four cases, in the SLC4A1 gene (band 3) in two cases, in the PKLR gene (pyruvate kinase) in two cases, in the G6PD gene (glucose-6-phosphate dehydrogenase) in one case, and in the CDAN1 gene (codanin) in two cases.
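The per-gene counts can be cross-checked against the captions of Tables 2–4 and the overall number of solved cases. In the sketch below, the case counts come from the text; the assignment of genes to disorder groups is inferred from the encoded proteins named above and from the table captions, and the variable names are ours.

```python
# Cases per gene, as reported in the text.
cases_per_gene = {
    "SPTB": 9, "ANK1": 6, "SPTA1": 4, "SLC4A1": 2,
    "PKLR": 2, "G6PD": 1, "CDAN1": 2,
}

# Grouping inferred from the encoded proteins and the captions of Tables 2-4.
groups = {
    "membrane disorders": ["SPTB", "ANK1", "SPTA1", "SLC4A1"],
    "enzyme deficiency": ["PKLR", "G6PD"],
    "CDA": ["CDAN1"],
}

cases_per_group = {
    name: sum(cases_per_gene[gene] for gene in genes)
    for name, genes in groups.items()
}
total_cases = sum(cases_per_gene.values())
```

The group totals (21, 3, and 2) match the patient numbers in the three table captions, and their sum matches the 26 patients with a potentially causative variant.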

A positive family history was obtained in 15 patients. Samples of affected and/or non-affected family members were available for 17 out of 36 patients. As part of the validation of putative pathogenic variants, we used Sanger sequencing and the medical history of first-degree relatives of the probands to confirm the association of the variant with the phenotype. De novo mutations were confirmed in three cases. All nine SPTB variants were heterozygous. Novel deleterious mutations expected to produce truncated β-spectrin were the most frequent (Table 2).

We identified two families (patients 24 and 35) with α-LELY in trans with a null SPTA1 allele segregating with a mild HS phenotype. Some affected relatives of these two patients who carried the null allele but not α-LELY presented a very mild phenotype, characterized by the presence of few spherocytes in the blood smear, no alteration in osmotic fragility, and a slight increase in reticulocyte counts (Supplementary Figs. 1–2). Two other families (18 and 27) with null mutations in one SPTA1 allele presented mild HS (Supplementary Figs. 3–4). The clinical characteristics of the affected patients are very similar to those of the relatives in family 35.

Of the three cases with enzymatic deficiency, variants in the pyruvate kinase gene (PKLR) were identified in two patients (17 and 34). In patient 17, the variant was present in homozygosis, and the condition is explained by inbreeding in the family. In patient 34, two different potentially pathogenic variants were identified in heterozygosis, one of which is a frameshift mutation not previously described. One asymptomatic male with extremely reduced activity of glucose-6-phosphate dehydrogenase had the G6PD p.R166C variant, located on the X chromosome. This variant was described in 2012 [13] (Table 3).

Of the four patients previously diagnosed with congenital dyserythropoietic anemia (CDA), two (25 and 42) had variants in the CDAN1 gene (Table 4), characteristic of CDA type I.

Discussion

This study aimed to design a targeted NGS panel that could improve diagnosis and classification of patients with HA. We evaluated the performance of our panel in 36 patients with clinical suspicion of HA. Although most of our patients had a previous diagnosis based on clinical/biochemical tests, variants that explain the etiology of the hematological disorders were identified in 26 cases (72%), allowing genetic counseling and appropriate medical supervision for family members carrying the deleterious variants.

A previous Brazilian study identified ANK1 variants in only 10% of HS patients, suggesting that mutations in ANK1 might not be as common in Brazil as described for the Northern European population, where ANK1 mutations account for ~50% of HS cases [10, 14, 15]. On the other hand, heterozygous variants in the SPTB gene were present in approximately 25% of HS patients of European ancestry [15]. In our study, we identified SPTB variants in 34.6% and ANK1 variants in 23.1% of the HS patients.

Although patients with β-spectrin deficiency typically have a mild to moderately severe presentation of the disease and ankyrin defects are associated with a wider range of clinical severity, from mild to severe [16], disease severity did not differ between β-spectrin- and ankyrin-deficient patients in our study; both defects were identified mostly in patients presenting mild HS (Supplementary Table 5).

Genuine nondominant HS, in which probands are compound heterozygous for α-spectrin genetic variants, has rarely been described [16, 17]. Most of the reports identified null mutations associated with low-expression alleles, generally the α-LEPRA (low-expression Prague) allele [17,18,19,20]. The α-LEPRA mutation (c.4339-99C>T) occurs in the SPTA1 gene in about 5% of Caucasians [17]. Interestingly, α-LEPRA was not identified in any individual in our study, corroborating previous unpublished results of our group.

The p.L1858V polymorphism in exon 40 of α-spectrin (SPTA1) is in linkage disequilibrium with the variant NC_000001.10:g.158587858G>A in intron 45, which is associated with partial skipping of exon 46. The allele containing both variants is known as α-LELY, a low-expression allele. The frequency of α-LELY is around 23% in the general population (http://exac.broadinstitute.org/); however, we found a higher proportion in our sample (61%). Such a high frequency was not observed in previous studies of non-affected Brazilian individuals and could reflect an anecdotal finding.

Null mutations in the SPTA1 gene were identified in four families with hereditary spherocytosis (18, 24, 27, and 35). The clinical characteristics of individuals with null mutations in only one allele were very similar (spherocytes in the blood smear, high reticulocyte count, and normal osmotic fragility), supporting the view that null mutations in SPTA1 produce dominant mild HS. In two patients of these families (24 and 35), α-LELY was identified in trans with the null allele and appeared to aggravate the phenotype (higher reticulocytes, lower hemoglobin, and increased osmotic fragility) in comparison to family members without α-LELY (Supplementary Figs. 1–2).

In hereditary elliptocytosis, it is well known that α-LELY in trans to a missense mutation in α-spectrin enhances the severity of the disease [21,22,23,24]. In spherocytosis, however, one previous study concluded that α-LELY in trans to a null HS allele of the SPTA1 gene does not cause HS, because the supply of spectrin remains sufficient to meet the needs of the membrane [19]. The divergence between this previous report and our results may be caused by differences in genetic background or in the methodologies used to classify the disease. Other studies have demonstrated that the clinical severity of HS varies considerably even within a single family, reflecting the genetic heterogeneity of the disorder.

The clinical diagnosis of patients with hereditary anemias is often difficult. Thus, tools for accurate genetic analysis become crucial, mainly in cases with an ambiguous phenotype. Despite the increasing affordability and automation of NGS analyses, the identification of pathogenic variants associated with HA through sequencing panels remains challenging because of the high price per sample and laborious data processing. We believe the budgetary limitations on broader use of NGS platforms will be resolved by the gradual reduction in the cost of this technology over time. Cost reduction will allow simultaneous evaluation of family members to establish the inheritance pattern, testing of newborns with a family history of HA, and confirmation of the pathogenic role of the identified variants.

Conclusion

The panel developed in this study proved to be effective for the detection of causative mutations in hereditary anemias and has also contributed to a better understanding of the genetic basis and phenotypic consequences of these rare conditions in the Brazilian population. Even though our study was mostly limited to cases with membrane disorders, the gene panel has the capacity to interrogate other causes of inherited anemias (e.g., enzyme disorders and CDA). Variants in the SPTB gene were the most frequent cause of HS in our sample, resulting in variable phenotypic severity. Twenty new variants were identified; most of them were null (nonsense or frameshift) mutations. The α-LELY variant was identified in most cases and appeared to modulate the severity of HS when in trans to a null allele in the SPTA1 gene.