Introduction

Severe combined immunodeficiency (SCID), characterized by extremely low or absent T cell production, defective T cell function and absent antibody responses, can be caused by defects in any of several genes and if untreated leads to early death due to infections [1, 2]. Population-based newborn screening for SCID has been recommended to identify affected infants before the onset of devastating infections so that effective treatment can be provided [3–6]. A newborn screening test for SCID, now implemented in several states, ascertains T-cell receptor excision circles (TRECs), DNA byproducts of T cell antigen receptor gene rearrangement, as a biomarker of normal T cell development [5, 7–10]. TRECs are measured by quantitative PCR (qPCR) of DNA isolated from infant dried blood spot (DBS) samples universally collected in nurseries. For infants with undetectable or low TRECs, or with unsatisfactory DNA amplification, a differential white blood count and analysis of lymphocyte subsets by flow cytometry are obtained to establish the absolute number of naïve T cells, after which further clinical and laboratory evaluations are performed to arrive at a definitive diagnosis [9].

Beyond typical SCID cases, TREC screening has detected a spectrum of infants with inadequate numbers of diverse, autologous T cells. As predicted before the start of screening, “leaky” SCID and Omenn syndrome, both due to hypomorphic mutations in SCID genes, have been found in infants with low TRECs, as have cases of DiGeorge syndrome/chromosome 22q11 deletion in which a substantial degree of thymic insufficiency exists. In addition, secondary causes of T lymphocytopenia have included abnormal loss of T cells from the peripheral circulation, such as with chylothorax or hydrops. A challenging and less anticipated category of cases with abnormal TREC results has been the infants with persistent T lymphocytopenia of 300–1,500 T cells/μL, no maternal T cell engraftment, and absence of identified deleterious mutations in common SCID genes. These infants have been designated as combined immunodeficiency (CID) or SCID variants by TREC newborn screening programs [9]. CID or SCID variant cases have been of particular interest, providing an opportunity to discover previously unappreciated causes of newborn T lymphocytopenia. In the absence of clues to narrow the number of potential candidate genes to account for variant SCID in asymptomatic infants who appear healthy, high-throughput deep sequencing may be useful; this approach has led to gene identification in other primary immunodeficiencies [11–13].

Using whole exome sequencing (WES), we found two infants with variant SCID who had deleterious mutations in the Ataxia Telangiectasia Mutated (ATM) gene. Prompted by the prospective discovery of these patients’ diagnoses and a recent report of low TRECs in archived DBS from cases of ataxia telangiectasia (AT) [14], we reviewed 13 cases of AT in our clinical cohorts. By retrieving their residual DBS samples taken in the newborn nursery and measuring their TREC numbers, we showed that over half of AT patients could be identified as abnormal, making AT a secondary target of SCID screening.

Methods

Subjects

Infants V003 and V004 were identified as positive by routine California SCID screening by TREC test and confirmed to have T lymphocytopenia. Informed consent for research, including cellular immune studies and WES, was obtained for the infants and their parents under approved protocols at Children’s Hospital Los Angeles (CHLA) and the University of California San Francisco (UCSF). Additional patients from the pediatric immunology services at CHLA and UCSF were enrolled with institutional review board approval.

DNA Samples

Genomic DNA from EDTA anticoagulated whole blood was prepared using a Gentra Puregene Blood kit (Qiagen USA: Germantown, MD).

Exome Sequencing

Libraries were prepared by ligating a pair of TruSeq adaptors (Illumina: San Diego, CA) to genomic DNA sheared to a mean fragment size of 200–300 bp (S2 sonicator, Covaris: Woburn, MA). Specific sequence tags were added to different samples to differentiate each individual of origin. Libraries with these adaptors and barcode sequences were enriched with 10 cycles of PCR. For infant V003 and parents, exon capture was performed by pooling 500 ng of each of 6 libraries incubated with Illumina TruSeq version 2 biotinylated exon-encoded DNA oligonucleotides for 20 h. For V004 and parents, exon capture was performed by incubation with a Roche Nimblegen version 3 capture array. Exon-enriched DNA was captured with streptavidin-labeled magnetic beads, washed and eluted. Capture reactions were repeated to enhance specificity. After 10 cycles of DNA amplification, the exome libraries were sequenced (HiSeq 2000, Illumina). Paired-end 100 bp reads were generated (>50 M reads/subject), to yield an average of >65 reads covering the targeted regions with >90 % covered by at least 10 reads.

Whole Exome Sequence Analysis

Raw reads were aligned against reference genome hg19 using BWA (0.5.9) software [15]. The resulting files were converted to compressed binary format (BAM), sorted by coordinate, indexed, and marked for PCR duplicate reads using the Picard toolkit (http://picard.sourceforge.net). BAM files were processed to reduce artifacts and improve call accuracy using GATK software (v 1.4.15) [16, 17]. Specifically, local realignment was performed around known insertion or deletion (indel) locations, and base quality scores were re-calibrated using co-variates such as position in read and sequencing chemistry effect.

Variants were called using the GATK UnifiedGenotyper. The called single nucleotide polymorphisms (SNPs) had their scores re-calibrated by variant quality score recalibration (VQSR) using the exomes in this report plus 24 others sequenced at our site. HapMap v3.3 and the Omni chip array sets from the 1,000 Genomes Project (October, 2011 release) were training data, and HapMap 3.3 provided truth sites [18, 19]. A truth sensitivity cutoff of 99 % was used. For indel recalibration and quality selection, we used the thresholds QD <2.0, ReadPosRankSum <−20.0, and Fisher Strand >200.0. Additional annotations including region, effect, dbSNP 135 and 1,000 Genomes membership and OMIM phenotype were added using custom scripts. Filtering of exome data was performed using a combination of custom scripts and vcftools [20]. Aligned sequences were viewed using the Savant Genome Browser [21].
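
As an illustration only, the three indel thresholds can be expressed as a small filter. This is a minimal sketch, not the pipeline used in the study: it assumes that the thresholds are exclusion criteria (an indel is rejected when any one is met, as in the usual hard-filtering convention) and that a missing annotation does not trigger a filter. The function name and the plain-dictionary input are hypothetical; QD, ReadPosRankSum and FS are the annotation keys conventionally written by GATK.

```python
# Hypothetical helper illustrating the indel thresholds quoted in the text.
# Assumption: a variant is REJECTED when any threshold is met; an annotation
# that is absent (None) does not trigger its filter.

def indel_passes_hard_filter(info):
    """Return True if an indel's annotations survive all three thresholds."""
    qd = info.get("QD")                 # quality by depth
    rprs = info.get("ReadPosRankSum")   # read position rank-sum test
    fs = info.get("FS")                 # Fisher strand bias
    if qd is not None and qd < 2.0:
        return False
    if rprs is not None and rprs < -20.0:
        return False
    if fs is not None and fs > 200.0:
        return False
    return True
```

For example, an indel with QD 12.5, ReadPosRankSum −0.4 and FS 3.1 is retained under these assumptions, whereas one with QD 1.5 is rejected.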

PCR and Genomic Sequencing

ATM exons 7, 10, 39, and 46 were amplified from genomic DNA using published primers (labeled as exons 9, 12, 41, and 48 in Thorstenson et al. [22]) with an M13 extension 5′ to one primer of each pair as follows:

  • ATMe7_Forward 5′-GTA AAA CGA CGG CCA GTC AGC ATA CCA CTT CAT AAC TG

  • ATMe7_Reverse 5′-TCA TAT CCT CCT AAA GAA CAC

  • ATMe10_Forward 5′-TGT GAT GGA ATA GTT TTC AA

  • ATMe10_Reverse 5′-GTA AAA CGA CGG CCA GTT GTG ATG GAA TAG TTT TCA A

  • ATMe39_Forward 5′-TGT GGT TTT TGG GAA TTT GTA

  • ATMe39_Reverse 5′-GTA AAA CGA CGG CCA GTT GTG GTT TTT GGG AAT TTG TA

  • ATMe46_Forward 5′-GTA AAA CGA CGG CCA GTT CTT GTC ACT ACA AAA GTT CCT TT

  • ATMe46_Reverse 5′-TCT TTT TCC CTC AGG CTT TC

Sequencing was performed with the M13 forward primer, and results were compared with reference ATM NG_009830.1, using Sequencher 4.10.1 software (Gene Codes Corporation: Ann Arbor, MI).

Neonatal Dried Blood Samples

Residual DBS originally collected for routine newborn screening (NBS) and stored at −20 °C by the Genetic Disease Laboratory (GDL) of the California Department of Public Health (CDPH) were retrieved. TREC and β-actin gene copy numbers were determined by the GDL newborn screening laboratory using the protocol of Chan and Puck [5], modified for high throughput and implemented by PerkinElmer, Inc (lab within a lab at California Department of Public Health, Richmond, CA; parent company based in Hershey, PA), with cutoff values as reported [9].

Results

Infant Clinical and Immunologic Findings

Newborns in California are screened for SCID and classified as positive if TREC copy number is ≤5 with β-actin >5,000 copies, or TREC copy number is between 6 and 25 with β-actin >10,000 copies. The tests are classified as incomplete and are repeated if there are low TRECs, but also low copies of the β-actin gene segment amplified as a control. T cells are measured by flow cytometry in cases that are positive or that have two incomplete DBS samples.
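
The classification rule above can be sketched as a short function. This is an illustrative reading of the stated cutoffs rather than the screening laboratory's software: the function name and return labels are hypothetical, "low β-actin" is assumed to mean any value at or below the cutoff that applies to the TREC range in question, and samples with more than 25 TRECs are assumed to be negative regardless of β-actin.

```python
# Hypothetical sketch of the California TREC screening rule described above.
# Assumptions (not stated in the text): "low" beta-actin means at or below the
# cutoff for the relevant TREC range, and TREC > 25 is treated as negative.

def classify_trec_screen(trec, beta_actin):
    """Return 'positive', 'incomplete' or 'negative' for one DBS result."""
    if trec <= 5:
        # Very low TRECs: adequate DNA requires beta-actin > 5,000 copies.
        return "positive" if beta_actin > 5000 else "incomplete"
    if trec <= 25:
        # TRECs of 6-25: adequate DNA requires beta-actin > 10,000 copies.
        return "positive" if beta_actin > 10000 else "incomplete"
    return "negative"
```

Under this reading, a sample with 21 TRECs and 13,300 β-actin copies is classified as positive, while a sample with 4 TRECs and 3,000 β-actin copies is incomplete and would be repeated.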

Infants V003 and V004 were unrelated, healthy females born at term following normal pregnancies. Family history for both was negative for immune disease or consanguinity. Lymphocyte flow cytometry was ordered for infant V003 after two DBS yielded incomplete results, while V004 had an initial positive result with 21 TRECs and 13,300 β-actin copies (Table I).

Table I Immunologic Phenotype of Infants Identified by SCID Newborn Screening

At age 3 months Patient V003 had a normal total white blood cell count, but only 4 TRECs/μL (with normal β-actin copies) and only 1,600 lymphocytes/μL (Table I). There were 996 T and 52 B cells/μL (normal >2,000 and >300, respectively), and the number of CD45RA naïve CD4 T cells was low. NK cell number was normal. Low T and B cell numbers persisted, and low IgG levels with failure to produce antibodies after vaccination led to institution of immunoglobulin replacement and trimethoprim-sulfamethoxazole antibiotic prophylaxis. Lymphocyte proliferation to phytohemagglutinin and TCR Vβ diversity assessed by spectratyping [23] were normal, but an Epstein Barr virus transduced B cell line from the patient had only half normal phosphorylation of STAT5 in response to IL-2 [24], suggesting an intrinsic lymphocyte impairment. Later, at age 14–16 months, V003 was reported by her mother to have “unsteady” gait; physical examination first showed mild truncal ataxia at 20 months.

Infant V004 had flow cytometry at 21 days of age, showing only 1,060 T cells/μL, with low CD45RA naïve helper CD4 T cells (Table I). B and NK cell numbers and lymphocyte proliferation were normal, but Vβ spectratyping showed decreased T cell diversity (not shown). As with infant V003, in vitro phosphorylation of STAT5 after IL-2 activation was diminished, but not absent, as would be the case in SCID due to defects in the IL-2 receptor common γ chain or Janus kinase 3 [9, 24]. Physical examination of infant V004 has been normal to date, but she did not mount a robust antibody response to T-cell dependent protein-conjugated H. influenzae vaccination.

Exome Analysis and Gene Confirmation

To investigate the genetic etiology of their immunodeficiency, DNA samples from V003, V004 and their parents were subjected to WES to generate a list of single nucleotide polymorphism (SNP) and small insertion or deletion (indel) variants. These lists were filtered to retain successively fewer candidate variants as shown for SNPs and indels separately for each infant in Fig. 1b. After initial quality filtering and removal of variants common enough to be found in dbSNP 135 and the 1,000 Genomes database, further filters were applied to keep only the non-synonymous variants lying in captured exonic and splice-site regions of genes, and only those variants with a high (>30) genotype quality score (Fig. 1a, step 4).
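
The successive filtering described above can be pictured as a chain of predicates, with the number of surviving variants recorded after each step. The following is a minimal sketch, not the custom scripts used in the study; the record fields (in_dbsnp, in_1000g, effect, gq), the effect labels and the function name are hypothetical.

```python
# Hypothetical sketch of a successive variant-filtering cascade.
# Field names and effect labels are illustrative, not from the study's scripts.

def filter_cascade(variants, steps):
    """Apply named predicates in order; return survivors and per-step counts."""
    counts = [("input", len(variants))]
    kept = list(variants)
    for name, keep in steps:
        kept = [v for v in kept if keep(v)]
        counts.append((name, len(kept)))
    return kept, counts

STEPS = [
    ("absent from dbSNP 135 and 1,000 Genomes",
     lambda v: not v["in_dbsnp"] and not v["in_1000g"]),
    ("non-synonymous or splice-site",
     lambda v: v["effect"] in {"nonsynonymous", "frameshift", "splice_site"}),
    ("genotype quality > 30",
     lambda v: v["gq"] > 30),
]
```

Calling filter_cascade(variants, STEPS) returns both the retained variants and a list of (step name, count) pairs, which is the kind of tally summarized per proband in Fig. 1b.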

Fig. 1

WES variant filtering paths. a Trapezoids represent filters with resulting numbers of variants retained after each step indicated by a circled digit. Starting with initial total variant lists (1), filters were applied for quality (2) and then to keep rare alleles (3) that alter splice sites or produce non-synonymous codon changes and are absent in local exomes (4). Subsequent strategies were: focusing on variants from a list of genes associated with T cell phenotypes (yellow shading, 5); or demanding a recessive inheritance pattern (red shading, 6). b Numbers of variants retained for the exome of each proband, V003 and V004, after each filtering step in a, showing individual numbers of SNPs/indels, left, and genes harboring variants, right. For steps (7) and (8), number of genes containing candidate variants in each proband are shown. The final lists of genes at step (8) are as follows: for V003 – ATM, PCDH15, PHF2; for V004 – ATM, EYS, PCDP1, PRUNE2, SH3D21, TSHZ3, TTN. *, 2 variants, both in the ATM gene. **, genes with rare variants conforming to a recessive disease model in the family trio

We then further limited the disease gene candidates either by function of gene products (Fig. 1a, yellow) or by genetic segregation (red). For functional selection, we used a list of 49 candidate genes involved in T cell development or reported to be defective in human primary T cell deficiencies by the International Union of Immunological Societies Committee on Primary Immunodeficiency [25]. In infant V003, this filtering left three heterozygous variants, of which two were frameshift deletions within the ATM gene. Aligned sequence reads supporting one of these, variant c.1787delAA (K468fs), a two base deletion in ATM exon 10, are illustrated in Fig. 2a. Evidence for the second ATM variant of V003, c.6238delA (F1952fs), a single base deletion in exon 39, was equally robust (not shown).

Fig. 2

Sequence evidence for a heterozygous exon 10 2-bp deletion of ATM in infant V003. a Aligned paired-end reads from whole exome sequence, chr11:108121571–108121612, viewed with Savant Genome Browser [21]. Center black bars indicate deletion of c.1787delAA, K468fs, found in 27 of the 54 reads that include this sequence. Dark blue, forward reads; light blue, reverse reads. Colored rectangles, mismatch base calls compared to reference genome (judged to be artifacts because of singular occurrence). b Sanger genomic reverse sequence confirming the heterozygous deletion. c Reference and mutated amino acid codons, showing the frameshift, which led to 17 missense codons followed by a termination

In infant V004, restriction to the T cell gene list yielded only two heterozygous SNPs, both in ATM, c.1260C>T (P292L), and c.7064C>T (R2227C). These results suggested disease-causing compound heterozygosity in both V003 and V004.

By the segregation filtering method, we retained gene altering variants fitting a homozygous or compound heterozygous model of recessive inheritance - that is, those genes for which the patient inherited one rare allele from each parent - using exome data from each infant/parent trio. By this method, 5 candidate genes remained in the family of V003 and 9 in the family of V004. By filtering out variants also found in the unrelated local exomes used for VQSR, these numbers were reduced to 3 and 7 genes, respectively. Infant V003 shared the ATM mutation K468fs with her mother and F1952fs with her father, while V004 shared P292L with her mother and R2227C with her father. The other genes harboring parentally shared variants were not associated with any recognized immunologic phenotype.
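
The recessive segregation model can be sketched for a parent-child trio as follows. This is a simplified illustration, not the custom scripts used in the study: genotypes are assumed to be coded as alternate-allele counts (0, 1 or 2), the record layout and function name are hypothetical, and the parental genotypes in the usage note are assumed for the example.

```python
# Hypothetical sketch of trio-based recessive filtering.
# Assumption: genotypes are alternate-allele counts (0, 1 or 2) per individual.

def recessive_candidate_genes(variants):
    """Return genes fitting a homozygous or compound heterozygous model."""
    maternal, paternal, homozygous = set(), set(), set()
    for v in variants:
        child, mother, father = v["child"], v["mother"], v["father"]
        if child == 2 and mother >= 1 and father >= 1:
            # Homozygous in the child, one allele from each carrier parent.
            homozygous.add(v["gene"])
        elif child == 1 and mother >= 1 and father == 0:
            maternal.add(v["gene"])   # heterozygous, maternally inherited
        elif child == 1 and father >= 1 and mother == 0:
            paternal.add(v["gene"])   # heterozygous, paternally inherited
    # Compound heterozygous: at least one rare allele from each parent.
    return sorted(homozygous | (maternal & paternal))
```

For instance, given two heterozygous ATM variants in the child, one carried only by the mother (such as K468fs in the family of V003) and one carried only by the father (such as F1952fs), ATM is returned; a gene with a single maternally inherited variant is not.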

Sanger sequencing confirmed the ATM mutations seen by exome sequencing for both V003 (Fig. 2b) and V004, as well as in both sets of parents (results not shown).

Subsequent to the sequence findings, both infants underwent measurement of serum alpha fetoprotein (AFP) levels; elevated AFP compared to age-adjusted normal ranges is a reliable marker for AT in children. V003 at age 16 months had AFP 307 μg/L, while V004 at 7 months had 112 μg/L (normal range for these ages, 8–80 μg/L [26]). Western blot showed absent ATM protein in both patients (data not shown).

TREC Analysis in Additional AT Cases

To test whether T lymphocytopenia detected by low TRECs is common in infants with ATM mutations, medical records of California-born patients with AT followed at CHLA and UCSF over the past 25 years were reviewed, and their residual neonatal DBS were retrieved by the CDPH for TREC testing. As summarized in Table II, 13 patients with AT, 9 females and 4 males, none of whom had been suspected to have AT at birth, had their newborn DBS samples retrieved. Upon testing, 7 samples had TRECs ≤25; thus newborn screening would have flagged these AT patients in infancy to receive follow-up lymphocyte immunophenotyping. The AT patients’ ethnic distribution was not different from the overall distribution of California births.

Table II Characteristics of AT patients whose newborn dried blood spots were tested for TRECs

All 13 AT patients initially presented with symptoms of ataxia and abnormal gait between ages 12 months and 8 years (median 17 months). Patients experienced a delay from 2 months to 10 years between onset of symptoms and AT diagnosis, which occurred between ages 1.5 and 12 years (median age 3 years 5 months). At diagnosis all 13 AT patients had high serum AFP concentrations, from 16.8 μg/L at 1 year 7 months (Patient 6) to 310 μg/L at 12 years 3 months (Patient 3). AFP levels in utero and at birth are high, but fall to <8 μg/L for children over age 2 [26]. We confirmed the correlation between AFP and age (R = 0.63), as reported [26, 27]. Furthermore, all 13 patients had T lymphocytopenia with <1,500 T cells/μL at the time of their diagnosis of AT; subsequently, 8 had recurrent immunological manifestations including soft tissue infections (1 patient), cutaneous and visceral granulomas (3 patients), and chronic respiratory infections (4 patients). Out of 13 patients, 4 also had hematological malignancies, leading to fatality in Patients 1 and 10.

Comparing the 7 patients whose SCID newborn screens were positive (≤25 TRECs with normal β-actin) vs. the 6 whose screens were negative (TREC >25 copies), there were no significant differences in age at presentation with neurological symptoms, AFP levels, total CD3 T cell counts, or time between symptom onset and diagnosis, though our T cell information from retrospective chart review did not provide lymphocyte subset data in infancy; the median age for the first recorded immune panel was 4 years. Interestingly, there was a correlation between TREC copy numbers in the patients’ archived newborn DBS and their subsequently measured CD4 T cell counts (R = 0.64). Further analysis with a larger sample of AT patients might reveal additional relationships between newborn TREC numbers and phenotypic clinical and laboratory features of AT.

Discussion

Newborn screening by TRECs was developed with SCID as its primary target, but a spectrum of other conditions featuring clinically significant T lymphocytopenia, defined by the California newborn screening program as <1,500 T cells/μL or a lack of CD45RA naïve T cells, is also identified. Through screening with the TREC assay, two apparently healthy California newborns with unexplained lymphocytopenia were identified to have AT, with deleterious mutations in the ATM gene detected by WES and confirmed by Sanger sequencing, AFP elevation and undetectable ATM protein expression. The abnormal TREC screening results in infants V003 and V004 allowed physicians to avoid exposure to live attenuated rotavirus vaccine, contraindicated in infants with T cell immunodeficiency. Although T and B cell immunodeficiency is a well-recognized but variable feature of older AT patients, the degree of immune compromise in early infancy has not been documented. Our early diagnosis of AT has provided an opportunity to observe prospectively the evolution of immunological, neurological and malignant features of AT.

Clinical features of AT include the loss of motor milestones between 1 and 2 years of age, with falls, slurred speech, oculomotor apraxia, and truncal instability. Ocular telangiectasias appear between 2 and 8 years. Family history, if positive, may assist diagnosis. Laboratory features often include absent IgA and always include elevated serum AFP, though the interpretation of AFP requires age-adjustment because this fetal serum protein remains abundant in infancy [26, 27]. Magnetic resonance imaging of the brain is unhelpful in early disease, as cerebellar abnormalities become apparent only after 2 years [28]. Increased cellular radiosensitivity is found in AT, though assays may not differentiate AT from other disorders of DNA repair [29]. Thus, low TRECs and T lymphocytopenia detected by newborn screening can be the earliest and simplest warning that AT may be present. This finding validates the hypothesis of Borte et al., who suggested that AT might be detectable by prospective TREC screening, based on their survey measuring TREC and kappa chain B cell excision circles in archival DBS from patients with known primary immunodeficiencies [14]. The large size of the ATM gene, with 63 exons encoding an mRNA of 13,147 nucleotides, makes genomic sequencing costly and laborious; thus, WES may become a cost-effective first line approach to examine this gene.

Deep Sequencing in the Context of Newborn Screening with TRECs

Our study demonstrates the value of a “short-list” of gene candidates of interest when searching deep sequencing data for rare, disease-causing variants. By focusing on a list of genes known to be associated with T-cell deficiencies, we were able to single out clinically relevant mutations with a minimum of other filters. The variants in our families occurred in regions with excellent coverage, as illustrated by the abundant bidirectional reads in Fig. 2a. However, high-quality coverage of all ATM exons by WES or other deep sequencing is not guaranteed. In the exome datasets of our infants and their parents, between one and four ATM exons had <15x coverage, arguably inadequate to make accurate variant calls. As quality, coverage and affordability of deep sequencing improve, the chance of missing a variant will decline, but it should still be considered when using WES clinically.
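
A coverage check of this kind can be sketched in a few lines. The example below is illustrative only: it assumes that per-base read depths are already available for each exon and that coverage is summarized as the mean depth across the exon, which the text does not specify; the function name and input layout are hypothetical.

```python
# Hypothetical sketch: flag exons whose mean read depth falls below 15x.
# Assumption: coverage per exon is summarized as the mean of per-base depths.

def low_coverage_exons(depth_by_exon, min_mean_depth=15.0):
    """Return exon labels whose mean depth is below min_mean_depth."""
    flagged = []
    for exon, depths in depth_by_exon.items():
        # An exon with no depth data is treated as uncovered.
        mean_depth = sum(depths) / len(depths) if depths else 0.0
        if mean_depth < min_mean_depth:
            flagged.append(exon)
    return flagged
```

Exons flagged in this way would be candidates for confirmatory Sanger sequencing before a gene is considered free of deleterious variants.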

Another consideration regarding deep sequencing analysis is the utility of filtering against local exome data in addition to dbSNP and 1,000 Genomes. Many indels are not annotated in existing databases despite being relatively common. Thus we observed large numbers of indels compared to SNPs persisting through the first 3 steps of our analysis (Fig. 1b). Most indels were removed by genotype quality filtering and comparison against local exomes. Also, indel calling is less reliable than SNP calling; filtering indel variants against unrelated local exomes reduces the rate of local errors and artifacts.

Infant V003’s gene mutations have not been previously reported, but, like most ATM mutations, produce early stop codons, predicted to lead to nonsense-mediated decay of mRNA [30]. One of infant V004’s mutations has been reported [31]; the other is predicted to be damaging by SNAP and PolyPhen-2 algorithms [32, 33].

Epidemiology and Public Health Considerations

We have demonstrated the ability to diagnose AT within the context of a newborn screening program and have documented for the first time that T cell counts in two infants with AT in the first months of life were low enough that live vaccines should be avoided. The incidence of AT has been reported to be between 1:40,000 and 1:100,000 births [34] with high rates in some populations due to founder mutations [35]. A more recent estimate was lower, 1:300,000 births [36]. The two infants reported here are the only newborns found to have AT in the first 18 months of California’s SCID newborn screening program, in which over 740,000 infants were screened. With 7 of 13 AT cases (54 %) from archived samples having abnormal TREC screening, our data in the highly diverse California population suggest an incidence figure around 1:200,000. California has the largest number of annual births of any state in the U.S., and population-based newborn screening with TRECs will provide a prospective, unbiased method to establish the incidence of AT.
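
The incidence figure can be reproduced with simple arithmetic. The calculation below is one plausible reading of how the quoted numbers combine, not a method stated in the text: it assumes that the 2 detected infants represent 7/13 of all AT cases among the screened births, and it uses 740,000 as the number screened although the text gives this as a lower bound. The function name is hypothetical.

```python
# Hypothetical sketch: adjust the detected case count for screening sensitivity.
# Assumption: detected cases = sensitivity x true cases among screened infants.

def births_per_case(cases_detected, infants_screened, sensitivity):
    """Return the estimated number of births per affected infant."""
    expected_total_cases = cases_detected / sensitivity
    return infants_screened / expected_total_cases

# 2 infants detected among ~740,000 screened; 7 of 13 archived cases flagged.
estimate = births_per_case(2, 740_000, 7 / 13)
# estimate is roughly 199,000 births per case, i.e. close to 1:200,000
```

Without the sensitivity adjustment, the same figures give 1 detected case per 370,000 births, which illustrates how strongly the estimate depends on the fraction of AT infants assumed to screen positive.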

The detection of AT by newborn screening illuminates challenges surrounding the inclusion of new tests into a public health screening program. Currently there is no cure for AT; the neurological deterioration is progressive and irreversible such that patients usually become wheelchair dependent by their teenage years, and lifespan is decreased. In addition, patients with AT have increased risk for malignancies, attributed to compromise of DNA repair mechanisms. Consistent with the reported 30–40 % lifetime risk and 10–15 % childhood or early adulthood risk of lymphoid malignancy, 3 of our 13 retrospective patients had lymphomas and one had leukemia [37, 38]. ATM heterozygous mutation carriers also have an increased risk of breast cancer as well as other epithelial malignancies [39, 40]. Thus, although screening programs are designed to identify newborns with diseases for which there are treatments, the early diagnosis of AT as a secondary target of TREC screening for SCID can provide important information for family and genetic counseling. Patients should avoid undue irradiation and should be monitored for malignancy as well as protected from infection, while carriers of ATM mutations should be made aware of their own increased risk of cancer, and any anti-cancer therapy they require should be tailored in light of their increased sensitivity to radiation [41]. A diagnosis of AT by newborn screening can also help parents plan for the care of the affected child and obtain genetic counseling when considering future pregnancies.

Furthermore, while there is currently no effective treatment for AT, multiple lines of research are aiming towards that goal. Antioxidant therapies [42–44], and more recently HDAC4 inhibition [45], have shown promise in mouse models; and in vitro experiments with human cells suggest that for some patients ATM function could be restored using mutation-targeted therapy [46]. For any of the potential therapies under development, early intervention made possible by identification through newborn screening would be most likely to show benefit by allowing therapy to take effect before extensive degeneration of relevant tissues has occurred.

Conclusion

This study demonstrates that T lymphocytopenia revealed by newborn TREC screening can be an identifying feature of AT. We also show the utility of exome sequencing to arrive at a gene diagnosis for infants with variant SCID or CID. Although there is no current cure for the progressive neurological impairment of AT, early detection provides information to improve patient management and offer family genetic counseling. Whole exome sequencing coupled with newborn screening in an ethnically diverse, large population will reveal unbiased data about rare diseases associated with T lymphocytopenia.