Background

Chronic lymphocytic leukemia (CLL) displays a heterogeneous clinical course [1–3], with some patients living for years with asymptomatic disease and others experiencing early progression requiring therapeutic intervention. Modern treatment algorithms must take into account age, comorbidities, and prognostic/predictive factors, including genetic lesions [4]. Adverse prognostic factors include stage [5], positivity for CD38, ZAP70, and CD49d [6–8], and, among genetic features, the unmutated configuration of the variable region of the immunoglobulin heavy chain gene (IGHV) [6] and specific molecular cytogenetic lesions revealed by fluorescence in situ hybridization (FISH). More recently, karyotype aberrations were shown to represent strong prognostic factors [9–14], and large retrospective studies demonstrated that TP53, NOTCH1, and SF3B1 gene mutations have a negative impact on the time to first treatment (TTFT) and overall survival (OS) [15–17]. These data were in part confirmed by prospective clinical trials using homogeneous treatment protocols [18, 19], and recurrent genomic lesions were included within comprehensive prognostic indexes [20, 21], helping clinicians to counsel patients more appropriately, to define the follow-up interval, and, potentially, to provide a rational basis to design early intervention protocols for high-risk patients [22].

Next-generation sequencing (NGS) techniques documented that, besides the aforementioned genes, a number of previously unidentified genes may be mutated in CLL and that the disruption of putative core cellular pathways represents an important mechanism promoting disease progression and drug resistance [23–26]. NGS may detect minor cell populations (subclones) harboring a variety of gene mutations, including NOTCH1, SF3B1, BIRC3, and TP53 mutations, the latter having a negative prognostic impact similar to that of clonal TP53 mutations [27–29] detected by conventional sequencing techniques (i.e., Sanger sequencing).

Thus, NGS is coming of age for use in clinical practice; indeed, over 50 % of CLL patients were shown to carry mutations in one or more genes [30, 31], potentially making NGS a sensitive tool for the detection of mutations, including subclonal mutations.

To assess whether an extended mutational screening by NGS at diagnosis could refine our ability to predict TTFT and OS, we designed a CLL-specific gene panel covering hotspots or complete coding regions of 20 genes frequently mutated in CLL. We performed NGS of these 20 genes using a resource-efficient platform in 200 consecutive newly diagnosed patients representing over 90 % of CLL incident cases in our region. By correlating mutational data obtained by an extensive genetic/cytogenetic characterization with clinico-biological parameters and outcome, we were able to show that NGS screening was an independent prognostic factor for TTFT and that complex karyotype was a strong predictor of an inferior survival in this patient population.

Methods

Patients

The study cohort consisted of 200 consecutive untreated CLL patients diagnosed and followed between 2007 and 2014. All patients were diagnosed according to NCI criteria [32]. Only patients with a Matutes immunophenotypic score [33] ≥3 (i.e., typical CLL) were included. CD38 and ZAP-70 were tested on peripheral blood (PB) cells, as described [34]. When needed, mantle cell lymphoma was excluded by the evaluation of cyclin D1. The study was approved by the local ethics committee. Indications for treatment included an increased white blood cell count with a lymphocyte doubling time of <6 months, anemia or thrombocytopenia due to bone marrow infiltration or to autoimmune phenomena not responding to steroids, and disease progression in the Binet staging system. Fludarabine-containing and, since 2010, bendamustine-containing regimens, with or without rituximab, were used as first-line treatment; chlorambucil was used in elderly and unfit patients according to the shared treatment policy adopted at our center.

Cytogenetic and FISH analyses

Interphase FISH was performed on PB samples obtained at diagnosis using probes for the following regions: 13q14, 12q13, 11q22/ATM, and 17p13/TP53 (Vysis/Abbott Co, Downers Grove, IL) as described [35]. Each patient was categorized into a FISH risk group according to the following classification: favorable group (isolated 13q14 deletion or absence of FISH aberrations), unfavorable group (deletions of 11q22 or of 17p13), and intermediate group (trisomy 12).
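The three-tier FISH classification above can be illustrated with a short Python sketch. The function name and boolean inputs are hypothetical, and the precedence applied when lesions co-occur (unfavorable over intermediate over favorable) is an assumption, since the text does not state how concurrent aberrations were ranked.

```python
def fish_risk_group(del13q14: bool, tri12: bool,
                    del11q22: bool, del17p13: bool) -> str:
    """Assign a FISH risk group from the four probe results.

    Assumption: when several lesions co-occur, the highest-risk one
    determines the group.
    """
    if del11q22 or del17p13:
        return "unfavorable"
    if tri12:
        return "intermediate"
    # Isolated 13q14 deletion, or no aberration detected by the panel.
    return "favorable"
```

For example, a patient with an isolated 13q14 deletion would be classified as favorable, while a patient with trisomy 12 plus a 17p13 deletion would, under the stated assumption, be classified as unfavorable.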

Cytogenetic analysis was performed on the same samples used for FISH analysis using CpG-oligonucleotide DSP30 (2 μmol/l; TibMolBiol, Berlin, Germany) plus IL2 (100 U/ml; Stem Cell Technologies Inc., Milan, Italy) as described [36]. A complex karyotype was defined by the presence of at least 3 chromosome aberrations.

IGHV analysis

IGHV genes were amplified from genomic DNA and sequenced according to standard methods with the cutoff of 98 % homology to the germline sequence to discriminate between mutated (<98 %) and unmutated (≥98 %) cases, as reported [35].
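As an illustration of the 98 % cutoff, the following Python sketch classifies a case from its percent identity to the germline sequence. The function and parameter names are hypothetical; only the threshold and the direction of the comparison come from the text.

```python
def ighv_status(percent_identity: float, cutoff: float = 98.0) -> str:
    """Classify IGHV status from percent identity to germline.

    Identity at or above the cutoff is called unmutated; identity
    below the cutoff is called mutated.
    """
    if not 0.0 <= percent_identity <= 100.0:
        raise ValueError("percent_identity must be between 0 and 100")
    return "unmutated" if percent_identity >= cutoff else "mutated"
```

A case with exactly 98.0 % identity falls in the unmutated group under this rule.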

Ion Torrent Personal Genome Machine (PGM) analysis

NGS analysis was performed on the same samples used for FISH and cytogenetic analyses. In all samples, the percentage of CLL cells was over 90 % as assessed by flow cytometry analysis. The Agilent HaloPlex Target Enrichment kit (Agilent Technologies, Santa Clara, CA, USA) was used to produce libraries of exonic regions from 20 genes (ATM, BIRC3, BRAF, CDKN2A, PTEN, CDH2, DDX3X, FBXW7, KIT, KLHL6, KRAS, MYD88, NOTCH1, NRAS, PIK3CA, POT1, SF3B1, TP53, XPO1, ZMYM3) starting from genomic DNA from PB samples, according to the HaloPlex Target Enrichment System protocol (Agilent Technologies, Santa Clara, CA, USA). Diluted libraries were linked to Ion Sphere Particles, clonally amplified in an emulsion PCR, and enriched using the Ion OneTouch emulsion PCR System (Life Technologies, Foster City, CA, USA). Exon-enriched DNA was precipitated with magnetic beads coated with streptavidin. Enriched, template-positive Ion Sphere Particles were loaded in one ion chip and sequenced using the Ion Torrent PGM (Life Technologies, Foster City, CA, USA). Sequencing data were aligned to the human reference genome (GRCh37). Data analysis and variant identification were performed using Torrent Suite 3.4 and Variant Caller plugin 3.4.4 (Life Technologies, Foster City, CA, USA) [37].

Statistical analysis

The Mann-Whitney and the Pearson’s chi-squared tests were applied for quantitative and categorical variables, respectively. TTFT was calculated as the interval between diagnosis and the start of first-line treatment. OS was calculated from the date of diagnosis until death due to any cause or until the last patient follow-up. Survival curves were compared by the log-rank test. Proportional hazards regression analysis was used to identify the significant independent prognostic variables on TTFT. The stability of the Cox model was internally validated using bootstrapping procedures [15]. Statistical analysis was performed using Stata 14.0 (Stata Corp, College Station, TX).
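The time-to-event definitions above can be illustrated with a minimal pure-Python sketch. It is not the Stata code used in the study. The function names, the conversion of days to months (30.4375 days per month), and the censoring of untreated patients at last follow-up are assumptions made for illustration; the Kaplan-Meier product-limit estimator is shown as the usual basis for the survival curves compared by the log-rank test.

```python
from datetime import date
from typing import List, Optional, Sequence, Tuple

DAYS_PER_MONTH = 30.4375  # assumption: average month length


def months_between(start: date, end: date) -> float:
    """Interval between two dates expressed in months."""
    return (end - start).days / DAYS_PER_MONTH


def ttft_observation(diagnosis: date,
                     treatment_start: Optional[date],
                     last_follow_up: date) -> Tuple[float, int]:
    """Return (time in months, event flag) for TTFT.

    Assumption: patients never treated are censored (event = 0)
    at their last follow-up.
    """
    if treatment_start is not None:
        return months_between(diagnosis, treatment_start), 1
    return months_between(diagnosis, last_follow_up), 0


def kaplan_meier(times: Sequence[float],
                 events: Sequence[int]) -> List[Tuple[float, float]]:
    """Product-limit estimate: (time, survival) at each event time."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0   # events observed at time t
        n_leaving = 0  # subjects leaving the risk set at time t
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve
```

The same `kaplan_meier` function applies to OS if the event flag marks death from any cause and surviving patients are censored at last follow-up.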

Results

Patients and mutation analyses of the 20 genes by NGS

The clinical and biologic characteristics of the 200 CLL patients are presented in Table 1.

Table 1 Clinical and biological characteristics of the 200 CLL patients

Parallel sequencing of exonic regions from the 20 genes showed somatic mutations in 84/200 (42.0 %) cases. One hundred thirty-six mutations were found in these 84 patients: 114 missense mutations, 7 nonsense mutations, 14 frameshift deletions, and 1 frameshift insertion. Mutations were detected with a frequency ranging from 5.0 to 96.7 % of the reads. Sixteen cases (8.0 %) showed mutations in the TP53 gene, 16 (8.0 %) in the NOTCH1 gene, 15 (7.5 %) in the SF3B1 gene, 10 (5.0 %) in the ATM gene, 8 (4.0 %) in the BIRC3 gene, 7 (3.5 %) in the MYD88 gene, 7 (3.5 %) in the PTEN gene, 6 (3.0 %) in the FBXW7 gene, 5 (2.5 %) in the POT1 gene, 5 (2.5 %) in the BRAF gene, 5 (2.5 %) in the ZMYM3 gene, and 19 (9.5 %) cases in the remaining 9 genes (Additional file 1: Table S1). Of the 84 mutated patients, 36 (42.8 %) presented 2 or more mutations (Additional file 2: Table S2). TP53 mutations (p = 0.027) were significantly more frequent among patients with 2 or more mutations, while a trend was observed for BIRC3 mutations (p = 0.059) and for mutations of genes less frequently mutated in CLL (p = 0.057) (Additional file 3: Table S3).
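The per-patient counting behind these figures can be sketched in Python as follows. The tuple layout, function names, and example variants are hypothetical and are not taken from the study data; the 5 % read-frequency threshold matches the lowest frequency reported above, and counting individual mutations (rather than mutated genes) toward the "2 or more mutations" group is an assumption.

```python
from collections import Counter
from typing import Iterable, Set, Tuple

# Hypothetical variant record: (patient_id, gene, variant allele frequency in %)
Variant = Tuple[str, str, float]


def mutations_per_patient(variants: Iterable[Variant],
                          vaf_cutoff: float = 5.0) -> Counter:
    """Count mutations per patient, keeping only calls at or above the cutoff."""
    counts: Counter = Counter()
    for patient_id, _gene, vaf in variants:
        if vaf >= vaf_cutoff:
            counts[patient_id] += 1
    return counts


def multi_hit_patients(counts: Counter, min_mutations: int = 2) -> Set[str]:
    """Patients carrying at least `min_mutations` retained mutations."""
    return {pid for pid, n in counts.items() if n >= min_mutations}


# Illustrative input only (invented values).
example = [
    ("P1", "TP53", 45.0),
    ("P1", "ATM", 6.2),
    ("P2", "NOTCH1", 3.1),   # below the 5 % cutoff, discarded
    ("P3", "SF3B1", 12.0),
]
counts = mutations_per_patient(example)
multi_hit = multi_hit_patients(counts)
```

In this invented example, P1 is the only patient in the multi-hit group, P3 carries a single mutation, and P2 is treated as unmutated because its only call falls below the cutoff.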

Correlations between mutational status by NGS, molecular cytogenetic findings, and clinico-biological parameters

The presence of somatic mutations did not correlate with sex, age, or Binet stage, while the occurrence of mutations by NGS analysis was significantly associated with CD38 positivity (p = 0.010), IGHV unmutated status (p = 0.009), intermediate/high-risk cytogenetics by FISH analysis (p < 0.001), and the complex karyotype (p = 0.003; Table 2).

Table 2 Correlations between mutational status by NGS analysis and clinico-biological parameters

A higher risk as assessed by FISH analysis was associated with the presence of mutations affecting TP53 (p = 0.012), BIRC3 (p = 0.003), and FBXW7 (p = 0.003), while the complex karyotype was significantly associated with TP53, ATM, and MYD88 mutations (p = 0.003, 0.018, and 0.001, respectively; Table 3; Fig. 1).

Table 3 Correlations between mutations by NGS analysis, FISH results, and karyotype complexity

Fig. 1 Gene mutations and correlation with genomic features: circos diagrams illustrating pairwise co-occurrence of gene mutations with IGHV status, FISH results, and complex karyotype

The median follow-up for the 200 CLL patients was 52.3 months. In univariate analysis (Table 4), the occurrence of mutations and the presence of 2 or more mutations were significantly associated with a worse TTFT (Fig. 2), along with advanced Binet stage; CD38 positivity; IGHV unmutated status; intermediate/unfavorable FISH results; 11q22 deletion; 17p13 deletion and/or TP53 mutations (here referred to as TP53 disruption); and complex karyotype. A shorter TTFT was also observed for TP53-, NOTCH1-, ATM-, and BRAF-mutated patients. By multivariate analysis (Table 5), we found that the multi-hit profile (≥2 mutations by NGS) predicted a shorter TTFT (p = 0.004), along with TP53 disruption (p = 0.040), IGHV unmutated status (p < 0.001), and advanced stage (p < 0.001).

Table 4 Univariate analysis for TTFT and OS

Fig. 2 TTFT according to number of mutations by NGS analysis (p < 0.001)

Table 5 Multivariate analysis for TTFT and OS

When considering OS (Table 4), a poorer prognosis was associated with the occurrence of mutations by NGS analysis, the presence of 2 or more mutations, with TP53 mutations, and with advanced stage, CD38 positivity, IGHV unmutated status, TP53 disruption, and complex karyotype. In multivariate analysis, advanced stage (p = 0.010), IGHV unmutated status (p = 0.020), TP53 disruption (p < 0.001), and the complex karyotype (p = 0.007) independently predicted a worse outcome (Table 5).

Discussion

CLL is the most frequent leukemia in western countries and has a significant socioeconomic impact. It is therefore important to define which patients are at higher risk of progression and therefore require stricter follow-up and which genetic lesions are associated with risk of relapse and/or chemorefractoriness, ultimately determining a shorter survival [22]. Unlike previous reports analyzing prognostic/predictive factors in CLL requiring treatment at the time of progression, we were able to perform an extensive biologic characterization in an unselected prospective series of 200 patients diagnosed over an 8-year span and followed for a median of 52.3 months over the last 10 years. Our center has a >90 % capture of incident cases of CLL in our region of approximately 400,000 inhabitants because the diagnosis of CLL in our province has been centralized since 2006. With the exception of frail patients with a significant number of comorbidities precluding any form of specific treatment, who were not submitted to extensive molecular cytogenetic characterization, the patient population included in this analysis is highly representative of the true nature of CLL and allows meaningful analyses of TTFT and OS in a real-world scenario.

The Ion Torrent PGM is an NGS platform that uses semiconductor sequencing technology. In clinical practice, PGM may represent a very sensitive tool for mutational screening of patients with CLL, allowing multiplexing of samples and gene targets in one experimental setup [30] and resulting in higher speed of analysis and lower costs [38]. Parallel sequencing of exonic regions in these 20 CLL-related genes showed somatic mutations in 84/200 (42.0 %) cases by using a 5 % cutoff. Mutations were detected with a frequency ranging from 5.0 to 96.7 % of the reads, clearly showing that both major and minor clonal mutations were present, the former representing early leukemogenetic events and the latter representing late-appearing aberrations possibly associated with disease progression or chemorefractoriness [39, 40].

In this series, the frequency of mutations involving TP53, NOTCH1, SF3B1, ATM, and BIRC3 genes clearly reflects the nature of our patient cohort, which included untreated CLL analyzed early during the natural history of the disease and comprised 80.5 % of Binet stage A cases. Approximately the same incidence for these mutations was reported in a series of CLL patients observed in general practice and not enrolled in clinical trials [17]. The frequency of mutations involving the other investigated genes was in line with data published in the literature using whole exome sequencing [41–44].

Interestingly, we observed that 18.0 % of the cases presented more than one mutation. In the CLL11 trial, 161 patients were evaluated at the time of treatment requirement and NGS analysis revealed mutations in 42 out of 85 analyzed genes, with 76.4 and 42.2 % of the patients presenting at least one or ≥2 genes affected by mutations, respectively [14].

In our series of patients, the occurrence of mutations was associated with adverse molecular and genetic findings including IGHV unmutated status, intermediate/high-risk FISH results, and the presence of a complex karyotype. Notably, a higher incidence of concurrent mutations was observed in TP53-mutated patients, while the presence of a complex karyotype was associated with TP53-, ATM-, and MYD88-mutated cases. These results suggest that concurrent mutations, as well as complex karyotype, might represent an aspect of genetic instability correlated to a defective DNA damage response [45].

We then analyzed the correlation between the mutational status and outcome. A shorter TTFT was observed in those patients with mutations by NGS and with mutations involving TP53, NOTCH1, ATM, and BRAF. The prognostic significance of BRAF mutations needs to be confirmed on larger series because it was derived from a limited number of patients, most of whom had concurrent mutations of other genes. By multivariate analysis, we found that the multi-hit profile (≥2 mutations by NGS) was independently associated with a shorter TTFT along with TP53 disruption, IGHV unmutated status, and advanced stage.

Given the complexity of the CLL genetic landscape, we suggest that not only the presence of clones or subclones [46] but also the concurrent presence of mutations may play a significant role in prognostication. This study, to our knowledge, provides the first demonstration that at diagnosis, in an unselected CLL patient population followed up at one center having a >90 % capture of incident cases, a multi-hit profile derived from an extensive NGS analysis is independently associated with a shorter TTFT. Notably, concurrent gene mutations are also frequent in patients with relapsed/refractory CLL and are associated with a worse outcome [47].

When considering OS, a poorer outcome was associated with the presence of mutations by NGS, with mutations in TP53 and NOTCH1 genes, with the multi-hit profile, with IGHV unmutated status, with TP53 disruption, and with the complex karyotype. However, by multivariate analysis, only TP53 disruption, advanced stage, IGHV unmutated status, and the complex karyotype were independently associated with a worse outcome.

Whereas the strong independent impact on TTFT and OS of IGHV mutational status and TP53 disruption was previously demonstrated [12, 14, 15, 45, 46], the finding of an independent impact on OS of the complex karyotype is noteworthy, especially when considering that an extensive clinico-biologic characterization was performed in this patient cohort. Recently, an independent prognostic relevance on OS of the complex karyotype has emerged in CLL patients investigated at different phases of the disease: at diagnosis [13, 34], before first-line treatment [14], and in relapsed/refractory patients treated with ibrutinib [48]. We may assume that the complex karyotype probably reflects a high level of genomic instability that appears to be a better predictor of worse OS in comparison to single and multiple concurrent mutations, with the only exception of TP53 mutations. Thus, karyotyping seems to contribute substantially to the identification of CLL patients with the most adverse prognosis and should be considered in an extensive diagnostic work-up in future CLL trials [49, 50].

Conclusions

Altogether, our data suggest that NGS may play an important role in the definition of the risk of disease progression and therefore could be useful in the diagnostic work-up of CLL patients as an efficient, sensitive, and affordable technique for routine screening of mutations. Indeed, NGS analysis, in combination with clinical stage, TP53 disruption, and IGHV assessment, may identify those patients who are at higher risk of progression and therefore need a stricter follow-up, whereas karyotyping could represent, along with TP53 disruption, the best genetic predictor of OS. However, some issues need to be better defined before the introduction of the extensive NGS approach into routine clinical practice: (i) which genes and how many genes should be included in the work-up panel for an efficient and affordable routine applicability, (ii) what cutoff for mutational analysis should be considered clinically relevant, and (iii) how to develop a standardized methodology ensuring reproducibility of the results [51].