Introduction

Minimal or measurable residual disease (MRD) in acute leukemia is defined as the presence of leukemic blasts from the limit of detection (usually 0.001–0.01%) to 5% [1]. Precursor B-lymphoblastic leukemia (B-ALL) MRD measurements are now the standard of care for managing B-ALL patients. The choice of an optimal method for MRD measurement depends on the test characteristics and clinical scenarios [2]. MRD values are reported to have general prognostic and therapeutic implications for B-ALL [3, 4]. Blinatumomab, a bi-specific T-cell engager (BiTE), is approved for use in B-ALL patients in first or second remission with MRD ≥ 0.1% (10⁻³). It has shown a high response rate and prolonged leukemia-free survival [5]. MRD evaluation also serves as a prognostic indicator and therapy modification variable in stem cell transplantation, where it has contributed to further improvement in childhood and adult ALL cure rates [6].

Different methods for MRD detection are available, in which cells are identified either through differential patterns of marker expression by multiparametric flow cytometry (MFC) or through differential gene expression analyzed by PCR or next-generation sequencing (NGS). Compared with NGS, MFC has a much shorter turnaround time (TAT) and a much lower cost. In addition, MFC has been widely implemented in many flow cytometry laboratories for MRD testing. One advantage of NGS, though, is that the sample can be frozen and stored after DNA extraction for later analysis, unlike MFC, which requires a fresh sample. It should also be noted that, to date, more work has been done in the MFC field, with standardized protocols set by different consortia, whereas NGS still has limited standardization. One of the main limitations of PCR is the need for patient-specific reagents. To overcome this, MRD assessments combining PCR and NGS approaches have been developed, in which PCR “consensus primers” are used to amplify the complete set of immunoglobulin (Ig) gene sequences in a patient sample instead of relying on unique patient-specific PCR primers and probes [7]. To date, MFC and real-time quantitative PCR are considered the gold standard methods for MRD detection. NGS was shown to have a high concordance with these techniques in addition to being highly sensitive and versatile [8]. ClonoSEQ is a platform that received the first approval of an NGS-based assay by the US Food and Drug Administration (FDA) for MRD measurement in B-ALL and multiple myeloma [9]. It uses both multiplex PCR and NGS to identify and track unique disease-associated Ig heavy chain (IgH), kappa (IgK), and lambda light chain (IgL) gene rearrangements as well as IgH-BCL1/2 translocations in malignant B cells [10].

So far, there is no national or international consensus as to when, where, and by which method(s) the MRD studies should be performed, although regional recommendations do exist [11]. In this retrospective study, we compared MRD evaluations for B-ALL by two different techniques: MFC using the standardized COG panel and the NGS-based ClonoSEQ. We aimed to compare the test characteristics, study abnormal immunophenotyping for B-ALL MRD, and observe B-ALL clonal evolution and the impact of Blinatumomab therapy on MFC testing. We hope the data presented in this study will help in future consensus guidelines development.

Material and methods

Sample collection for MRD testing

MFC cases were identified by searching our Laboratory Information System database for COG B-ALL MRD panels. Once MFC cases were identified, patients’ records, including molecular reports, were retrieved from the electronic medical record. This retrospective study included 74 bone marrow samples from 31 B-ALL patients (17 males and 14 females) collected at the Department of Hematopathology at our institution from October 2020 to April 2022. Samples were obtained at clinical remission and at approximately 1–6-month intervals thereafter. In all patients, the diagnosis of B-ALL was initially established by histopathologic morphologic evaluation, phenotyping by flow cytometry and immunohistochemistry, and genetic studies. All samples were subjected to MRD evaluation using a COG MFC panel and NGS performed in parallel using ClonoSEQ (Adaptive Biotechnologies Corporation, Seattle, WA, USA). DNA was extracted from the original diagnostic sample as a baseline. The baseline DNA sample was sent along with subsequent bone marrow samples for ClonoSEQ MRD analysis.

Bone marrow sample examination

All bone marrow samples, including bone marrow aspirate, touch imprints, and biopsy, were examined microscopically, including a 500-cell manual differential. Routine hematoxylin and eosin (H&E) staining and immunohistochemistry studies were also performed on biopsy cores or clot sections.

MFC for B-ALL MRD testing

MFC for B-ALL MRD was performed using a 5-tube 8-color panel on a FACS Canto X flow cytometer (BD Biosciences), where the first 3 tubes represent the COG protocol [4]. The third tube contains SYTO-16, used for the quantification of nucleated cells. The fourth tube is customized for our institution: CD24 is used to analyze B cells, and CD66b is added to exclude neutrophils, since CD24 is also expressed on neutrophils. A target of 1,000,000 events was set, resulting in an analytic sensitivity of 0.01%. However, the actual number of collected events ranged from 400,000 to 1 million due to suboptimal sample quality in some cases. The MRD value was calculated as the percentage of leukemic cells among total nucleated cells. A cluster of more than 50 cells with an abnormal MRD immunophenotype is classified as MRD positive. A cluster of 20–49 cells is reported as MRD positive but below the lower limit of quantitation (LLOQ). If the measured number of abnormal events in a single tube is less than 20, the MRD evaluation may be reported as MRD negative, or as suspicious for MRD but below the limit of detection (LOD), depending on the absence or presence of a measured abnormal immunophenotype.
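The event-count reporting rules described above can be sketched in a few lines of Python. This is an illustration only, not the laboratory's software: the function name `classify_mfc_mrd` and its parameters are hypothetical, a cluster of exactly 50 events is assumed to fall in the quantifiable positive category (the text gives 20–49 for the band below the LLOQ), and the denominator is assumed to be the total nucleated events collected.

```python
def classify_mfc_mrd(abnormal_events, total_events, abnormal_phenotype_seen=True):
    """Return (category, mrd_percent) using the event-count rules in the text.

    abnormal_events: events in the abnormal cluster within a single tube
    total_events: total nucleated events collected (assumed denominator)
    abnormal_phenotype_seen: whether an abnormal immunophenotype was measured
    """
    if total_events <= 0:
        raise ValueError("total_events must be positive")

    # MRD value as a percentage of total nucleated cells.
    mrd_percent = 100.0 * abnormal_events / total_events

    if abnormal_events >= 50:  # assumption: 50 itself counts as quantifiable
        category = "MRD positive"
    elif abnormal_events >= 20:  # 20-49 events
        category = "MRD positive, below LLOQ"
    elif abnormal_events > 0 and abnormal_phenotype_seen:
        category = "suspicious for MRD, below LOD"
    else:
        category = "MRD negative"
    return category, mrd_percent


if __name__ == "__main__":
    # Hypothetical inputs, not study data.
    for events in (60, 35, 5, 0):
        print(events, classify_mfc_mrd(events, 800_000))
```

The cluster size, rather than the percentage, drives the category here, which is why the same number of abnormal events gives the same call regardless of how many total events were collected.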

ClonoSEQ for B-cell clonality MRD analysis

Detailed methodology for ClonoSEQ assay can be found elsewhere [10]. Briefly, the assay amplifies genomic regions present as diploid copies in normal gDNA to quantify the total nucleated cell content of a sample. A sequence is considered acceptable for tracking if it comprises at least 3% of all B-cell receptor sequences at a given locus and at least 0.2% of all nucleated cells in the sample (dominant sequence). The dominant sequence is well separated from the background repertoire. Sequence uniqueness is assessed by comparison with a large database of previously observed Ig rearrangements. Depending on its incidence in the database, each sequence is assigned a uniqueness score that reflects its likelihood of being detected in a healthy repertoire. Sequences with poor uniqueness scores are excluded from MRD tracking to avoid false MRD results. Once suitable disease-associated sequences have been identified, these ID sequences are compared with those found in successive MRD sample(s) for tracking. MRD value was calculated as the percentage of residual clonal cells over one million nucleated cells.
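The two frequency criteria for a trackable (dominant) sequence can be expressed as a simple check. The sketch below is a simplification under stated assumptions and is not the vendor's algorithm: the function name and parameters are hypothetical, one sequence copy is assumed to correspond to one cell, and the uniqueness score and separation from the background repertoire are not modeled.

```python
def is_dominant_sequence(seq_count, locus_total, total_nucleated_cells,
                         min_locus_fraction=0.03, min_cell_fraction=0.002):
    """Check the two frequency thresholds quoted in the text.

    seq_count: copies of the candidate sequence (assumed one copy per cell)
    locus_total: all B-cell receptor sequences observed at the same locus
    total_nucleated_cells: estimated nucleated cells in the sample
    """
    if locus_total <= 0 or total_nucleated_cells <= 0:
        raise ValueError("denominators must be positive")

    locus_fraction = seq_count / locus_total           # needs to be >= 3%
    cell_fraction = seq_count / total_nucleated_cells  # needs to be >= 0.2%
    return (locus_fraction >= min_locus_fraction
            and cell_fraction >= min_cell_fraction)


if __name__ == "__main__":
    # Hypothetical counts, not study data.
    print(is_dominant_sequence(5_000, 10_000, 100_000))   # both criteria met
    print(is_dominant_sequence(400, 10_000, 1_000_000))   # fails the 0.2% criterion
```

Both thresholds must hold together: a sequence can dominate a sparse B-cell repertoire and still be too rare among all nucleated cells to be tracked reliably.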

Statistical analysis

Statistical comparison of events collected for MFC and ClonoSEQ was performed by Welch two-sample t-test, whereas the correlations between COG MFC and ClonoSEQ MRD test results were evaluated by Pearson correlation analysis. The data were expressed as mean ± standard deviation (SD) and the p-value of less than 0.05 was considered as statistically significant.
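For readers who want to reproduce this kind of analysis, the two statistics named above can be computed with the Python standard library alone. The sketch below is not the software used in the study (which the text does not name); the function names are hypothetical, and `welch_t` works from summary statistics (mean, SD, n) rather than raw data.

```python
import math
from statistics import mean


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch t statistic and Welch-Satterthwaite degrees of freedom."""
    v1 = sd1 ** 2 / n1  # squared standard error of group 1
    v2 = sd2 ** 2 / n2  # squared standard error of group 2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (>= 2)")
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Event-count summary statistics as reported in the Results section.
    t, df = welch_t(779_915, 306_291, 74, 2_571_168, 1_112_501, 74)
    print(f"t = {t:.2f}, df = {df:.1f}")
```

Applied to the event counts reported in the Results (779,915 ± 306,291 vs 2,571,168 ± 1,112,501, n = 74 each), `welch_t` gives a t statistic of roughly -13 with about 84 degrees of freedom, in line with the reported p < 0.05.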

Results

Study population

Patients’ ages ranged from 2 to 76 years, with a median age of 46 years; the cohort included 5 children and 26 adults. Patients’ demographic data, Philadelphia chromosome status, and transplant status are shown in Table 1.

Table 1 Demographic data, Philadelphia chromosome status, and transplant status of the study population

Comparison of MFC and ClonoSEQ for the detection of B-ALL MRD

For cases that were positive by COG MRD MFC, antigen expression patterns including antigen intensity (semiquantitative) were recorded. The frequency of aberrant antigen expression was analyzed and is summarized in Table 2. Figure 1 illustrates an example of a B-ALL antigen expression profile. Antigen expression intensity determination followed the College of American Pathologists (CAP) Flow Cytometry Proficiency Testing recommendations. As shown in Table 3, among the 74 evaluated bone marrow samples from 31 B-ALL patients, COG MFC and ClonoSEQ results were concordant in 59 of 74 samples (80%), with positive concordant results in 12 samples (16%) and negative concordant results in 47 samples (64%). Discordant results were noted in 15 of the 74 samples (20%), of which 14 samples (19%) were positive by ClonoSEQ but MRD negative by MFC. Only 1 sample (1%) was MRD positive by MFC while MRD was not detected by ClonoSEQ. The details of the discordant cases, including clinical status, can be found in Table 4. The average number of events collected for MFC (779,915 ± 306,291, n = 74) was much lower than that for ClonoSEQ (2,571,168 ± 1,112,501, n = 74) (p < 0.05). Figure 2 depicts the total cells analyzed and the MRD value for both MFC and ClonoSEQ for all cases. On further analysis of the cases showing MRD positivity by ClonoSEQ but not by MFC, the MRD values ranged from 1 to 1400 cells/million nucleated cells. It is worth mentioning that 86% of these cases showed MRD values of < 100 cells/million nucleated cells. A strong positive correlation between COG MFC and ClonoSEQ results was noted among all evaluated cases (r = 0.96).
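The concordance percentages quoted above follow directly from the four cell counts of the MFC versus ClonoSEQ comparison (12, 47, 14, and 1 samples). A minimal sketch of that arithmetic, with a hypothetical function name and percentages rounded to whole numbers as in the text:

```python
def concordance_summary(both_pos, both_neg, ngs_only_pos, mfc_only_pos):
    """Summarize a 2x2 method comparison as whole-number percentages."""
    total = both_pos + both_neg + ngs_only_pos + mfc_only_pos
    if total == 0:
        raise ValueError("no samples")

    def pct(count):
        return round(100 * count / total)

    return {
        "total": total,
        "concordant_pct": pct(both_pos + both_neg),
        "both_positive_pct": pct(both_pos),
        "both_negative_pct": pct(both_neg),
        "discordant_pct": pct(ngs_only_pos + mfc_only_pos),
        "ngs_only_positive_pct": pct(ngs_only_pos),
        "mfc_only_positive_pct": pct(mfc_only_pos),
    }


if __name__ == "__main__":
    # Counts taken from the comparison described in the text.
    for key, value in concordance_summary(12, 47, 14, 1).items():
        print(f"{key}: {value}")
```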

Table 2 Antigen expression frequency for B-ALL MRD positive cases using COG MRD MFC analysis
Fig. 1 CD10, CD20, CD38, CD9 and CD58 antigen expression for normal B-cell maturation and B-lymphoblasts. Top row shows normal B-cell maturation pattern where mature B-cells are highlighted in blue and immature B-cells are in aqua. Normal B-cell precursors (hematogones) have bright CD38 and bright CD10, and CD9 is heterogeneously expressed. Bottom row shows residual precursor B-lymphoblasts highlighted in red with dim CD38 and diminished CD10. CD9 is homogeneously expressed. CD58 expression intensity in hematogones and B-lymphoblasts in this case is similar

Table 3 Comparison of MFC and ClonoSEQ on the detection of B-ALL MRD
Table 4 Details of the discordant cases
Fig. 2 Total cells analyzed and MRD values by MFC and ClonoSEQ for all cases. MRD values were calculated as percentage of positive cells over total nucleated cells for MFC or percentage of clonal cells over total nucleated cells for ClonoSEQ

Dominant sequences identified by ClonoSEQ

Newly identified dominant sequences were detected using ClonoSEQ in 2 of the 31 studied patients (6%) during follow-up: 1 newly identified dominant sequence was reported in one case (case 1) and 2 in the other (case 2). In case 1, the patient had Ph+ B-ALL treated with hyper-CVAD and dasatinib, was clinically stable, and had no fever or respiratory symptoms. The follow-up bone marrow at 8 months after initial diagnosis had no original dominant sequence identified, but a newly identified dominant sequence was detected with an MRD value of 4097 clonal cells out of 3,385,437 cells analyzed. The corresponding bone marrow was 60% cellular with trilineage maturing hematopoiesis and 3% blasts. A few small clusters of immature B-precursors were present. A ClonoSEQ assay on a blood sample 6 months later also showed a different newly identified dominant sequence, with an MRD value of 77,500 cells out of 2,282,583 cells analyzed. There was no clinical relapse during 2-year follow-up. In case 2, the patient was clinically stable without fever or respiratory symptoms, about 3 months after 4 cycles of blinatumomab therapy and hyper-CVAD chemotherapy; no transplant was given. The follow-up bone marrow at 15 months after initial diagnosis had no original dominant sequence identified, but two newly identified dominant sequences were detected with MRD values of 2357 and 2886 clonal cells out of 3,703,558 cells analyzed. The corresponding bone marrow biopsy had 1% blasts and 5% plasma cells that were reported as kappa monoclonal by immunohistochemistry. A follow-up bone marrow biopsy 4 months later showed 1% blasts without evidence of B-ALL relapse and persistent 5% monoclonal plasma cells, with neither the original dominant sequence nor the previously identified new dominant sequences detected.
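Because these values are given as raw counts, converting them to cells per million nucleated cells makes them easier to compare with the per-million MRD values quoted elsewhere in the Results. The helper below is a hypothetical convenience function, not part of the ClonoSEQ report, and assumes the stated denominators are total nucleated cells analyzed.

```python
def mrd_per_million(clonal_cells, cells_analyzed):
    """Convert a raw clonal-cell count to cells per million cells analyzed."""
    if cells_analyzed <= 0:
        raise ValueError("cells_analyzed must be positive")
    return 1_000_000 * clonal_cells / cells_analyzed


if __name__ == "__main__":
    # Counts as reported in the text for cases 1 and 2.
    reported = {
        "case 1, marrow": (4_097, 3_385_437),
        "case 1, blood": (77_500, 2_282_583),
        "case 2, sequence A": (2_357, 3_703_558),
        "case 2, sequence B": (2_886, 3_703_558),
    }
    for label, (clonal, total) in reported.items():
        per_million = mrd_per_million(clonal, total)
        print(f"{label}: {per_million:.0f} per million ({per_million / 10_000:.3f}%)")
```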

There was one case where no dominant sequence was identified from the diagnostic sample, yet follow-up samples showed dominant sequences.

Effect of blinatumomab therapy on CD19 measurements and B-ALL MRD detection

Many patients had received blinatumomab immunotherapy as part of the treatment protocol at our institution. MFC and ClonoSEQ were performed 1 to 14 months after blinatumomab therapy on 14 bone marrow samples from 8 patients. Five patients either had relapsed B-ALL or had positive MRD by MFC or ClonoSEQ. All 14 samples were MRD negative by MFC, but 3 were MRD positive by ClonoSEQ. Two of the 3 samples had ClonoSEQ MRD values below the LOD, while one had 12 per million cells. The CD24/CD66b approach was also used to identify abnormal B-cells in these three cases (MFC-/ClonoSEQ+), with the same MFC results. Of note, all three patients with MFC-/ClonoSEQ+ results are currently clinically stable without relapsed B-ALL. B-cells with positive CD19 expression and total CD3 events were measured. On average, about 15,081 normal CD19-positive B-cells were detected, constituting about 1.6% of total white blood cells, while about 25,449 CD3-positive T-cell events were detected, constituting 16% of total nucleated cells measured by SYTO16.

Discussion

It has been extensively studied and widely accepted that MRD status is an important prognostic factor in adult and pediatric B-ALL patients [12, 13]. Currently, the most common methods to test MRD for B-ALL are qPCR and MFC with NGS emerging [13, 14]. Comparing the test characteristics of MFC and NGS for B-ALL MRD will help to develop a future testing algorithm.

Phenotypic features of B-ALL MRD

The interpretation of MRD MFC data, especially at MRD levels < 0.01%, is still expert-based and requires considerable experience [15]. The qualitative designation of positive and negative MRD results largely depends on the presence of clusters of abnormal events. Most hematopathologists/flow cytometrists feel comfortable assigning qualitative significance to a clustered distribution of at least 20 cells (limit of detection, LOD). When a clearly positive cluster of cells is evident that bears a phenotype consistent with the patient’s disease and/or previously measured phenotype but contains fewer than 20 events, suspicion for MRD may be reported. Cases with suspicion for MRD were also classified as positive in our study. Although the number of events can be defined, the distance from normal and the homogeneity of the population are harder to define objectively and are subject to variability in interpretation. To differentiate abnormal from normal events, certain deviations from normal B-cell maturation patterns are helpful, including asynchronous antigen expression (e.g., CD34 expression without CD10), cross-lineage antigen expression (e.g., CD15 expression on the leukemic blasts), over- or under-expression (e.g., brighter CD10 or lack of CD38), and aberrant light scatter [16]. In the COG protocol, MRD was identified based on the position of cells on dual-parameter displays in areas known not to contain any normal elements (so-called empty spaces) [4]. In our study, the most frequent phenotypic changes and the most reliable discriminators were the CD38 and CD10 expression intensities, especially the combination of both. The importance of this finding should be emphasized in B-ALL MRD evaluation. CD38 is usually expressed at lower intensity in B-lymphoblasts (moderate or dim) than in hematogones (bright). CD10 can be expressed either more strongly or more dimly (or be negative) in B-lymphoblasts than in hematogones. Our findings are similar to those reported by others [17]. Negant et al. [18] also noted that the combined use of both markers is more useful for differentiating the two populations than either marker alone. On examining the MFI ratio of CD38/CD10, they reported that this ratio was significantly higher in hematogones than in B-lymphoblasts.

Impact of target therapy and clinical conditions on the MRD monitoring

The use of blinatumomab and other targeted therapies has complicated MRD testing for B-ALL disease monitoring, as pointed out by many investigators [5, 19]. Topp et al. reported that, after blinatumomab, 16 of 20 patients converted from MRD positive to MRD negative as examined by PCR [5]. In our study, we observed three cases where MFC was negative but ClonoSEQ was positive after blinatumomab therapy, while the remaining 5 samples were negative by both MFC and ClonoSEQ. The three MFC-/ClonoSEQ+ cases may represent CD19-negative persistent B-ALL MRD. CD19-negative B-ALL MRD was reported in 21–30% of patients after targeted therapy [22]. However, all three cases in our study had very low ClonoSEQ MRD values, either below the LOD or less than 20 abnormal events per million cells, which is below MFC detection. We did not find the CD24/CD66b approach helpful in identifying abnormal B-cell populations in those three cases. It is challenging to monitor MRD by MFC after CD19-targeted therapy, and NGS or other molecular approaches may be considered in this clinical scenario. Nevertheless, these three patients are clinically stable without clinical relapse of B-ALL. The normal B-cells (1.6% on average) detected by MFC in the 14 bone marrow samples may represent regenerating B-cell precursors and/or mature B-cells. The presence of CD3 T-cells (16% of nucleated cells on average) argues against the loss of the cytotoxic effect of blinatumomab molecules.

B-ALL MRD MFC interpretation is also complicated by other factors that include, but are not limited to, technical constraints such as poor sample quality, low tumor burden, immunophenotypic shifts, and clonal selection [12]. Compared to Ph- B-ALL, MRD monitoring for Ph+ B-ALL patients is less well defined [3]. We had 9 cases of Ph+ B-ALL in our study, and only 2 of the 9 patients had been transplanted. All patients but one were in molecular remission with negative MRD. We did not observe a higher incidence of positive MRD in Ph+ B-ALL cases compared with Ph- cases. Presumably, this might be due to highly effective chemotherapy/immunochemotherapy regimens. In our limited case series, no difference was apparent between Ph+ and Ph- B-ALL in MRD testing by MFC and NGS, but a definite determination requires further investigation.

NGS evaluation for B-ALL MRD

NGS has high sensitivity, assumed to be 10⁻⁷, for MRD detection and has recently been introduced to clinical studies through commercial assays such as LymphoTrack (Invivoscribe) and ClonoSEQ [20]. NGS for MRD testing has also been used for other hematopoietic malignancies [8] such as multiple myeloma [21] and chronic lymphocytic leukemia [22]. NGS-based MRD testing has 1 to 2 logs higher sensitivity than MFC [3]. In our study, the MRD-positive cases detected by ClonoSEQ but not by MFC had low MRD values. One strength of ClonoSEQ for B-ALL MRD evaluation is its ability to track multiple receptor sequences from the same clonal malignant cells; tracking multiple sequences improves the precision of the assay [10].

ClonoSEQ assay requires a diagnostic sample with a relatively high disease burden to identify disease-associated clonotypes, namely dominant sequence(s). This requirement may limit access to ClonoSEQ testing sometimes. At our institution, DNA is routinely extracted from diagnostic bone marrow samples, which are submitted along with MRD follow-up samples. Hematopathologists and clinicians also need to realize that not all expanded clonal gene rearrangements detected by NGS arise from an underlying malignancy. Thus, identification of the diagnostic dominant sequence is very important for MRD measurement. We observed one case where no dominant sequence was identified for leukemic blasts in the diagnostic sample by ClonoSEQ, which may be due to lacking a rearranged Ig locus [9]. Not all lymphoid malignancies necessarily display a detectable Ig rearrangement, and this highlights the importance of concurrent use of other MRD-detecting methods such as MFC. Other causes that may lead to failure to detect any diagnostic clone include primer issues and biologically incomplete gene rearrangement, etc. [21].

In our study, we observed two cases with three newly identified dominant sequences detected by ClonoSEQ, which may represent clonal evolution in at least one case. So far, there is limited research on the clonal evolution of Ig gene rearrangements during B-ALL follow-up. More study is needed to understand the clinical significance of clonal evolution in B-ALL patients.

Comparison of MFC and NGS for B-ALL MRD evaluation: how sensitive is enough?

The comparison between MFC and NGS for B-ALL MRD evaluation has been previously studied. Torra et al. reported at least one immunoglobulin clonal sequence identified in 91% of pre-treatment specimens [14]. In follow-up samples, both MFC and ClonoSEQ were performed and 82% of cases were concordant. Almost all discordant cases were ClonoSEQ positive and MFC negative. Patients with negative MRD by both NGS and MFC had excellent OS and RFS. In contrast, patients who were positive for MRD by both NGS and MFC had the poorest outcome, whereas patients who were NGS positive but MFC negative had an intermediate outcome. In a study by Wood et al. [23], high-throughput sequencing (HTS) was used to compare with COG MFC MRD detection in cases of B-ALL. Using a threshold of 0.01%, both methods showed similar 5-year EFS and OS for MRD-positive and negative patients. However, MRD was detected at levels higher than 0.01% in fifty-five patients by HTS but not by MFC. These cases represented 38.7% of patients and were found to have a worse 5-year EFS compared to other cases who had MRD levels below 0.01% by HTS (p = 0.036). Meanwhile, 17 patients were reported to be MRD + by MFC with values > 0.01% but their MRD values by HTS were found to be < 0.01%. The MRD values for 11 out of the 17 patients ranged from 0.001 to 0.01% by HTS but no further statistical analysis was possible because of the small number of patients in this group (15).

In our study, the two methods (MFC and ClonoSEQ) were compared regarding their ability to detect the presence of any residual leukemic cells. Overall, the concordance rate was 80%. Discordant cases were mostly MFC-negative and ClonoSEQ-positive (n = 14, 19%), whereas only one MFC-positive and ClonoSEQ-negative sample was seen (n = 1, 1%). The correlation of the measured tumor burdens between the two methods in the entire cohort as well as in the concordant cases was very high (r = 0.96). Our findings are similar to those reported by Torra et al., except that we do not have survival data because of the relatively short follow-up period.

ClonoSEQ detected very low levels of residual clonal cells in our study. Some cases were found to have more than 0 but less than 1 residual clonal cell per million nucleated cells. However, in the employed COG MFC MRD panel, a cluster of 20 events is required to report definite residual disease. The clinical significance of such low levels detected by ClonoSEQ and their impact on the patient’s prognosis is unclear at this point. MRD results are highly dependent on the quality and concentration of the samples, which are directly related to preanalytical conditions [22]. At our institution, samples from the 1st and 2nd pulls are submitted for morphological and molecular evaluations (ClonoSEQ included), whereas samples for flow cytometric evaluation come from subsequent pulls; hemodilution of these later specimens may contribute to lower sample quality and a lower incidence of detection of abnormal populations. In our study, the total events obtained averaged 779,915 cells for MFC MRD versus 2,571,168 cells for ClonoSEQ. The volume and cellularity of sampled input material may also be a problem during treatment in cases of bone marrow aplasia [10]. Other previously noted possible explanations for discordance between molecular (such as PCR) and MFC results in MRD evaluation include nonspecific amplification of DNA in PCR, oligoclonality, clonal evolution, quality of clonal PCR markers, immunophenotypic shifts, and immunophenotypic modulation post therapy, among others [24]. The MFC+/ClonoSEQ- case in our study exemplifies that MFC and NGS are complementary for B-ALL MRD testing.

Conclusions

In conclusion, our results show a very strong correlation between COG MFC and ClonoSEQ results among all evaluated cases (r = 0.96) and are concordant in 80% of cases. At this point in time, we believe that both methods are complementary and that using different strategies to detect B-ALL MRD is important. The significance of very low levels of MRD detected by ClonoSEQ is unknown at this time and requires long-term follow-up to evaluate the prognosis of those patients. Clonal evolution may occur, and blinatumomab immunotherapy may impact MFC B-ALL MRD evaluation.