Introduction

Acute leukemia is the most common group of neoplasms diagnosed in the pediatric population. In particular, acute lymphoblastic leukemia (ALL) of B cell lineage (B-ALL) is both the most common subset of acute leukemia in children and notably more frequent than in the adult population. The opposite is true for acute myeloid leukemia (AML), which is relatively infrequent in the pediatric population and much more frequent in adults. In each of these disorders, immunophenotyping has come to play an integral role in the enumeration of the leukemic population, assignment of lineage, identification of prognostic subgroups, and subsequent post-therapeutic monitoring. This review will describe the current status of flow cytometric immunophenotyping as applied to the diagnosis and monitoring of pediatric patients with acute leukemia and suggest areas for future investigation.

Diagnosis

The diagnosis of acute leukemia requires the identification of an expanded population of hematopoietic progenitors having the morphologic appearance of blasts [1]. In the case of AML, a numeric criterion of 20 % blasts by morphology has been established by consensus as the requirement for diagnosis, except when certain cytogenetic abnormalities are present, e.g., t(8;21) or inv(16). However, in ALL, a similar numeric criterion has not been definitively established, although a threshold of 25 % blasts by morphology in the marrow is commonly used to operationally distinguish ALL from lymphoblastic lymphoma. In part, this is a consequence of the difficulty in distinguishing normal B cell precursors (hematogones), which may be expanded in the bone marrow secondary to many non-neoplastic conditions in children, from leukemic blasts by morphology. Flow cytometric immunophenotyping can assist in establishing a diagnosis of acute leukemia by more objectively and definitively both confirming the presence of expanded hematopoietic progenitors and demonstrating immunophenotypic abnormalities on the progenitors that lie outside the perturbations seen during normal marrow regeneration [2]. It is the demonstration of immunophenotypic abnormality that provides specificity for the diagnosis of acute leukemia.

Demonstration of immunophenotypic abnormality

The recognition of immunophenotypic abnormality relies on the immunophenotypic principle that the maturation of normal hematopoietic cells from early progenitors to later-stage forms results in the consistent expression of antigens on cells of a particular lineage at defined stages of maturation. This is a consequence of tight regulation by the underlying genetic program that drives cell differentiation and maturation. When cells become neoplastic, they acquire multiple genetic mutations that disrupt normal genetic regulation, secondarily resulting in changes in protein expression that can be used as markers of neoplasia. Description of the immunophenotypic changes seen with normal maturation and following leukemic transformation is the work of numerous investigators and is summarized here [2].

Normal B cell maturation in bone marrow is characterized by three relatively discrete immature stages of maturation that are the reference points for the recognition of B-ALL. The first stage of maturation shows expression of early antigens CD34 on the cell surface and TdT in the nucleus in conjunction with bright CD10, low CD45, and an absence of CD20 on the cell surface. As the cells move to the second stage of maturation, they lose expression of CD34 and TdT with a decrease in CD10 and increases in CD45 and CD20. The third stage of maturation exhibits a high level of CD20 expression with dim CD10 and a nearly mature level of CD45. All three stages show expression of CD19, CD22, CD24, CD38, and CD58, albeit with slight variation at each stage. Pediatric B-ALL invariably deviates from this maturational scheme in a variety of ways, most commonly showing increased expression of CD10 and CD58 with decreased CD38 and CD45 (see Fig. 1). Infrequent cases show loss of CD10, often in conjunction with a loss of CD24 and acquisition of CD15.

Fig. 1

B lineage acute lymphoblastic leukemia residual disease. CD19-positive B cells consist of a mixture of residual leukemic B cells having abnormal expression of CD10 (increased), CD34 (increased), CD38 (absent), CD58 (uniform and increased), and CD45 (decreased) relative to normal B cell precursors. The leukemic population represents 0.9 % of white cells

Normal T cell maturation occurs in the thymus, so the demonstration of immature T cell populations outside of that organ is a cause for concern. The earliest stage of T cell maturation is characterized by bright expression of CD7, dim CD5, acquisition of cytoplasmic without surface CD3, variable dim CD34, nuclear TdT, low CD45, and little to no CD4 without CD8 or CD1a. Maturation results in loss of CD34, an increase in CD4 followed by CD8 and CD1a coexpression, and an increase in CD5, surface CD3, and CD45, the common thymocyte immunophenotype. Subsequent maturation gives rise to either CD4 or CD8 expression, loss of CD1a, and assumption of mature levels of surface CD3, CD5, and CD45 expression. Pediatric T-ALL deviates from this maturational scheme by showing a generally more homogeneous immunophenotype with a loss of coordinated expression of the antigens just described (see Fig. 2). The most frequent T-ALL immunophenotype resembles the common thymocyte with expression of CD1a and variable coexpression of CD4 and CD8. However, more immature immunophenotypes lacking expression of CD8 and CD1a, having dim to absent CD5, and expressing either early (CD34 and/or HLA-DR) or myeloid (CD13, CD33, and/or CD117) antigens are seen in 10–15 % of cases and termed early thymic precursors (ETP).

Fig. 2

T lineage acute lymphoblastic leukemia residual disease. CD7-positive cells consist of a mixture of residual leukemic T cells having abnormal expression of surface CD3 (absent), CD5 (decreased), CD7 (increased), CD8 (subset decreased), CD45 (decreased), and CD56 (increased on subset) in comparison to normal mature T cells or NK cells. Note that the leukemic population consists of two principal subsets, one more immature lacking CD48 without CD56 and the other expressing bright CD56 with CD8 and slightly decreased CD48, the latter being a more minor component at diagnosis that has expanded following therapy. The leukemic population represents 2.4 % of white cells

Normal myeloid maturation from the hematopoietic stem cell to early lineage-committed progenitors of neutrophilic, monocytic, erythroid, megakaryocytic, and plasmacytoid dendritic cell lineages is complex and somewhat more of a continuum than is seen with T or certainly B cell differentiation. A complete description of the maturational stages for each of these lineages is outside the scope of the manuscript, and the reader is referred to the literature [3]. In brief, the hematopoietic stem cell is characterized by expression of bright CD34 and low to absent CD38 with low CD13, CD33, CD117, CD133, and HLA-DR without lineage-defining antigens. Early neutrophilic differentiation shows acquisition of MPO in the cytoplasm and CD15 on the cell surface with an increase in CD13, CD33, and CD117 and early loss of HLA-DR, in contrast to monocytic differentiation that also shows acquisition of CD15 but with a decrease in CD13, early loss of CD117, and retention of HLA-DR. Early erythroid differentiation is characterized by an increase in CD71 to a high level, early loss of CD13, and retention of CD117 with subsequent expression of CD36 and later CD235a. Megakaryocytic differentiation shows early expression of CD41 and CD61, but the stages of maturation are less well described. AML shows immunophenotypes similar to one or more of the above stages of maturation, often with some degree of abortive or incomplete maturation, but with immunophenotypic deviation when carefully compared with discrete stages of differentiation (see Fig. 3).

Fig. 3

Acute myeloid leukemia residual disease. The progenitor population as defined by CD45 and side scatter contains a residual leukemic population of CD34-positive progenitors having abnormal expression of CD33 (increased), CD38 (absent), CD56 (variable dim), and HLA-DR (decreased) in a background of normal CD34-positive progenitors, plasmacytoid dendritic cells, monocytes, etc. The leukemic population represents 0.5 % of the white cells

Blast enumeration

It is important to recognize that the current numeric criteria for the diagnosis of acute leukemia are defined by morphology, not immunophenotyping. Nevertheless, immunophenotyping can help to clarify morphology in difficult cases, distinguishing normal from abnormal progenitors and providing a somewhat more objective enumeration of leukemic progenitors. There are two important caveats when comparing morphologic and immunophenotypic progenitor enumeration. The first is that morphologically defined blasts typically consist of multiple discrete immunophenotypic stages of progenitor maturation, so one must sum all appropriate immature immunophenotypic stages to approximate the morphologic counterpart. This is particularly important when the leukemic progenitors lack expression of typical immature antigens such as CD34 or CD117, e.g., in monocytic leukemias. The second is that bone marrow samples are always variably hemodilute, and this can result in different proportions of cells in comparison to morphologic preparations where intact marrow spicules are selected for smear preparation. A related issue is that due to concerns about lysis or loss of immature erythroid cells during preparation for flow cytometry, progenitors are commonly reported as a percentage of CD45-positive or non-erythroid events, a different denominator than used for morphologic enumeration. The concerns regarding nucleated erythroid cell loss are somewhat exaggerated, and actual erythroid underestimation does not occur to any significant degree provided appropriate sample processing and instrument setup are practiced [4, 5]; nevertheless, the use of a CD45-positive denominator persists. Hemodilution remains a real issue without a reliable method for correction.
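The effect of the denominator on the reported progenitor percentage can be illustrated with a small calculation. The following Python sketch is illustrative only: the event counts, the stage labels, and the function name percent_of are hypothetical and are not drawn from any study. It also makes the simplifying assumption that all nucleated erythroid events are CD45-negative and all other events, including the immature populations, are CD45-positive.

```python
def percent_of(numerator_events: int, denominator_events: int) -> float:
    """Return the numerator as a percentage of the denominator."""
    if denominator_events <= 0:
        raise ValueError("denominator must contain at least one event")
    return 100.0 * numerator_events / denominator_events


# Hypothetical event counts for one sample (not real data).
# Several immature immunophenotypic stages are summed to approximate
# the morphologic blast count, as described in the text.
immature_stage_events = {
    "CD34-positive progenitors": 9_000,
    "CD34-negative immature monocytic cells": 3_000,
}
cd45_positive_events = 80_000        # assumed to include the immature stages
nucleated_erythroid_events = 20_000  # assumed CD45-negative

blast_equivalent = sum(immature_stage_events.values())
all_nucleated = cd45_positive_events + nucleated_erythroid_events

# Same numerator, two different denominators.
pct_of_all_nucleated = percent_of(blast_equivalent, all_nucleated)
pct_of_cd45_positive = percent_of(blast_equivalent, cd45_positive_events)

print(f"{pct_of_all_nucleated:.1f} % of all nucleated events")
print(f"{pct_of_cd45_positive:.1f} % of CD45-positive events")
```

With these assumed counts, the same immature population is 12.0 % of all nucleated events but 15.0 % of CD45-positive events, which is why the denominator should be stated when immunophenotypic and morphologic counts are compared.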

Lineage assignment

The determination of lineage is one of the principal and most important uses of immunophenotyping in acute leukemia diagnosis. Distinguishing lymphoid lineage from non-lymphoid is of particular importance as current therapies for these two classes of disease differ significantly. As a general principle, the lineage of acute leukemia is established by comparison of the composite immunophenotype for the leukemia with that of its closest normal counterpart using antigens that have some degree of specificity for the lineages of interest (see Table 1). In most cases, examination of surface antigens alone is sufficient to establish lineage, but in cases where the lineage is ambiguous, evaluation of cytoplasmic antigens that are believed to appear earlier and have a higher degree of lineage specificity may be required. Cases where insufficient lineage-associated antigens are identified to allow confident lineage assignment are generally termed undifferentiated or indeterminate for lineage. Leukemias that exhibit expression of antigens from more than one lineage either on the same progenitor population (biphenotypic) or on different discrete progenitor populations within the same sample (bilineal) may be termed mixed phenotype acute leukemia, but the current definitions in the WHO classification are purposely somewhat ambiguous and reflect general uncertainty as to how such cases should be identified and treated [1].

Table 1 Antigens commonly used for flow cytometric lineage assignment

Subclassification

Immunophenotypic subclassification of acute leukemia is of decreasing relevance as much of the prognostic information is redundant with that identified more specifically by cytogenetic and/or molecular evaluation. Nevertheless, there are some general associations of immunophenotype with specific molecular/cytogenetic lesions that can be diagnostically useful. In B-ALL, the absence of CD10 and presence of CD15 expression are associated with the presence of abnormalities of the MLL gene, typically t(4;11), and are considered a poor prognostic sign [7]. B-ALL containing t(12;21) typically lacks expression of both CD9 and CD20 [8], and B-ALL containing t(9;22) generally shows expression of CD13 and/or CD33 [9]. A recently identified poor prognostic subset of B-ALL having a Ph-like gene expression signature contains translocations or deletions involving the CRLF2 gene that result in CRLF2 overexpression, a finding that can be reliably detected by flow cytometry [10]. In T-ALL, the ETP immunophenotype has been associated with a poorer clinical outcome in comparison with the common thymocyte immunophenotype [11], but this difference appears to have been abrogated by modern chemotherapeutic strategies (manuscript in preparation). t(15;17) AML has a promyelocytic immunophenotype but with characteristically elevated expression of CD33, absence of CD34 and HLA-DR, and low to absent expression of CD15 [12], an immunophenotype that is important to recognize so appropriate therapy may be initiated while confirmatory FISH or PCR is performed. t(8;21) AML invariably shows some combination of increased expression of CD34, CD56, CD19, and/or TdT [13].

Monitoring

Assessment of residual disease following therapy has emerged as one of the most important applications of flow cytometry to acute leukemia. The principles used are essentially the same as those for diagnostic immunophenotyping but require more careful attention to technical details that can give rise to artifact and a more highly informative antibody combination to allow detection of smaller populations of abnormal cells in a predominantly normal background [14, 15]. Two basic methodologic approaches have emerged for this application. The leukemia-associated immunophenotype (LAIP) approach evaluates acute leukemia at diagnosis with a particular reagent panel and defines regions where leukemic events are present outside of those seen during normal maturation for the relevant lineage [16]. These predefined regions are then employed for subsequent samples using the informative reagents from diagnosis, and events appearing in the predefined regions are counted as residual disease. While this method does work in some circumstances, it assumes stability of immunophenotype both for the leukemia and background normal or regenerating populations and consequently can give rise to both false positive and negative results. The other major approach relies on the identification of discrete populations of events that have an immunophenotype that differs from normal cells of similar type, i.e., difference from normal. A principal advantage of this approach is that even major shifts in immunophenotype can be detected provided they do not revert to normal, which is rare. In addition, knowledge of the pretreatment immunophenotype is not required, unlike with the LAIP approach, although such knowledge can be very helpful as a starting point for evaluation and may improve the sensitivity of the assay. In practice, both approaches are often used simultaneously to varying degrees.
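The dependence of the LAIP approach on a stable immunophenotype can be made concrete with a toy example. The Python sketch below illustrates only the idea of counting events in a region fixed at diagnosis; the marker intensities, gate boundaries, and function names are hypothetical, and real analyses use many more parameters and far more events. It shows how a shift in antigen expression after therapy can move a leukemic event outside a predefined region and yield a false negative result.

```python
from typing import Dict, List, Tuple

Event = Dict[str, float]               # marker name -> intensity (arbitrary units)
Gate = Dict[str, Tuple[float, float]]  # marker name -> (low, high) bounds


def in_gate(event: Event, gate: Gate) -> bool:
    """True if the event lies inside every marker range of the gate."""
    return all(low <= event[marker] <= high
               for marker, (low, high) in gate.items())


def percent_in_gate(events: List[Event], gate: Gate) -> float:
    """Percentage of events that fall inside the predefined region."""
    if not events:
        raise ValueError("no events to evaluate")
    hits = sum(1 for event in events if in_gate(event, gate))
    return 100.0 * hits / len(events)


# Hypothetical region defined at diagnosis: bright CD10 with low CD45.
laip_gate: Gate = {"CD10": (800.0, 10_000.0), "CD45": (0.0, 200.0)}

# Hypothetical normal background events, all outside the region.
normal_background: List[Event] = [
    {"CD10": 300.0, "CD45": 400.0},
    {"CD10": 50.0, "CD45": 900.0},
    {"CD10": 20.0, "CD45": 1000.0},
]

leukemic_stable: Event = {"CD10": 2000.0, "CD45": 100.0}
# Same cell type after a hypothetical therapy-induced decrease in CD10.
leukemic_shifted: Event = {"CD10": 500.0, "CD45": 100.0}

stable_result = percent_in_gate(normal_background + [leukemic_stable], laip_gate)
shifted_result = percent_in_gate(normal_background + [leukemic_shifted], laip_gate)

print(stable_result, shifted_result)
```

In this toy data set the stable leukemic event is counted, whereas the shifted event is missed even though it still differs from every normal background event. A difference-from-normal evaluation would compare it against the normal populations rather than against the diagnostic region.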

The specimen to be evaluated for residual leukemia in nearly all studies has been the bone marrow aspirate, largely by historical convention; however, the use of peripheral blood for this purpose is attractive for the ease of obtaining the sample, reduced patient discomfort, and decreased cost. It is now clear that there is a poor correlation for the enumeration of residual disease between blood and marrow for B-ALL, while in T-ALL, the correlation is relatively better [17]. In AML, there is a correlation between blood and marrow for residual disease, but the values seen in blood are on average 1 log lower [18]. This suggests the blood may be a suitable specimen for residual disease assessment in T-ALL and perhaps in AML, but not in B-ALL. Nevertheless, essentially all current protocols continue to use bone marrow as the specimen of choice.

Immunophenotypic instability

Progenitor cells and other immature cell types have an inherent capacity for maturation and differentiation, and this is retained to a variable degree in neoplasms derived from these cells, in particular, acute leukemia. Consequently, it is perhaps not surprising that acute leukemia often shows some change in immunophenotype under the influence of therapy. This has been well documented in B-ALL, where the use of steroids during induction therapy can induce the expression of more mature antigens (e.g., CD20 and CD45) and reduce the expression of immature antigens (e.g., CD10 and CD34) [19, 20]. The phenomenon may be partly reversible, as immunophenotypes at relapse in B-ALL often more closely resemble those seen at diagnosis rather than at earlier post-therapy time points where residual disease is detected [21]. Similar immunophenotypic changes have been noted in T-ALL [22] and AML [23–25], the latter sometimes in association with the appearance of new cytogenetic or molecular abnormalities. Additionally, inherent tumor heterogeneity within AML further complicates residual disease monitoring [26]. The practical implication is that one should not rely on single antigenic abnormalities identified at diagnosis to monitor disease after therapy, but rather should evaluate for as many immunophenotypic abnormalities as available. In addition, panels of reagents should be used that are broad enough to allow the identification of new antigenic abnormalities that may arise secondary to therapy. At present, there is relatively poor standardization of reagent panels for this application, although efforts are underway to improve the situation. An example of reagent panels and methodology used in our laboratory for B-ALL, T-ALL, and AML residual disease monitoring has been previously published [27].

Quantitation

The same issues for blast or progenitor enumeration discussed above apply to the quantitation of residual disease after therapy. However, it is important to recognize that different denominators are currently used between studies, which can complicate the ability to directly compare data. Much of the early data on the flow cytometric monitoring of residual disease was generated using Ficoll processing of specimens, in part in an effort to compare with molecular techniques which commonly used this method as a preparatory step. Since Ficoll depletes samples of mature granulocytes and can otherwise more subtly alter the composition of the sample, the denominator is different from that obtained by more recent whole blood lysis techniques. In an attempt to allow comparison with both prior molecular and flow cytometric data, as well as to minimize the impact of neutrophil degeneration with transport, the Children’s Oncology Group for ALL studies adopted a denominator that includes all nucleated cells using a nucleic acid-binding dye and excludes maturing myeloid cells using high side scatter, producing a denominator composed of nucleated mononuclear cells [27]. Both of these denominators differ from those used more routinely in clinical immunophenotyping and on some clinical trials where AML is monitored after therapy, namely a denominator of all CD45-positive or non-erythroid events. At present, there is no systematic method to adjust for the different denominators in use; however, these denominator effects are likely to produce less than a twofold difference in quantitation, and since residual disease is generally evaluated using a logarithmic scale relative to outcome, the practical impact is likely to be relatively minor.
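The statement that a twofold denominator effect is minor on a logarithmic scale can be checked with simple arithmetic. The Python sketch below is illustrative only: the event counts and variable names are hypothetical, and the two denominators are stand-ins for a mononuclear cell denominator and an all CD45-positive denominator rather than values from any protocol.

```python
import math


def residual_percent(leukemic_events: int, denominator_events: int) -> float:
    """Residual disease as a percentage of the chosen denominator."""
    if denominator_events <= 0:
        raise ValueError("denominator must contain at least one event")
    return 100.0 * leukemic_events / denominator_events


# Hypothetical counts: the same leukemic events, two denominators that
# differ by a factor of two (the upper bound suggested in the text).
leukemic_events = 50
mononuclear_denominator = 250_000
cd45_positive_denominator = 500_000

pct_mononuclear = residual_percent(leukemic_events, mononuclear_denominator)
pct_cd45_positive = residual_percent(leukemic_events, cd45_positive_denominator)

# Difference between the two results expressed in log10 units.
log_difference = math.log10(pct_mononuclear / pct_cd45_positive)

print(f"{pct_mononuclear:.3f} % vs {pct_cd45_positive:.3f} %")
print(f"difference: {log_difference:.2f} log10 units")
```

A twofold difference corresponds to log10(2), or about 0.3 log, which is small relative to the tenfold (1 log) steps on which residual disease levels are usually compared with outcome.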

Significance

The detection of residual disease after therapy by either flow cytometric or molecular methods has emerged as one of the most important prognostic indicators identified in acute leukemia. In B-ALL, numerous studies have shown that the presence of residual disease detected in the bone marrow within the first 1–3 months after therapy is strongly associated with a poorer outcome [28]. The largest of these studies is from the Children’s Oncology Group and demonstrated a progressive reduction in overall survival and event-free survival correlated with increasing levels of residual disease detected at day 29 after induction therapy [29]. Patients with undetectable residual disease (<0.01 %) had the best outcomes, and this represents the generally achievable sensitivity for current flow cytometric assays. Residual disease detection was also able to identify a subset of patients with poorer outcomes in otherwise good risk groups, e.g., t(12;21) or trisomies of chromosomes 4 and 10 (+4, +10), suggesting that the absence of residual disease is not simply a surrogate for other good risk features. Patients with residual disease detected further from therapy at the end of consolidation, while small in number, were shown to be a particularly poor outcome subset. Interestingly, the presence of detectable residual disease was also associated with an increased risk of both early (<3 years) and late (>3 years) relapse. The presence of residual disease prior to and following bone marrow transplantation is also associated with an inferior outcome in B-ALL [30]. As a result of these studies, residual disease assessment in pediatric B-ALL is rapidly becoming the standard of care for this disease.

In T-ALL, few sizable trials incorporating residual disease assessment have been published, the largest being from the AIEOP-BFM group where PCR was used for residual disease monitoring [31]. In that trial, the presence of residual disease in bone marrow (>0.01 %) was associated with a poorer outcome at both day 33 and day 78 after induction therapy with the latter being used to define high risk (>0.1 %). The more frequent presence of residual disease at the early time point in comparison with B-ALL, despite relatively good overall outcomes, suggests that the kinetics of response in T-ALL to this therapeutic regimen are slower in comparison with similar regimens used in B-ALL. This has recently been confirmed in a large trial of T-ALL performed by the Children’s Oncology Group using flow cytometric residual disease monitoring (manuscript in preparation).

In AML, multiple studies have been published in both children and adults demonstrating a similar relationship between the presence of residual disease after therapy and poorer outcome as assessed by either level of disease detected or log reduction in leukemic burden after therapy. In adults, the presence of residual disease at either end of induction or end of consolidation is associated with a worse outcome, and there is some suggestion that end of consolidation may be a more informative time point for residual disease assessment in AML [32–35]. In children, the presence of residual disease at the end of either 1 or 2 blocks of induction therapy is associated with decreased relapse-free survival and a higher risk of relapse, including an increased risk of relapse for patients in whom residual disease is detected at the end of block 1 but not block 2 [36, 37]. However, differences exist between studies as to the significance of residual disease below 1 % at the end of 1 block of therapy and whether there is as strong a relationship between the level of residual disease and outcome as that seen in B-ALL, perhaps related to the smaller size of the cohorts available for study. There is also the suggestion that residual disease detection in pediatric AML may be most relevant for those of standard or perhaps high cytogenetic/molecular risk. Interestingly, there is a reported lack of concordance between molecular (using fusion transcripts) and immunophenotypic assessment of residual disease post-induction that favors a stronger association between outcome and flow cytometric determination of residual disease at these relatively early time points [38], perhaps related to persistence of the molecular lesions in more differentiated cells that lack neoplastic potential or immature cells that are eventually eradicated by other mechanisms.
The presence of residual AML prior to bone marrow transplantation in either CR1 or CR2 is associated with an inferior outcome [39–41] for both ablative and non-ablative marrow transplantation [42]. While the data on the significance of residual AML detection are increasingly compelling, these assays remain the most challenging to implement and are still largely the domain of centralized clinical trial laboratories.

While the level of residual disease achieved at particular time points after therapy is clearly important, these determinations in part reflect the kinetics of response to induction therapy and suggest that evaluation of blast clearance at earlier time points after therapy might also be correlated with outcome. In B-ALL, the level of detectable residual leukemic blasts in peripheral blood at day 8 after induction therapy has been shown to correlate with outcome and provides independent prognostic information for those patients in whom no residual disease (<0.01 %) is detected at the end of induction [29]. Similarly, the rate of clearance of leukemic blasts from peripheral blood in AML predicts blast clearance in the marrow at day 14 after induction and correlates with relapse-free survival [43–46].

Future

Although flow cytometry currently plays a central role in the diagnosis of acute leukemia through leukemic progenitor identification and lineage assignment, a lack of standardization and subjective data interpretation are limitations felt more acutely in more challenging applications such as residual disease monitoring. Current molecular methods for residual disease monitoring in ALL require the generation and validation of patient-specific primers and are unlikely to be widely adopted due to their technical complexity and cost, while in AML, there is a dearth of targets suitable for general monitoring. Nevertheless, molecular methods continue to make major technological advances, as evidenced by next-generation or high-throughput sequencing (HTS), and have the potential to ultimately replace flow cytometry for residual disease monitoring. The proof of principle for the use of HTS in the monitoring of residual disease in B-ALL [47, 48] and T-ALL [49] through sequencing of polymorphic immunoglobulin or T cell receptor genes has already been established, and commercial testing is available. In AML, the absence of suitable polymorphic loci and the diversity of genetic mutation suggest that multiplexed sequencing of multiple loci will be required and must be coupled with error correction methods in order to achieve the necessary levels of sensitivity. Our laboratory has developed such an HTS assay for residual disease monitoring using NPM1 as a target that demonstrates the feasibility of this approach for AML [50]. These approaches have the potential to standardize methodology, reduce subjectivity, and increase the sensitivity of assays for residual disease monitoring; however, they are largely extensions of current approaches concerned with enumeration of bulk leukemic populations and provide little information on the spectrum of oncogenic mutations present within the leukemic population.
In order to address questions regarding the quality of remission, diversity of the residual leukemic population, and potential for leukemic relapse, methodologies must be developed that allow the efficient, correlated, and sensitive detection of multiple molecular genetic abnormalities at the single-cell level. This is the logical extension of flow cytometry in the molecular era.