To understand cell dynamics in a living organism, it is crucial that any tool for observation—be it an experimental, theoretical, or metaphorical tool—interferes as little as possible with the system being observed. The heat of a lamp or the disturbance of a needle might perturb the behavior of the biological system and lead to erroneous conclusions about it. In fact, it is well-known from fundamental physics that any observation perturbs the observed system, and hence that any experiment should minimize such unwanted perturbations [1]. Hence, it can be preferable to deduce biological insights, not only from direct measurements, but also indirectly based on non-invasive observations. One particular tool for such a task is mathematical modelling. By mathematically representing key components and mechanisms of a biological system, it is possible to carry out calculations and mathematical analysis, granting insight about the mathematical representation which may also hold for the biological system being modelled.

Simultaneously, mathematical modelling can be a tool for verifying if observed phenomena follow naturally from a particular proposed biological mechanism, or if some other explanation agrees better with observed data.

An area of research where mathematical modelling has been used recently is in understanding hematopoiesis, the process of blood cell production. Hematopoiesis arises from hematopoietic stem cells (HSCs), which are typically found within the bone marrow (BM). While the observation of blood cells and their abundance can be informative about hematopoiesis, the specific properties and dynamics of HSCs in their natural setting remain elusive. Mathematical modelling is one method for investigating HSCs via indirect observations from blood samples and for verifying or proposing novel ideas and hypotheses about the mechanisms underlying HSC regulation.

In this article, we give an overview of the role mathematical modelling has played in understanding the behavior of HSCs. Some of the approaches and concepts are illustrated in Fig. 1 for reference.

Fig. 1
figure 1

Mathematical modelling of hematopoietic stem cells covers a wide range of approaches and ideas. In the figure, some example concepts are illustrated. These illustrations do not reflect any particular model described in this paper but are only conceptual. Left panel: Starting from biological knowledge, the first step in mathematical modelling is to construct an abstract representation of the system considered. Such abstract representation could, as an example, be a diagram of cell behavior. A different abstraction is the compartment diagram, in which each group of cells is illustrated as a compartment, with arrows representing flows of cells from one compartment to another. In such diagrams, compartments do not need to correspond to biological compartments but can resolve more or less details of cellular sets or populations. Middle panel: There exist multiple mathematical or computational frameworks to represent a model. Common examples are systems of ordinary or stochastic differential equations, or individual-based/agent-based models, but a multitude of other frameworks has emerged. Right panel: Through analysis and simulation of a mathematical model, it is possible to investigate different aspects, and in turn provide potential insights into the real system being modelled. At the top, we illustrate how a model could investigate the effects of hypothetical medical interventions, while the bottom figure illustrates how differences in cell properties can yield different outcomes of some, here unspecified, outputs of interest

The Benefit of Mathematical Modelling

Mathematical modelling has long been used to investigate biological systems, particularly the human body. In a recent overview, Mackey and Maini [2] address the question “What has mathematics done for biology?” and discuss significant results about biology arising from mathematical investigations, with examples from diverse areas like pattern formation during embryonic development, atherosclerosis and neuro-oncology, among others.

In addition to the challenge of obtaining data without disturbing the biological system, as discussed above, ethical concerns may result in additional restrictions on which type of investigation is possible. Hence, mathematics is not simply a tool for making forecasts from existing data but can also contribute—as a “mathematical microscope”—to access otherwise hardly accessible properties of a biological system [3]. Using mathematical modelling, the relation between biological mechanisms and observations can be quantitatively understood, and novel ideas can be studied mathematically before doing costly experiments or clinical trials.

Most malignant hematological diseases are driven by (cancer) stem or progenitor cells, but only limited information about their dynamics can be derived from clinical measurements, e.g., bone marrow biopsies. Due to the discomfort imposed on the patient and potential risks, marrow biopsies are acquired at a lower frequency compared to blood samples. Mathematical modelling makes it possible to extract additional information from blood samples compared to current clinical practice. In particular, mathematical models can link peripheral blood cell counts to processes in the bone marrow and can, therefore, be used to make implications about stem and progenitor cell abundance and dynamics based on the blood samples which are routinely taken during follow-up examinations. This is clinically relevant since it illuminates the stem cell dynamics and their role in disease progression [4,5,6,7,8, 9••]. Similarly, mathematical models can extract complementary information about stem or progenitor cell kinetics by combining marrow and blood samples [8, 9••, 10, 11].

A broad range of quantitative methods in biology and medicine rely on mathematical models. However, most often the mathematical models are hidden in lab equipment or protocols of data analysis. Well-known examples encompass CT-, MR-, MRI-, ultrasound scanners, measurement of blood pressure by plethysmography and cuffs, administration of anesthesia [3, 1215], or regression models establishing correlations between biomarkers and quantities of interest.

Mathematical Models of HSCs

In this section, we supplement recent reviews of mathematical modelling of stem cell function written by Stiehl and Marciniak-Czochra [16], and Brunetti, Mackey, and Craig [17].

In 1978, Mackey proposed a mathematical model [18] which inspired many of the mathematical models of blood cell formation that we describe in this article. The model proposed by Mackey describes the dynamics of a population of pluripotent stem cells, explicitly considering the cell cycle with HSCs shifting between a proliferating phase and a \({G}_{0}\) resting phase. Analyzing the model, the author connects changes in stem cell proliferation with the production of blood cells. The author focuses on a group of specific disorders characterized by periodic oscillations of mature blood cell counts. Mathematical models of this class of diseases are also reviewed by Dale and Mackey in 2015 [19]. The early paper by Mackey is an illustrative example of how mathematical modelling can be used to relate cell properties, e.g., the rate of apoptosis to dynamics of the entire pool of HSCs.

In a later paper [20], Mackey shows that the equivalent of the total human body weight worth of blood cells is produced about every 7 years in the typical human adult. In the same paper, the author relates an extended version of his previous model to data from an HSC-labelling experiment, estimating the differentiation rate of HSCs in humans to be between 0.01 and 0.02 days−1 and the apoptosis rate between 0.07 and 0.23 days−1 in a steady-state situation, i.e., in a situation where cell counts remain constant. In a similar effort to understand HSC processes, Abkowitz and colleagues [2124] relate stochastic models of HSCs to in vivo data of murine, feline, and human origin, and are able to obtain estimates of, e.g., the replication rate of human HSCs. Furthermore, the authors suggest that the number of HSCs is comparable across different mammals, with estimates for humans observed in the blood, the success is onlyin the range of 11,400 to 22,400 [23]. Based on this work and [22], Catlin et al. [24] use 11,000 HSCs as an estimate for the steady-state size of the HSC pool, a number which has been used subsequently in several other models of HSC dynamics. Further examples of the application of mathematical models to infer cell kinetics based on labelling experiments in animals can be found in [25••] and [26].

Another early approach to model HSC dynamics is proposed in the work of Dingli and Michor [27]. In this model, hematopoiesis is described by a system of ordinary differential equations, in which one of the variables represents the pool of HSCs. The authors consider a rate of self-renewing production of HSCs, a rate of apoptosis as well as a rate of production of differentiated mature blood cells. The model is then extended for a population of leukemic stem cells (LSCs) with similar behavior, where division rates of the healthy HSCs and the LSCs are negatively regulated by the total number of stem cells. The resulting model captures how healthy hematopoiesis is suppressed by an increase in LSCs as hematologic malignancy progresses. The most significant result of the mathematical analysis is provided in the title: “Successful therapy must eradicate cancer stem cells” [27]. The high genetic similarity of diagnosis and relapse samples in acute myeloid leukemia, a severe blood cancer, supports this claim. Observations in sequencing studies are in line with the concept that the relapse is either triggered by cancer stem cells which have survived therapy or by malignant or pre-malignant stem cells which have acquired additional mutations [54]. This suggests that while some leukemia therapies may be successful at reducing the disease burden observed in the blood, the success is only temporary if they do not act on the population of HSCs and LSCs. This conclusion has later been confirmed by related models [6, 28]. In their later work, Dingli and coworkers propose models of hematopoiesis which account for multiple immature cell states [29, 30]. These models are used to predict the number of mitotic events linking the stem cell state to the mature cell compartment and to infer the phenotypic consequences of a mutation linked to oscillating blood cell counts.

Another approach to the mathematical modelling of HSCs is described in the model proposed by Roeder and Loeffler [31] and further investigated together with coworkers [3234]. In the model, cells are assumed to move between two distinct growth environments, named GE-A and GE-Ω. In one of these abstract environments, GE-Ω, the cells are actively cycling and dividing while they are quiescent (non-dividing) in the other environment, GE-A. Each cell is characterized by its cycling status and its affinity to the GE-A environment. In contrast to other HSC models where a predefined pool of HSCs divides according to a model recapitulating the classical cell cycle phases, the model of Roeder and Loeffler [31] considers stemness and cycling activity as a property that emerges from cells dynamically switching between the two environments. The switching of a given cell between the environments is described as a function of its affinity to the GE-A environment and of the cell counts. Affinity is preserved or raised during quiescence while division degrades affinity. When affinity is sufficiently small, the cell is considered as terminally differentiated and unable to return to quiescence. Roeder and Loeffler [31] show through model simulations that the model is consistent with various experimental results from the literature, including those discussed by Abkowitz et al. [21]. Hence, the model provides a framework to understand experimental data, with the perspective that one should not only consider cell stemness as intrinsic to the cells but rather consider the interaction between HSCs and their environment. The notion of HSC reversibly exiting and re-entering quiescence, possibly adapting dynamically to external signalling, is in agreement with the notion of bone marrow stem cell niches, i.e., specific micro-environments supporting HSC quiescence and repair. A thorough review of the biological research on HSC niches is given by Wilson and Trumpp [35]. Trumpp, Essers, and Wilson [36] suggest that hematologic malignancies could be efficiently treated by combination therapy in which one drug stimulates the activation of quiescent HSCs while another drug then eradicates the newly activated non-quiescent HSCs. This is later investigated in the work of Glauche et al. [37] by adapting the model from [31], providing a mechanistic explanation behind the efficacy of combination therapy of chronic myeloid leukemia (CML).

A recent study by Ashcroft and colleagues models the binding of HSCs to HSC-specific niches in the bone marrow [38••]. In the model, HSCs detach from the niche and enter the peripheral blood at a certain rate. Upon returning to the bone marrow, the HSCs can reattach to unoccupied niches. Natural death of HSCs is assumed to occur more frequently in the peripheral blood, while only niche-bound HSCs divide. In the model, the majority of HSCs are assumed to be attached to a niche and quiescent, with a low number of unoccupied niches available in steady-state hematopoiesis. While Ashcroft et al. specifically relate the model to murine data, the model and related results should be valid for human HSCs as well. By using ordinary and stochastic differential equations, the authors investigate the prerequisites for clonal dominance in the stem cell niche. They conclude that clonal dominance in mice requires a selective advantage and cannot be the result of neutral drift. Furthermore, their model is used to investigate HSC dynamics after bone marrow transplantation. In this context, transplantation with and without prior emptying of the niches by chemotherapy (so-called preconditioning) is considered. Following the transplantation of HSCs without selective advantage into the blood stream, the engraftment of the transplanted cells in the bone marrow is limited by the number of unoccupied niches. The model can explain the experimental observation from Bhattacharya et al. [39, 40] that multiple smaller transplantations over a few days can lead to higher uptake of HSCs into the murine bone marrow compartment compared to a single-bolus transplant [38••]. The authors also use the model to predict the chimerism (abundance of donor-derived cells) after the transplantation of cells without selective advantage. Furthermore, Ashcroft et al. [38••] investigate how the probability of reconstitution after preconditioning depends on the transplanted stem cell dose. Special attention is paid to the scenario where only a single cell is transplanted. Experimental verification of such scenarios can be challenging, although not entirely impossible [41]. This is an example of an experiment that can easily be carried out in silico in a mathematical model. A related example is how patients, which in reality can only receive one treatment at a time, would have responded to a different treatment protocol or a different stem cell dose [42]. Other models of bone marrow transplantation include [43, 44] for the human and [45••] for the murine case.

Based on the observation that HSC counts are reduced in many acute myeloid leukemia (AML) patients [46], Stiehl et al. and Wang et al. [9••, 46] propose a model where HSCs and LSCs share a joint stem cell niche. This model is an extension of previous models which have been shown to recapitulate clinical blood sample data over time following bone marrow transplantation [42, 43] and the growth of leukemic cells [8, 9••, 10]. The model is given by a system of ordinary differential equations. Stem cell dynamics in the niche are coupled to the time evolution of progenitor, precursor, and mature cells as well as leukemic blasts. Clonal heterogeneity of HSCs and in turn the expansion of, e.g., a leukemic clone in the model of [38••] requires the presence of unoccupied niches and spontaneous detachment of HSCs from the niches. The model proposed in [9••, 46] considers a scenario where LSCs can actively dislodge HSCs from the niche. Since stem cells require the niche to maintain stemness properties, an invading clone has to conquer niche spaces to expand. Upon division of a LSC one of the two progeny occupies the niche of the parent, whereas the other attempts to dislodge an HSC from the niche. If this is successful, the dislodged HSC differentiates and the LSC maintains its stemness. In the opposite case, i.e., after a certain number of futile dislodgement attempts, the LSC differentiates. Similar assumptions hold for HSCs; however, to observe the expansion of leukemic cells, the probability that LSCs dislodge HSCs must outweigh the probability that HSCs dislodge LSCs. A leukemic clone with a sufficiently pronounced ability to dislodge HSCs is able to take over the niche, while the blood production might initially only show minor signs of the disease since it is transiently maintained by long-term progenitors. This model can explain the observation from [46] that AML patients with low HSC frequencies (in [46] defined as HSC frequencies below 15% of the median HSC frequency of healthy human individuals) have a poor survival. Furthermore, it is in line with the observation that HSC counts decrease before overt relapse of the AML [46]. Models assuming that HSCs and LSCs reside in separate niches and interact via systemic feedbacks that depend on the mature cell and leukemic blast counts cannot reproduce these observations [46]. A quantitative version of this model can be used to understand how the competition of stem cells for niche spaces impacts the clinical course of AML [9••, 11]. The authors develop a model-based risk stratification which uses HSC and blast counts at the time of AML diagnosis [9••] and show that this approach allows to refine the prognostic scoring of the patient cohort from [46].

In most of the models discussed above, the number of HSC niches is assumed to be independent of the dynamics of the HSC pool. To investigate the effect of feedback from HSCs to the niche-forming cells, [47] explicitly model the population of niche cells. Under the assumptions that niche cells undergo apoptosis in the absence of HSCs and that contact of HSCs to niche cells induces quiescence of HSCs but proliferation of niche cells, the model exhibits a homeostatic state. After perturbations, the model returns to homeostasis in a dampened oscillatory manner reminiscent of integral feedback control. The work of [47] illustrates that explicit modelling of the cells constituting the BM-niche can provide additional insights into the regulations of HSC dynamics.

Finally, recent work by the authors of this review aimed to model competition between HSC clones through niche occupation. Inspired by many of the models described above, a model was proposed in which competition for a limited pool of niches naturally leads to an expression of HSC fitness [48]. When two clones compete for the same niche space, e.g., in a situation where a leukemic clone has arisen due to mutations, the clone-specific fitness determines which of the two clones will eventually out-compete the other. In a scenario where the two clones have the exact same fitness, the clones will co-exist indefinitely. The model provides an expression of clonal fitness in terms of stem cell properties such as rates of proliferation, differentiation, and niche attachment. Combining this HSC model with a previous model of myeloproliferative neoplasms (MPNs) developed by Andersen et al. [49] suggests that differences in HSC fitness can explain both the disease-progression of MPNs as well as the efficacy of certain treatment protocols [50].

As approximations of reality models naturally have limitations. Besides limitations originating from the mathematical assumptions of the used modelling frameworks such as the high computational costs in the case of individual-based models or the neglect of random events in ordinary differential equation, there exist limitations coming from gaps in biological knowledge or from simplifying assumptions made during model design. One common source of limitations is the sparsity of in vivo data, especially in the human case. This sparsity originates from the inaccessibility of relevant quantities, such as kinetic parameters of stem cells, or the small number of follow-up time points. The sparsity of the data can lead to the unidentifiability of model parameters and, therefore, make it impossible to quantify important processes. Due to the sparsity of the data, it can also be challenging to distinguish between competing models. Since most human cancers require an instantaneous start of the therapy, longitudinal data of untreated cancers is practically unavailable. This makes it challenging to validate or to parameterize models of cancer dynamics. In most cases, invasive sampling techniques such as biopsies are required to extract the relevant information about a cancer. Since usually healthy individuals are not subjected to these invasive procedures, data on cancer evolution before clinical disease manifestation is rarely available. Therefore, a large part of our knowledge on tumor evolution is derived from experimental data or has been obtained by applying modelling or heuristic techniques to samples obtained at the time of diagnosis. Furthermore, important parts of our pathophysiological and pharmacological knowledge have been gained using animal or in vitro models, which do not necessarily reflect the human system [51, 52]. Although the latter may improve in the future due to advanced organoid or microfluidic systems, it has to be taken into account that human diseases are more complex than specific experimental systems [51, 52]. These limitations make it vital to validate models using real-world data and to be aware of the simplifying assumptions that have been applied during model development. In the context of personalized medicine, the applicability of models in clinical routine is further limited by the large inter-individual heterogeneity of the disease evolution, which is impacted by potentially unknown disease-specific factors (e.g., key mutations), patient-specific factors and environmental factors. Partially, this heterogeneity can be incorporated by the choice of individual model parameters for each patient [8, 9••, 11]. This, however, requires that the individualized parameters can be estimated based on patient data or derived from statistical or other approaches. Integration of mechanistic and artificial intelligence models is a promising direction to improve personalized medicine [53].


Through mathematical modelling, a range of significant insights into the otherwise hidden in vivo dynamics of HSCs has been obtained.

While cell division, apoptosis, and differentiation of individual cells are assumed to occur in response to a complicated system of regulatory processes, the average rates at which such events happen can often be estimated by relating mathematical models to experimental data. As an example, mathematical models can be used to understand how changes in such rates affect the production of blood cells. Such insight is particularly useful when investigating hematologic malignancies, where a disease may be caused by mutations which alter the rate at which a given biological process occurs. Mathematical modelling helps relate mutation-induced changes of stem cell properties to changes on the clinically accessible scale which are used to define disease entities. Furthermore, mathematical terminology and formalisms can substantially contribute to make more vaguely defined concepts rigorous whereby ambiguity may be avoided.

Naturally, there exist questions which are inaccessible to mathematical models and require in vivo experiments. Nevertheless, the methods of quantitative modelling provide a very useful starting point for investigating relevant hypotheses involving HSCs. Model selection techniques allow for falsifying of hypotheses by comparing model dynamics to available data. If a model of HSC dynamics does not agree with clinical or experimental data, at least one of the hypotheses underlying the model has to be incorrect.

The works reviewed here show different approaches to model HSC dynamics mathematically. Although some aspects of different models might be inconsistent with each other, the variety of proposed models that agree with experimental data illustrates the enigmatic nature of HSC behavior. Furthermore, it underlines an important aspect of mathematical modelling: Although a particular mathematical model might not describe all components of a biological system in detail, the model dynamics can still be sufficiently accurate to provide valuable insight and forecast how the system would behave under certain circumstances. Hence, the current state of mathematical modelling of HSC dynamics is a useful part of bio-medical research. Further work on current models and the development of new models will be necessary to understand and interpret novel experimental findings.