1 Introduction

Cancer immunotherapy with immune checkpoint inhibitors has revolutionized the treatment of patients with solid tumors in the last ten years. In addition to chemotherapy and molecularly-targeted therapy, immunotherapy provides a new set of tools for the oncology toolkit (Wheeler et al. 2004). In several tumor types such as melanoma, non-small cell lung cancer (NSCLC), and genito-urinary tumors, immunotherapy has markedly improved the average life expectancy of patients with advanced disease. Both laboratory and clinical experiments have verified the importance of the immune system in fighting cancer (de Pillis et al. 2006; Farrar et al. 1999; O’Byrne et al. 2000). Patients who suffer from acquired immunodeficiency syndrome (AIDS) are very susceptible to having some rare forms of cancer (de Pillis et al. 2006; Dalgleish and O’Byrne 2002). This also shows the significant role the immune system plays against cancer.

One of the fundamental problems in treating patients with cancer immunotherapy is the lack of predictive biomarkers. Ideally, before the treatment begins, patients could be selected for immunotherapy, but existing biomarkers fail to deliver a high predictive value in most tumor types (Chatterjee and Zetter 2005). In addition, most patients who initially respond to immunotherapy experience a relapse: the tumor later on develops immune escape mechanisms due to evolutionary pressure. Forecasting the time of relapse or treatment resistance is of high practical relevance (Anderson and Quaranta 2008; Rockne and Scott 2019). However, predictions of such changes in the tumor behavior are currently not possible in clinical routine. The main problem is that most biomarkers such as tumor mutational burden (TMB) are static, i.e., they are measured at a given time point but are not dynamically updated as the tumor evolves.

In other complex systems such as financial markets (Ledoit et al. 2001), climate systems (Manabe 1983) or complex industrial processes (Thompson and Kramer 1994), differential equation models can provide a prediction of the behavior of the system over time. By analogy, in oncology, a number of mathematical models to predict tumor growth over time have been developed in the last decades (Norton et al. 1976). Most notably, multiple of these models explicitly include the interactions of tumors with the immune system and are therefore in principle suited to model response and resistance to cancer immunotherapy (Kogan et al. 2012). de Boer and Hogeweg (1986) modeled the cellular immune reaction to tumors. They demonstrated that small doses of antigens lead to tumor dormancy (de Pillis et al. 2006). Kirschner and Panetta (1998) linked the oscillations in the tumor size and the long-term tumor regression to the dynamics among immune cells, tumor cells, and Interleukin-2 (de Pillis et al. 2006). The most prominent of these models was presented by Kuznetsov et al. (1994). Kuznetsov’s model has served as a blueprint for many other related models (de Pillis et al. 2006; Rhodes and Hillen 2019; Makhlouf et al. 2020) and has been investigated in several theoretical studies (Bellomo and Preziosi 2000; Kolev 2003; de Pillis et al. 2006; Owens and Bozic 2021).

However, none of these established oncological models are currently being used in the clinic. What is more, very few of these models have been systematically fitted to actual clinical data. While some studies have fitted models to murine tumors on a small scale (de Pillis et al. 2006; Benzekry et al. 2014; Vaghi et al. 2020), the pronounced differences between mice and humans preclude the transfer of such insights to real-world cancer patients (Ruggeri et al. 2014).

The structures of these mathematical models are well defined (Kuznetsov et al. 1994; Kolch et al. 2015; Tyson et al. 2011; Fröhlich et al. 2018). However, in the complex biological environment of cells, little is known about the associated parameters and kinetic constants. The parameter values are essential for quantitative modeling and prediction of cancer progression. In mechanistic models, one can integrate the data from various experimental procedures and sources, and design in silico experiments to generate hypotheses for underlying mechanisms (Clegg and Gabhann 2015; Baker et al. 2018). By fitting the model to the experimental data, we reverse-engineer the parameters of the system. Parameter estimation of mathematical cancer models remains a major bottleneck on the way to model-based and data-driven medical treatment of the future.

In this study, we use a mathematical model based on Kuznetsov’s model to characterize the interactions between the growing tumor and the immune system, and aim to fill this conceptual gap in the literature. We use a large dataset of thousands of cancer patients who underwent cancer immunotherapy as part of clinical trials. We then investigate how well the model can represent the actual tumor volume changes over time in these patients. After estimating the parameters of the model, we conduct an identifiability analysis to examine the uniqueness of the estimated parameters (i.e., whether we have over-fitting). Finally, we investigate if the model can be used to forecast treatment response or relapse under immunotherapy.

The remainder of the article is structured as follows. First, we briefly present the acquisition of patients data and its pre-processing in Sect. 2. In Sect. 3, we then provide the model and all the methods used to estimate model parameters and fit the data, conduct parameter identifiability analysis and extrapolate the model for tumor growth prediction. Finally, before drawing conclusions in Sect. 5, we present and discuss the obtained results in Sect. 4.

2 Data Acquisition

We briefly provide here the declaration and sources of the experimental data. For more details, please refer to Ghaffari Laleh et al. (2022).

2.1 Declarations and Data Sharing

We followed the Declaration of Helsinki and International Ethical Guidelines for Biomedical Research Involving Human Subjects developed by the Council for International Organizations of Medical Sciences (CIOMS). In this study, we used a publicly available set of anonymized patient data shared by Ghaffari Laleh et al. (2022), which is originally derived from five large clinical trials, as we describe below. Patients gave their informed consent for data analyses as part of the original clinical trials. No specific ethical approval was sought or required for this retrospective analysis of a publicly available dataset.

2.2 Data Measurement and Pre-processing

The original data for this study is from five clinical trials which were designed to evaluate the efficiency of Atezolizumab (an immune checkpoint inhibitor). Table 1 shows the original number of patients and their treatment arms. Four out of these five trials evaluated the effect of Atezolizumab on NSCLC and cohort GO29293 reported this efficiency on bladder cancer. In two of the cohorts (GO28753, GO28915), patients’ responses to Atezolizumab treatment were compared to the outcome of the second treatment arm who received Docetaxel (a chemotherapy drug) as treatment. In all the trials, the longest diameter and shortest diameter (LD and SD) of the target and non-target lesions (measured manually based on the CT scans) alongside the time intervals are reported. Several patients have only one or two data points because of tumor progression and potentially the death of patients. In this study, we use the anonymized and publicly available subset of data from Ghaffari Laleh et al. (2022), which was created by selecting the patients with three or more measurement points. For each of these patients, only the LD measurement for one target lesion has been selected. For this reason, the total number of original patients of clinical trials has been decreased from 2693 to 1472. Considering patients that have very few data points will lead to an over-parameterized problem. The version of the Kuznetsov model we used, which we discuss in Sect. 3.1, has six parameters. The model in this case would easily fit any combination of data points with many (essentially arbitrary) values for the parameters. On the other hand, disregarding patients with few data points (patients with tumor progression) introduces potential bias to the estimation problem and might limit the model application in a clinical setting. All the pre-processing details for the data generation have been described in more details in Ghaffari Laleh et al. (2022). Moreover, before using the data we pre-process it by first removing repetitive and null inputs. Following Ghaffari Laleh et al. (2022), Faustino-Rocha et al. (2013) and Shevtsov et al. (2019), we converted the measured LD in mm to the number of tumor cells (TC) by considering \(8\times 10^{-6} \hbox {mm}^{3}/{\textrm{TC}}\), where we consider spherical shapes of the lesions and that the TC have 3/4 of the lesion volumes. Moreover, we consider only patients with more than six net measurements because we are conducting model extrapolation in Sect. 3.4 in which the last two or three data points are omitted when estimating the model parameters. Thus, we preserve a minimum number of data points for parameter estimation also in this case (at least three). Figure 1 shows the number of patients per study and arm before and after data pre-processing. Finally, the total number of patients has been reduced from 1472 to 210.

Table 1 Description of the original data
Fig. 1
figure 1

Number of patients (Pt.) considered per studies and arms, before and after data pre-processing (Color figure online)

3 Methods

In this section, we introduce the mathematical model of Kuznetsov et al. (1994) with slight modifications. Afterward, we show the optimization formulations for: the estimation of the model parameters by fitting the clinical data and for the parameter identifiability analysis. We finally investigate the extrapolation capabilities of the model.

3.1 Mathematical Tumor Model

To predict the frequently observed phenomena in clinics like tumor dormancy and tumor size oscillation, the tumor mathematical model has to include terms related to the response of the immune system. The inclusion of the entire immune system in the mathematical model can be very difficult (Perelson and Weisbuch 1997). The anti-tumor immune response has highly nonlinear dynamics which are complicated and not well understood. Therefore, models that describe the immune system response to tumor presence should necessarily focus on those elements of the immune system that have the highest effects on tumor dynamics (de Pillis et al. 2006). Kuznetsov’s model (Kuznetsov et al. 1994) describes the response of the cytotoxic T lymphocyte (CTL) to the growth of an immunogenic tumor. Usually, a cell-mediated immune response to a tumor takes place. The cytotoxic T lymphocytes and natural killer (NK) cells play the main role. The model considers immunogenic TC that are attacked by cytotoxic effector cells (EC). The EC can be, for example, CTL or NK cells. The model takes into account the possibility of EC inactivation as well as the infiltration of the TC by EC. TC and EC interaction is described through the following reactions:

$$\begin{aligned} \hbox {EC} + \hbox {TC}&\mathop {\rightleftharpoons }\limits _{k_{-1}}^{k_{1}} \hbox {EC}-\hbox {TC},\\ \hbox {EC}-\hbox {TC}&\xrightarrow {k_{2}} \hbox {EC} + \hbox {TC}^{*},\\ \hbox {EC}-\hbox {TC}&\xrightarrow {k_{3}} \hbox {EC}^{*} +\hbox {TC}, \end{aligned}$$

where EC–TC denotes conjugates of effector and tumor cells and \({\hbox {EC}^{*}}\) and \({\hbox {TC}^{*}}\) are the inactivated effector and lethally-hit tumor cells, respectively. We define E, T, C, \(E^{*}\), \(T^{*}\) as the number of EC, TC, EC–TC conjugates, \({\hbox {EC}^{*}}\), and \({\hbox {TC}^{*}}\), respectively. The non-negative kinetic parameters, \(k_1\), \(k_{-1}\), \(k_2\) and \(k_3\), describe the rates of the interactions. EC–TC conjugates can reversibly decompose without damaging the cells with the kinetic rate \(k_{-1}\). However, they can also irreversibly result in \({\hbox {EC}^{*}}\) or \({\hbox {TC}^{*}}\) with respective kinetic rates \(k_2\) and \(k_3\). The following system of nonlinear differential-algebraic equations describes those interactions, which is a slightly simplified version of the model of Kuznetsov (Kuznetsov et al. 1994).

$$\begin{aligned} \frac{dE}{dt}\Bigr |_t&= s + F(C(t),T(t)) - hE(t) - k_1E(t)T(t) + \left( k_{-1}+k_{2}\right) C(t), \end{aligned}$$
(1a)
$$\begin{aligned} \frac{dT}{dt}\Bigr |_t&= aT(t) - k_1E(t)T(t) + \left( k_{-1}+k_{3} \right) C(t), \end{aligned}$$
(1b)
$$\begin{aligned} \frac{dC}{dt}\Bigr |_t&= k_1E(t)T(t) - \left( k_{-1}+k_{2}+k_{3} \right) C(t), \end{aligned}$$
(1c)
$$\begin{aligned} F(C(t),T(t))&= \frac{fC(t)}{g+T(t)} . \end{aligned}$$
(1d)

The rate of flow of mature EC to TC localization area is characterized by the generation term s. This rate is unaffected by the presence of TC. The destruction or migration of EC from the localization region of TC is represented by the elimination rate h. The model does not take into account any TC or EC–TC conjugates migration. Both multiplication and death of TC are included in parameter a that characterizes the maximum growth rate of TC population. The function F(CT) represents the accumulation rate of the cytotoxic EC in the TC localization area due to tumor existence (stimulated accumulation), where f and g are positive constants. The EC accumulation, F(CT), is due to signals, like released cytokines, generated by the EC in EC–TC conjugates. Thus this stimulated accumulation has some maximum value when T becomes large.

The equations describing the rate of change of \(E^{*}\) and \(T^{*}\) are not included in the system because they are irreversibly formed and thus have no effect on the other variables, and our target is to model T and E only. In Kuznetsov et al. (1994), the model includes a sink term in the rate of change equation of T that represents TC growth limitation due to biological environment conditions. It considers, for example, resources competition like oxygen and substrates. We do not consider this term because growth limitations of even high initial T are associated with high rates of cytotoxic EC accumulation as well as the absence of their activity suppression by TC (de Boer and Boerlijst 1994).

Following the suggestion of Kuznetsov et al. (1994), we consider a quasi-steady-state assumption for (1c), i.e., \(\nicefrac {\hbox {d}C}{\hbox {d}t}\bigr |_t \approx 0\), because C is formed and dissociated at much faster rates compared to the multiplication and influx of the EC, as well as the lysis of the lethally-hit TC. Thus, \(C \approx KET\), where \(K={k_l}/{\left( k_2+k_3-k_{-1}\right) }\). As a result, (1a) and (1b) become:

$$\begin{aligned} \frac{dE}{dt}\Bigr |_t&= s + \frac{fKE(t)T(t)}{g+T(t)} -hE(t) - Kk_3E(t)T(t), \end{aligned}$$
(2a)
$$\begin{aligned} \frac{dT}{dt}\Bigr |_t&= aT(t) - Kk_2E(t)T(t). \end{aligned}$$
(2b)

For further analysis and use of the model in parameter estimation and identifiability analysis, we use the same strategy of Kuznetsov et al. (1994) for non-dimensionalizing model equations. We non-dimensionalize (2a) and (2b) by considering concentration scales \(E_0 = 10^7\) cells and \(T_0 = 10^9\) cells for EC and TC, respectively (Kuznetsov et al. 1994). We non-dimensionalize t by relating it to the deactivation rate of TC and introducing \(\tau = {k_2KT_0t}/{100}\). The final model formulation is:

$$\begin{aligned} \frac{dx}{d\tau }\Bigr |_{\tau }&= \sigma + \frac{\rho x(\tau )y(\tau )}{\eta +y(\tau )} -\delta x(\tau ) - \mu x(\tau )y(\tau ), \quad x\bigr |_{\tau =\tau _{1}}=x_1, \end{aligned}$$
(3a)
$$\begin{aligned} \frac{dy}{d\tau }\Bigr |_{\tau }&= \alpha y(\tau ) - \frac{E_0}{T_0}x(\tau )y(\tau ), \quad y\bigr |_{\tau =\tau _{1}}=y_1, \end{aligned}$$
(3b)

where

$$\begin{aligned} x(\tau )&= \frac{E(t)}{E_0} , \quad y(\tau ) =\frac{T(t)}{T_0} , \quad \sigma = \frac{s}{k_2KE_0T_0} , \quad \rho =\frac{f}{k_2T_0} ,\\ \eta&=\frac{g}{T_0} ,\quad \mu = \frac{k_3}{k_2} ,\quad \delta =\frac{h}{k_2KT_0} , \quad {\text { and }} \quad \alpha =\frac{a}{k_2KT_0} . \end{aligned}$$

Thus, the final model is composed of two ordinary differential equations (ODEs) with two variables, x and y, with their respective initial values, \(x_1\) and \(y_1\), at the initial normalized time, \(\tau _1\), and six unknown parameters, \(\sigma \), \(\rho \), \(\eta \), \(\mu \), \(\delta \), and \(\alpha \).

3.2 Data Fitting and Parameter Estimation

To determine the parameter values for the nonlinear system (3a) and (3b) that best describe the experimental data, we conduct a regression analysis in the nonlinear least-squares sense by minimizing the sum of the squared residuals. The considered residuals are the differences between the measured values of tumor lesion longest diameters (converted to tumor number of cells as previously discussed) and the ones calculated from the model. In the present contribution, we do not seek global values of parameters, i.e., the same values for all patients. Instead, we solve the optimization problem for each patient individually to identify the parameter values that best describe the data of that patient. This constitutes a first step to check whether the model can describe the experimental data at all and analyze the ranges of parameter values. We also estimate the initial normalized value of EC number, \(x_1\), because it is unknown. In contrast, the initial normalized value of TC number, \(y_1\), is provided experimentally and does not need to be estimated. The nonlinear least square problem for each patient \(j \in \mathbb {J}=\{1,2,\ldots ,J\}\) is expressed as follows:

figure a

where J is the total number of patients considered. After data pre-processing, \(J = 210\) patients. The index \(i \in \mathbb {I}_j\) is for the observed experimental values, where \(\mathbb {I}_j = \{1,2,\ldots ,N_j\}\) with \(N_j\) being the total number of observed values for patient j. The non-dimensionalized model-predicted and observed values of TC number at the normalized time \(\tau _{i_j}\) are \(y_{i_j}(\tau _{i_j},\textbf{p}_j,x_{1_j})\) and \(\tilde{y}_{i_j}\), respectively. The initial value of the non-dimensionalized EC number at \(\tau _{1_j}\), \(x_{1_j}\), has lower and upper bounds \(x_{1_j}^\text {l}\) and \(x_{1_j}^\text {u}\), respectively. The vector \(\textbf{p}_j\) contains the non-dimensionalized parameter values of the model equations, (3a) and (3b), for patient j, \(\textbf{p}_j = \left[ \sigma _j,\,\mu _j,\,\delta _j,\,\alpha _j,\,\rho _j,\,\eta _j \right] \). The lower and upper bounds of the components of \(\textbf{p}_j\) are the components of vectors \(\textbf{p}_{j}^\text {l}\) and \(\textbf{p}_{j}^\text {u}\), respectively. The decision variables of the optimization problem are thus the components of \(\textbf{p}_j\) and \(x_{1_j}\). The dynamic optimization problem (OP1) is nonconvex and nonlinear, and can thus have multiple (suboptimal) local solutions. Hence, global optimization techniques are required to guarantee the global optimal solution, \(\textbf{p}^{\text {opt}}_{j}\) and \(x^{\text {opt}}_{1_{j}}\).

3.3 Parameter Identifiability Analysis

Parameter identifiability analysis determines if model parameters can be uniquely estimated (Walter and Pronzato 1997). Different definitions of identifiability analysis are available in the literature. Miao et al. (2011) reviewed several methods of parameter identifiability analysis for nonlinear ODE models and distinguished between different methodologies including structural and practical identifiability analyses. In the former analysis, one determines if a given structure of a model allows the realization of unique parameters when certain measured variables are provided (Walter and Pronzato 1997). However, it only provides necessary conditions for identifiability because it does not take into consideration parameters precision (Jung et al. 2019; Raue et al. 2011). In contrast, practical identifiability aims to predict confidence intervals for the estimated parameters (Jung et al. 2019; Gábor et al. 2017). It can be conducted locally (in the neighborhood of the estimated parameter values) or globally over the entire range of values. We here carry out the latter analysis and evaluate it globally to improve the confidence in the parameter values that are determined by solving (OP1).

We conduct the global practical identifiability analysis by determining the smallest box that contains the so-called feasible parameter set \(P_{\text {e}_j}\) for each patient j, as suggested in Jung et al. (2019). This set includes all values of parameters in \(\textbf{p}_j\) for which the differences between model predictions \(y_{i_j}(\tau _{i_j},\textbf{p}_j,x^{\text {opt}}_{1_{j}})\) and optimal model predictions \(y^{\text {opt}}_{i_j}\) (determined by solving (OP1)) fall within certain defined bounds \(\forall i \in \mathbb {I}_j\), that is

$$\begin{aligned} P_{\text {e}_j} = \{ \textbf{p}_j \in P_j \mid -\varepsilon y^{\text {opt}}_{i_j} \le y_{i_j}(\tau _{i_j},\textbf{p}_j,x^{\text {opt}}_{1_{j}})-y^{\text {opt}}_{i_j} \le \varepsilon y^{\text {opt}}_{i_j}\}, \quad \forall j \in \mathbb {J}, \end{aligned}$$
(4)

where \(y_{i_j}^{\text {opt}}=y_{i_j}(\tau _{i_j},\textbf{p}^{\text {opt}}_{j},x^{\text {opt}}_{1_{j}})\), \(\varepsilon \) is the percentage of deviation, and \(P_j\) is the set of parameter values in \(\textbf{p}_j\) bounded by the components of \(\textbf{p}_{j}^\text {l}\) and \(\textbf{p}_{j}^\text {u}\) defined in (OP1). The set \(P_{\text {e}_j}\) is depicted in dark gray as shown in Fig. 2 for the case of a two-dimensional vector \(\textbf{p}_j\). We approximate the nonconvex set \(P_{\text {e}_j}\) by a rectangular box (light gray color), whose edges are formed by the extreme values of the elements of \(\textbf{p}_j\) (\(p_{j_1}^{\text {min}}\), \(p_{j_1}^{\text {max}}\), \(p_{j_2}^{\text {min}}\), \(p_{j_2}^{\text {max}}\)). We determine these extreme values by solving a series of constrained dynamic optimization problems. For a vector \(\textbf{p}_j\) that consists of K parameters, the optimization problem for parameter number k is formulated \(\forall j \in \mathbb {J}\), and \(\forall k \in \mathbb {K}\) as follows (Jung et al. 2019; Paulen et al. 2016):

figure b

with \(\mathbb {K} = \{1,2,\ldots ,K\}\) and \(K=6\). We set \(\varepsilon \) to 20%, which means that the differences between model predictions for \(\textbf{p}_j=\textbf{p}^{\text {max}}_{j}\) and \(\textbf{p}_j=\textbf{p}^{\text {min}}_{j}\), and optimal model predictions for \(\textbf{p}_j=\textbf{p}^{\text {opt}}_{j}\) fall within 20% of the optimal prediction values \(\forall i \in \mathbb {I}_j\). For each parameter \(p_{j_k}\) in \(\textbf{p}_j\), its minimum value \(p_{k_j}^{\text {min}}\) in \(\textbf{p}^{\text {min}}_j\) and its maximum value \(p_{j_k}^{\text {max}}\) in \(\textbf{p}^{\text {max}}_j\) are determined by solving (OP2). Therefore, (OP2) is solved K times for each patient \(j \in \mathbb {J}\). As a result, the approximation of \(P_{\text {e}_j}\) is determined, and its edges are the elements of \(\textbf{p}^{\text {min}}_j\) and \(\textbf{p}^{\text {max}}_j\). The set \(P_{\text {e}_j}\) allows for the determination of confidence regions of the estimated parameter values in \(\textbf{p}^{\text {opt}}_{j}\). When \(P_{\text {e}_j}\) covers a large space in the direction of \(p_{j_k}\), the estimated parameter \(p^{\text {opt}}_{j_k}\) is not identifiable, in the sense that there is a large range of values for \(p_{j_k}\) that could lead to a good fit. When the set covers a small space, \(p^{\text {opt}}_{j_k}\) is identifiable because the parameter is determined to a sufficient accuracy. To decide on identifiability a certain threshold is thus needed. We here do not define a cutoff, we rather analyze the identifiability qualitatively.

Fig. 2
figure 2

Illustration of parameter identifiability via global confidence intervals (based on Jung et al. 2019). The dark gray region indicates the feasible set \(P_{\text {e}_j}\) that includes all values of parameters in the two-dimensional vector \(\textbf{p}_j\) for which the differences between model predictions \(y_{i_j}(\tau _{i_j},\textbf{p}_j,x^{\text {opt}}_{1_{j}})\) and optimal model predictions \(y_{i_j}^{\text {opt}}\) fall within certain defined bounds. The light gray region shows the rectangular box that conservatively approximates this feasible set, where the box edges are the extreme values of the elements of \(\textbf{p}_j\) (\(p_{j_1}^{\text {min}}\), \(p_{j_1}^{\text {max}}\), \(p_{j_2}^{\text {min}}\), \(p_{j_2}^{\text {max}}\))

If estimates of the errors in tumor length measurements were available, then depending on the confidence intervals of the measurements, the threshold value of 20% in the practical identifiability analysis could be changed accordingly. For instance, for wide error margins in the measurements, the threshold value should be increased to account for those wide margins. However, we do not have estimates of measurement errors. The longest and shortest diameters were measured manually by radiologists in the original clinical trial. Such manual measurement is currently the state of the art, although our previous experience indicates that there can be a 1–2 mm margin of error in these measurements (McNitt-Gray et al. 2015).

3.4 Tumor Growth Prediction

Beyond being able to reproduce experimental data a posteriori, a more clinically relevant application of mathematical tumor modeling would be if the model was able to predict tumor growth. This could lead to model-based tumor treatment, as the supplied doses to patients could be adjusted in advance according to model predictions. In Sect. 3.2, we fitted the model to all experimental data points when estimating the parameters. In order to compare model extrapolation capabilities and future predictions to the clinical data, we now do not include the last \(\zeta \) points of the data when fitting the model and solving (OP1). For each patient j, we solve an optimization problem similar to (OP1), but using only the data points in the set \(\mathbb {I}_j^{\text {ext}}\) instead of \(\mathbb {I}_j\), where \(\mathbb {I}_j^{\text {ext}}=\{1,2,\ldots ,N_j-\zeta \}\). The optimal values of the decision variables obtained from this problem are called \(\textbf{p}^{\text {ext}}_{j}\) and \(x^{\text {ext}}_{1_{j}}\). We then integrate the model equations for \(\textbf{p}_{j}=\textbf{p}^{\text {ext}}_{j}\) and \(x_{1_{j}}=x^{\text {ext}}_{1_{j}}\) from \(\tau _{1_j}\) to \(\tau _{\left( N_j\right) _j}\), thus extrapolating beyond the data used for fitting. The results of the integrated \(y^{\text {ext}}_{i_j} = y_{i_j}(\tau _{i_j},\textbf{p}^{\text {ext}}_{j},x^{\text {ext}}_{1_{j}})\) between \(\tau _{\left( N_j-\zeta \right) _j}\) and \(\tau _{\left( N_j\right) _j}\) are the extrapolated part of the model that can be compared with the remaining \(\zeta \) data points to gauge the extrapolation capabilities. We consider two model extrapolation cases in which \(\zeta \) is equal to two and three.

Moreover, we formulate another optimization problem to investigate how far model extrapolation could deviate from the actual values. We aim to find two “extreme-case” lines that are designed to be as far away from each other at the final time point (\(\tau _{\left( N_j\right) _j}\)) while both being within some \(\theta \) tolerance of the found optimal fit (\(y^{\text {ext}}_{i_j}\)) for the fitted time before extrapolation starts. For this, we solve the following optimization problem \(\forall j \in \mathbb {J}\):

figure c

where \(\textbf{p}^{\text {upp}}_j\) and \(\textbf{p}^{\text {low}}_j\) are the vectors that contain the parameter values for the upper and lower “extreme-case” model extrapolation deviations, respectively. The initial values of the non-dimensionalized EC number at \(\tau _{1_j}\) for these upper and lower deviation cases are \(x_{1_j}^\text {upp}\) and \(x_{1_j}^\text {low}\), respectively. We set \(\theta \) to 10%. By integrating (3a) and (3b) from \(\tau _{1_j}\) to \(\tau _{\left( N_j\right) _j}\), we get \(y_{i_j}^{\text {upp}}=y_{i_j}(\tau _{i_j},\textbf{p}^{\text {upp}}_{j},x^{\text {upp}}_{1_{j}})\) and \(y_{i_j}^{\text {low}}=y_{i_j}(\tau _{i_j},\textbf{p}^{\text {low}}_{j},x^{\text {low}}_{1_{j}})\), which allow the comparison of these two “extreme-case” extrapolation deviations with the remaining \(\zeta \) data points after \(\tau _{\left( N_j-\zeta \right) _j}\) (start of extrapolation).

If estimates of the measurement errors were available, then the deviations of the “extreme case” predictions from the experimental measurements could be assessed while taking into consideration those error estimates. One could then conclude if the “extreme case” predictions still lie within the measurement uncertainty or not. However, as aforementioned, we do not have estimates of the measurement errors.

3.5 Implementation

We implement the model, (3a) and (3b), in MATLAB R2019b (MATLAB 2019). All optimization problems are solved in the MATLAB version of the global optimization toolbox MEIGO using the enhanced scatter search metaheuristic (eSS) method (Egea et al. 2014). The eSS is stochastic and employs some elements of the scatter search and path re-linking methodologies (Egea et al. 2010). We set the maximum number of function evaluations, the maximum CPU time and the maximum absolute violation of the constraints to \(10^5\), 100 s and \(10^{-5}\), respectively. We use a 50% probability of biasing the search toward bounds and the dynamic hill climbing (DHC) (Yuret and de La Maza 1993) as a local search method. For all aforementioned optimization problems, we set \(x_{1_j}^\text {l}\) and \(x_{1_j}^\text {u}\) to \(10^{-2}\) and \(10^{2}\), respectively. All elements of \(\textbf{p}_j^\text {l}\) and \(\textbf{p}_j^\text {u}\) are set to \(10^{-2}\) and \(10^{2}\), respectively. Because of the non-dimensionalization described in Sect. 3.1, we expect that the parameter values are close to one. Therefore, we arbitrarily chose those bounds. Wider bounds might result in lower optimal objective values and hence better fits, but the computational cost would increase too. We also tried for several patients to use wider bounds (e.g., \(10^{-3}\) and \(10^{3}\)). However, it did not yield significant improvements for the majority of the cases.

The ODE (3a) and (3b) are solved using the variable-step, variable-order (VSVO) solver based on the numerical differentiation formulas (NDFs) of orders one to five (ode15s) (Shampine and Reichelt 1997). We set the relative and absolute error tolerances to \(10^{-3}\) and \(10^{-6}\), respectively. The solution refinement factor is one, and the maximum step size is \(0.1(\tau _{\left( N_j\right) _j}-\tau _{1_j})\).

Although MEIGO is a global solver, since it is stochastic, the solution depends on the initial guesses. Thus, the global optimum is not guaranteed. For all patients, we performed multiple optimization runs from different initial guesses and chose the decision variables that resulted in the lowest objective functions. In general, the improvements were very slight. However, for some patients whose profiles are difficult to predict (e.g., wavy profiles), the multi-start optimization did improve the optimal solution that was found.

4 Results and Discussion

We now show the results of estimation, identifiability, and predictions. We show the results of six selected patients in detail and give performance measures for all 210 patients. We selected those six patients in a way to provide the different profiles of TC dynamics. Data fitting and growth prediction results of all 210 patients are provided in the supplementary material.

4.1 Data Fitting and Parameter Estimation

We fitted the parameters of a modified Kuznetsov model on a dataset of solid tumors in human patients under immunotherapy treatment. The model predicted the different tumor growth profiles represented by a selection of representative patients, as well as in the total (unselected) cohort. As shown in Fig. 3 and Table 2, model prediction and experimental data profiles are qualitatively and quantitatively very close for these patients. The mean absolute error (MAE), the root-mean-square error (RMSE), and the coefficient of determination (\(R^2\)) for the selected six patients are given in Table 2. We found that the model gave very high goodness of fit as measured by \(R^2\). Across 210 patients in all studies, an average \(R^2\) of 0.784 was achieved with median and range values of 0.896 and 1.594, respectively. According to the statistical Lilliefors normality test, this \({R^2}\) distribution does not follow a Normal distribution. Moreover, Table 4 provides the number of patients and the \(R^2\) values per study and per arm. Study 1 and Study 5 have higher \(R^2\) than the remaining studies. Arm 1 has the highest \(R^2\) in all studies except for Study 2. Although direct comparison of this performance with the previous work in Ghaffari Laleh et al. (2022) is not possible, comparing the MAE of the selected six patients with the reported average MAE in the previous study indicates good fitting performance of the developed Kuznetsov model. Furthermore, by analyzing the goodness of fit in individual patients, we found that the modified Kuznetsov model was able to fit clinically interesting patterns. In particular, the modified Kuznetsov model was able to predict relapse after initial tumor response (patient \(\#207\) in Fig. 3) and other types of fluctuating behavior, solving a key limitation of previously used simpler models as in Ghaffari Laleh et al. (2022).

Table 2 Quantification of the goodness of fit of the model as shown in Fig. 3

We fitted the parameters of the modified Kuznetsov model to a large clinical dataset obtained from five clinical trials. In general, the distributions of the resulting parameter values were similar between the studies (Fig. 4). These ranges can be useful for further studies because they enable other researchers to determine plausible ranges and optimal boundaries when fitting the same model to other datasets, thereby simplifying the optimization procedure. A global parameter estimation, however, was not performed in this study and could be attempted in future studies.

Figure 4 shows the estimated values (black dashes) of model parameters of all 210 patients. All parameter values vary per patient. The values are distributed all over the bounds, except for \(\alpha \), which has a maximum of 6.331. The parameter \(\alpha \) is the normalized parameter for a, which represents the maximum growth rate of TC population. In addition, most of the values of \(\mu \) and \(\rho \) are close to the upper bound. Although parameter values are quite distributed between the bounds, the aforementioned findings can help in narrowing the expected ranges of values of parameters when global parameter estimation is targeted. Moreover, the distribution of parameter values is compared among the considered five studies. As we can see in Fig. 4, the distribution densities between the bounds of parameter values are the same for all studies.

We also provide box plots for the results of each parameter in Fig. 4. The medians of the parameter values in all studies and arms have close values. In addition, we performed statistical Lilliefors normality tests, for the parameter value distributions in each study and arm. The parameter \({\sigma }\) follows Normal distributions in Study 1, Study 2, Study 5 and all the Arms of Study 3 except Arm 6. The parameter \({\rho }\) follows Normal distributions in Study 1, Study 2, Study 3, and Arm 1 of Study 4.

Fig. 3
figure 3

Data fitting results of TC number of the selected patients. The solid (black) line shows model results, where all data points are used when estimating the parameters. The points represent the measured data. Ordinates: normalized number of tumor cells. Abscissas: normalized treatment time, where negative values indicate time before the start of treatment. The model can fit experimental data with different qualitative trends (e.g., up, down and “U”-curve)

Fig. 4
figure 4

Estimated values (black dashes) of the parameters of (3a) and (3b) model for all 210 patients (after data pre-processing) individually. The bottom and top of the boxes of the box plots are the 25th and 75th percentiles of the data, respectively. The distance between the bottom and top of each box is the interquartile range. The red line in the middle of each box is the data median. The outliers (red plus sign) are the values that are more than 1.5 times the interquartile range away from the bottom or top of the box. Ordinates: parameter values. Abscissas: non-dimensionalized model parameters. The values are scattered all over the bounds’ ranges, but \(\alpha \) values, which have a maximum of 6.331. Moreover, the distribution densities of parameter values are very close to each other among the studies (Color figure online)

4.2 Parameter Identifiability Analysis

Figure 5 provides the identifiability analysis results of the estimated model parameter values of the selected patients. Depending on the patient and the parameter, the estimated parameter values can be unique or take other values. For patient \(\#22\), the approximated feasible parameter set covers a small space in all parameters directions. Their estimated parameter values are close to being unique and thus identifiable. On the other hand, the approximated feasible parameter set for patient \(\#203\) covers a large space in all parameter directions. Therefore, the found values are not unique and other combinations of values could lead to good fits and predictions. For the remaining selected patients, the feasible parameter sets can be small or large depending on the patient and the parameter.

Figure 6 shows the averages of the indicator functions for each parameter range (found by the identifiability analysis) for all patients. These averages provide the most frequent ranges of parameter values within their bounds. They are equivalent to histograms where the output is between zero and one (indicator function output). The ranges are quite distributed all over the parameter bounds for all parameters. The only exception is \(\alpha \), for which there is a maximum bound after which no values could be found. The most frequent ranges are close to the median values in Fig. 4, except for \(\mu \), for which it is at the lower bound of the parameter value.

Fig. 5
figure 5

Identifiability analysis results of estimated values of model parameters of the selected patients. Ordinates: parameter values. Abscissas: non-dimensionalized model parameters. The points are the estimated values. The arrowheads are the maximum and minimum values found by the analysis. The results show that there are several combinations of parameter values at which the model can fit the experimental data (Color figure online)

When the target is to estimate parameter values of the tumor dynamics model of a certain patient, identifiable values are what we need. However, when the aim is to find global parameters values for all patients, large spaces of the feasible parameter sets can be desirable for finding those global values, in which they are independent of a considered patient. In summary, the results of Figs. 5 and 6 show that for several combinations of parameter values, good data fitting and model predictions can be achieved. This opens the potential for global parameter estimation, in which the estimated values of all or some model parameters are the same for all patients. However, using data from patients diagnosed with different cancer types (e.g., NSCLC and bladder cancer) might make it unlikely to find global values for all parameters. Nevertheless, we are not certain about that. Also, immunotherapy is tumor-agnostic and we wanted to pursue this tumor-agnostic approach. The applied version of the Kuznetsov model has six parameters. The model could describe different tumor growth profiles of both the NSCLC and bladder cancer when estimating the parameter values individually. It might be thus possible that the model could still describe the tumor dynamics of both cancer types while estimating only the initial values of EC with possibly one or two additional parameters while fixing the others.

Global parameter estimation would have the advantage of fixing part or all of the Kuznetsov model parameters so that the estimation problem is simplified when applied to each patient in clinical practice. However, fixing all the model parameters and only estimating the initial values of the EC contradicts personalized modeling. In personalized modeling, patient-specific characteristics, treatment types, and metadata are included in the model as combined or additional terms. One could thus fix the parameters that are not specific to the patient or treatment types and estimate the others. The parameters specific to treatment types could be also fixed if only a certain treatment type or strategy is applied. Parameters for metadata terms like gender, age, etc., could be also fixed within each group. Therefore, the approach of global parameter estimation should be applied to the parameters involved in the model terms describing common phenomena (which are tumor-agnostic) among tumor types, independent of treatment type or metadata, such as the terms in the Kuznetsov model.

Fig. 6
figure 6

The results of the averages of the indicator functions for each parameter range (found by the identifiability analysis) for all patients. The graphs are equivalent to histograms indicating the frequency of the estimated ranges among the patients. Ordinates: indicator functions values, between zero and one (estimated ranges frequencies). Abscissas: non-dimensionalized model parameters. The most frequent ranges of parameter values are close to the median values in Fig. 4, except for \({\mu }\), the range is most frequent at the lower bound of the parameter value. These median values are here the vertical lines (Color figure online)

4.3 Tumor Growth Prediction

In clinical decision making, a possible role of mechanistic models is forecasting tumor growth during treatment, potentially enabling physicians to adjust the treatment strategy earlier. We found that the modified Kuznetsov model indeed was able to extrapolate beyond the initial time points when the last two or three data points are not included when fitting the model. The solid black lines in Fig. 7 show model extrapolation results of the selected patients when the last two data points are not considered for fitting. For the six patients, the model quantitatively forecasts tumor dynamics, except for patient \(\#186\). The last data point of patient \(\#186\) is almost impossible to forecast because it suddenly shifts upward after a mild and constant increase in tumor growth. However, the model can still qualitatively predict the growth. For the other patients, the predictions are very close to the experimental data. In Table 3, the first sub-table “optimal Extra.” provides the MAE, RMSE, and \(R^2\) of the selected patients, as well as the mean absolute percentage error (MAPE) for the extrapolated part, defined as:

$$\begin{aligned} \hbox {MAPE} = \frac{1}{\zeta }\sum _{i=N_j-\zeta +1}^{N_j} \left| \frac{\tilde{y}_{i_j} - y_{i_j}(\tau _{i_j},\textbf{p}_j,x_{1_j})}{\tilde{y}_{i_j}}\right| . \end{aligned}$$

The values of \(R^2\) for the six patients are close to one except for patient \(\#186\) due to the aforementioned explanation. Compared to the previous work in Ghaffari Laleh et al. (2022), model extrapolation results here have higher \(R^2\) values, specifically, an \(R^2\) of 0.979 was reached for patient \(\#207\). The average values of \(R^2\) and MAPE of all 210 patients are also provided in the table. The \({R^2}\) value is 0.419 with median and range values of 0.685 and 6.717, respectively. According to the statistical Lilliefors normality test, this \({R^2}\) distribution does not follow a Normal distribution.

Model extrapolation results when omitting the last two data points for all 210 patients are provided in the supplementary material. By analyzing the performance of model extrapolation results of the 210 patients, we could not generally classify the tumor dynamic profiles into different subtypes. However, the model could not predict the future growth for several patients where tumor dynamics suddenly increase after steady profiles (e.g., patient \({\#186}\)). In addition, the performance of model extrapolation for patients receiving chemotherapy is better than for those receiving immunotherapy. Also, the model performed in general well when extrapolating curves that have slight slope changes after starting future predictions. In addition, we provide model extrapolation results while omitting the last three data points for the selected six patients in the supplementary material. For the latter results, the model could also predict the growth very well.

Table 4 provides the \(R^2\) and MAPE values per study and per arm for model extrapolation when omitting the last two data points. In general, and as expected, the calculated average \(R^2\) for the extrapolation experiment is lower than the \(R^2\) when using all the data points in all the studies. Similar to data fitting, Study 1 and Study 5 have higher \(R^2\) than the remaining studies. In Study 3, Arm 2 and Arm 3 have the lowest \(R^2\). Interestingly, in Study 2 and Study 4, the extrapolation performance of the model is markedly better for the Docetaxel group (Arm 2) than for the Atezolizumab group (Arm 1). This indicates that patients receiving docetaxel have a more predictable response trajectory compared to patients receiving immunotherapy in whom unexpected patterns of tumor response can occur even at later time points. Moreover, in Study 3 where all patients received immunotherapy, Arm 1 (MPDL3280A-1a) and Arm 3 (MPDL3280A-1a) have better model performance in both fitting and extrapolation than the rest of the arms.

Moreover, Fig. 7 shows how far model extrapolation could deviate from the actual values. The results of the two “extreme-case” (dash-dotted blue and the dashed magenta) lines are designed to be as far away from each other at the final time point (\(\tau _{\left( N_j\right) _j}\)) while both being within some 10% tolerance of the found optimal fit for the fitted time before extrapolation starts. For some patients, the “extreme-case” lines can significantly deviate from the actual values, especially for the upper case (dash-dotted blue lines), as for patients \(\#22\) and \(\#83\). In contrast, the “extreme-case” lines for patients \(\#163\) and \(\#186\) are very close to the optimal extrapolated ones. Sub-tables “Upper Extra.” and “Upper Extra.” in Table 3 provide the MAE, RMSE, \(R^2\) and MAPE for the selected patients for the two “extreme-case” extrapolations.

To sum up, the model can forecast tumor dynamics of the patients and the “extreme-case” extrapolation scenarios were conducted to check the worst model predictions. For some patients the “extreme-case” extrapolation results deviate from the actual values, for others they remain close to the optimal extrapolation results.

Table 3 Quantification of the goodness of fit and prediction of the model as shown in Fig. 7
Fig. 7
figure 7

Model extrapolation results of the selected patients. The solid (black) line represents the optimal model prediction. The dash-dotted (blue) and the dashed (magenta) lines show the upper and the lower “extreme-case” model extrapolation deviation results, respectively. Here the last two data points are not considered for parameter estimation. The points show the measured data. Shaded areas highlight regions of model extrapolation. Ordinates: normalized number of tumor cells. Abscissas: normalized treatment time, negative values indicate time before the start of treatment. The model is capable of forecasting tumor dynamics qualitatively and sometimes quantitatively (Color figure online)

Table 4 Quantification of the goodness of fit and extrapolation of the model per the conducted studies and arms

5 Conclusion

In this study, we show that quantitative mathematical models can be used to describe and forecast the behavior of cancer. Previous studies have used the same datasets to fit very simple ODE models to the tumor volume measurements over time (Ghaffari Laleh et al. 2022). However, in Ghaffari Laleh et al. (2022), it was observed that all established ODE models were not able to fit “U”-shaped trajectories well. In clinical terms, patients who relapsed after an initial response, or patients who showed a delayed response, were not adequately represented in these previous models. Compared to this, the present study evaluates a more complex model which has the benefit of being able to describe a larger variety of real-world time series. This specific model is a slight simplification of the Kuznetsov model (Kuznetsov et al. 1994), which has not been linked with or validated in large amounts of quantitative real-world human data. Specifically, it could quantitatively fit the Kuznetsov model to a large dataset of 1472 patients. Data are collected from patients undergoing immunotherapy or chemotherapy treatments.

In the parameter estimation for each patient, we found that some parameters for some patients are not unique (identifiability analysis). This means that many combinations of parameter values could lead to good fitting and predictions. This opens the potential for global parameter estimation, in which parameter values are the same for all patients. However, since we did not consider patients with few data points (patients with early tumor progression and potentially their death), potential bias to the estimation problem is introduced that might limit the clinical applicability of the model with global parameters. Still, the model could predict tumor growth (2–3 omitted measurements) which could indicate practical usefulness as a predictive biomarker. This is a more clinically relevant application of mathematical tumor modeling. Specifically, the model fitting and prediction could potentially describe and forecast the behavior of cancer, improve the understanding of underlying biological mechanisms, and provide model approaches for cancer treatments, as the supplied doses to patients could be adjusted in advance according to model predictions. Future studies should attempt a global parameter estimation, use deterministic and less computationally demanding solution methods. Another possibility is the reduction of the number of model parameters in order to have less number of parameters to estimate and thus enhance the possibility to reach global values. In addition, the mixed-effects modeling (population approach), which allows the simultaneous modeling of tumor dynamics and inter-individual variability within a statistical framework, could be considered as a potential work for future investigation.

The model complexity is mostly limited by the availability of suitable data. Therefore, if more data become available, it would be possible to include other terms in the model, e.g., metadata, explicitly representing different types of immune cell populations or multiple cancer cell clones.

A general limitation of our approach is that mechanistic models of tumor growth are competing against statistical models of tumor growth, including machine learning models. In order to forecast clinical response to immunotherapy and to guide clinical decisions, both types of models could be applied. However, it is still unclear which approach will ultimately translate to clinical application. While mechanistic models are attractive because they integrate expert knowledge into their predictions, they are also constrained by these assumptions. Our study contributes empirical evidence that mechanistic models can make clinically relevant predictions in a relevant use case, but further work is required to validate this in other disease contexts and compare the predictions to statistical models.