1 Introduction

Dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) has long been used as a clinical tool to study the vasculature of different tissues, especially tumours. Recently, it has been employed to assess the efficacy of antiangiogenic treatments in tumours [1,2,3]. This technique consists of the acquisition of a series of T1-weighted images before, during, and after the intravenous injection of a contrast agent (CA). As the CA reaches the tissue of interest via arterial inflow, it decreases the native T1 relaxation time, producing an increase in the measured signal intensity. The subsequent removal of CA through venous blood flow results in a decrease in signal intensity, which returns to its baseline value. For each voxel, the outcome is a signal intensity versus time curve [4].

Three main approaches can be found in the literature to analyse these curves [5]: subjective analyses, based on the observation of DCE-MRI images by an experienced observer; semi-quantitative analyses, which extract parameters from the signal-time curve, such as the initial slope or the time to peak; and, finally, fully quantitative analyses. The last approach requires the conversion of signal intensity to CA concentration, which can be performed considering either linear or non-linear equations [6]. Once the signal has been converted to CA concentration, pharmacokinetic (PK) models are applied to extract physiological information from the images. There is a wide range of PK models [7], which consider different formulations and hypotheses to analyse the DCE data. One of the most common approaches is the so-called standard Tofts model (STM) [8]. It is a compartmental model that considers two well-mixed compartments, the intravascular space and the extravascular-extracellular space (EES), and the exchange of CA between them. In some highly vascularized tissues (such as liver tissue and some types of tumours), the voxel volume occupied by vessels is not negligible. Tofts et al. [9] formulated the extended Tofts model (ETM), which includes an additional variable that represents the contribution of intravascular CA to signal enhancement.

These models assume that CA can only reach the region of interest (ROI) through blood perfusion, neglecting passive delivery (spatial diffusion from adjacent regions, also known as inter-voxel diffusion, and convection due to gradients in interstitial fluid pressure). The hypothesis of well-mixed compartments assumes that CA is distributed uniformly within each compartment. This approximation also underestimates the possible intra-voxel CA diffusion. Although previous works have studied the influence of this intra-voxel diffusion on fitted parameters [10, 11], this work focuses on inter-voxel diffusion. Even though convective effects may be relevant in some tissues, they are not considered here in order to limit the complexity of the model. As several authors have pointed out [12,13,14,15,16,17], assuming no inter-voxel CA diffusion can lead to errors in the quantities estimated by the model, especially in weakly vascularized zones, such as necrotic regions in a tumour.

Previous works have developed different methods to include the diffusive phenomenon in the model. The models developed by Jia et al. [13] and Koh et al. [14] accounted for inter-voxel CA diffusion, but lacked information about other physiological properties, like those parameters defined in the STM or the ETM [8, 9]. Pellerin et al. [15], first, and Fluckiger et al. [16], next, created models that, keeping the formulation proposed by Tofts et al. [8], added a term to account for CA diffusion. Pellerin et al. [15] proposed a combined diffusion-perfusion (DP) model that incorporated CA diffusion into the STM. However, since it required fitting all model parameters at once, the resulting computational cost was excessively high. Fluckiger et al. [16] solved this issue by implementing a voxel-wise approach to compute CA diffusion. To do so, they considered some simplifications, such as homogeneity in cellularity and diffusivity among neighbouring voxels, that might not apply to many types of tissues. Later, Cantrell et al. [17] proposed a diffusion-compensated Tofts model and applied it to intracranial aneurysms. This approach considered a known diffusivity coefficient and handled the contributions due to diffusion and to extravasation separately to reduce the computational cost. Their method is based on the results obtained from the ETM, which may differ from the true values in some tumours where CA diffusion plays an important role. In those cases, the applicability of the model is limited. More recently, Sinno et al. [18] developed a new transport model (Cross Voxel Exchange Model, CVXM) that includes both the diffusion and the convection phenomena in the STM formulation. In their work, they quantified the error in the parameters retrieved by the STM due to ignoring passive transport mechanisms, showing the importance of convective and diffusive processes in DCE-MRI data. The implementation of the model was based on a discrete formulation, similar to previous works [15, 16]. They tested the model on both in silico and experimental xenograft data, considering only 1D geometries. Except for Cantrell et al. [17], the other authors based their works on the STM, which is only accurate in poorly vascularized tissues [19]. To extend the range of application of the diffusion-corrected models, the present work is based on the ETM, which is accurate in both weakly and highly vascularized regions.

Therefore, this study proposes a finite element (FE)-based model to include a diffusive term in the ETM [9] formulation, with the aim of using this model to solve an inverse problem that provides an estimation of the vascular properties of the tumour from DCE-MRI sequences. To achieve this aim, we first define the formulation of a diffusion-corrected extended Tofts model that accounts for the effect of CA diffusion and perfusion, and we implement it in an FE model. Second, we develop a gradient-based optimization method to fit this model to the clinical imaging data and extract vascular properties. Third, we test the performance and accuracy of this inverse method in different simulated theoretical cases. Hence, this approach aims to benefit from the computational efficiency of the finite element method (FEM) to fit all analysed volumes simultaneously while keeping an affordable computational cost. Thus, the modelling approach proposed here can be applied to analyse the heterogeneous behaviour characteristic of tumours, avoiding simplifications and approximations that reduce the range of application of the model.

2 Materials and methods

This section is divided into three parts. The first describes the model formulation in detail. The second covers the computational implementation of this model in the FE-based commercial software ANSYS (Ansys Inc., PA, USA). Finally, the third describes the algorithm that solves the inverse problem of curve fitting.

Fig. 1

(a) Overview of the different scales considered in the model. The representative volume element (RVE) consists of a certain volume occupied by cells, another region which corresponds to the microvasculature, and the rest of the volume (coloured in light yellow), which is the EES (V\(_{E}\)). The concentration of CA within the EES volume of the RVE is averaged (Eq. 3). The two different coordinate systems (x and X) are detailed. (b) Different processes taking place in the RVE (V\(_{T}\)) that are considered in the proposed model. Apart from the perfusion process (solid green arrows) and the contribution of microvasculature to the total concentration, diffusion between adjacent RVEs (dotted blue arrows) is included

2.1 Diffusion-corrected extended Tofts Model (D-ETM)

The transport mechanisms in biological tissues have long been of interest among physicians and researchers. Nicholson and collaborators [20,21,22,23,24,25] thoroughly studied this process, establishing a general formulation for the diffusion of particles in the brain [22]. In that work, they compared the densely packed cells of the brain and their interstitial spaces to a porous medium with two phases, an intra- and an extracellular phase. They then formulated the equations of the diffusive process in the brain based on the general equation of diffusion in porous media, a process that has also been widely studied by many authors [26,27,28].

In the present work, this diffusion formulation is added to the general form of the ETM. There is a distinction between two different scales: a macro-scale, where the extravasation contribution to the concentration averaged in the representative volume element (RVE) as well as the diffusion of CA between adjacent RVEs are studied (Fig. 1b); and a micro-scale, which is defined within the RVE and consists of a heterogeneous distribution of cells, vessels and EES (Fig. 1a). The size of the RVE must be large enough to achieve length scale separation, containing a sufficient number of cells, and, at the same time, small enough so that the CA concentration can be averaged in it without adding significant error. Therefore, Eq. 1 describes the main transport mechanisms for an RVE:

$$\begin{aligned}&\frac{\partial \overline{C_t}(\varvec{x},t)}{\partial t}= {\varvec{\nabla }\cdot }{\left( D_{eff}(\varvec{x})\nabla {\overline{C_t}(\varvec{x},t)}\right) } \nonumber \\&+ K^{\rm{Trans}}(\varvec{x})\left( C_p(t)-\overline{C_e}(\varvec{x},t)\right) + v_p(\varvec{x})\frac{\text {d} C_p(t)}{\text {d}t} \end{aligned}$$
(1)

where:

  • \(\overline{{C_{t}}}({\varvec{x}},t)\) is the total CA concentration, averaged in the RVE (V\(_{T}\))

    $$\begin{aligned} \overline{C_t}(\varvec{x},t)=\frac{1}{V_T}\int _{V_T} C_t(\varvec{X},\varvec{x},t)dV \end{aligned}$$
    (2)

    From now on, we will refer to \(\overline{{C_{t}}}({\varvec{x}},t)\) as C\(_{t}({\varvec{x}},t)\) to facilitate the formulation

  • x is the coordinates vector in the macroscopic domain of the tissue

  • X is the coordinates vector in the microscopic domain of the RVE

    The relation between the macroscopic (x) and the microscopic (X) coordinate systems is established through the homogenization process defined in Eqs. 2 and 3

  • D\(_{\rm{eff}}({\varvec{x}})\) is the effective diffusion coefficient for that RVE. Considering that each RVE contains a heterogeneous distribution of cells and vessels, we can compare the diffusive process of CA to the diffusion in a porous medium, as stated previously [22, 25,26,27,28,29]. A further detailed study of this variable is included below (Sect. 2.1.1)

  • K\(^{\rm{Trans}}({\varvec{x}})\) is the extravasation rate for that RVE

  • v\(_{p} ({\varvec{x}})\) is the volume fraction of blood plasma in the RVE

  • C\(_{p}(t)\) is the CA concentration in the blood plasma volume, which follows the same temporal function in every RVE and only varies with time

  • \(\overline{{C_{e}}}({\varvec{x}},t)\) is the averaged CA concentration in the extracellular subvolume (V\(_{E}\)) of the RVE. Considering that the microscopic scale is unknown (we have no information about the subvolume V\(_{E}\) of the RVE), we assume the hypothesis of well-mixed compartments presented by Tofts [9]. This hypothesis considers an infinite diffusivity coefficient of CA in the subvolume V\(_{E}\), which implies that there cannot be any spatial gradient of C\(_{e}\) in the subvolume V\(_{E}\). Therefore:

    $$\begin{aligned} \overline{C_e}(\varvec{x},t)=\frac{1}{V_E}\int _{V_E} C_e(\varvec{X},\varvec{x},t)dV \end{aligned}$$
    (3)

    From now on, we will refer to \(\overline{{C_{e}}}({\varvec{x}},t)\) as C\(_{e}({\varvec{x}},t)\) to facilitate the understanding of the equations

This well-mixed compartments hypothesis validates the Tofts equation for the compartmental model:

$$\begin{aligned} C_t(\varvec{x},t)=v_e(\varvec{x})C_e(\varvec{x},t)+v_p(\varvec{x})C_p(t) \end{aligned}$$
(4)

where \(v_e(\varvec{x})\) is the volume fraction of the EES in the RVE, defined as \(v_e = \frac{V_E}{V_T}\).

Substituting Eq. 4 in Eq. 1, we obtain Eq. 5, which is the general form of the diffusion-corrected ETM (D-ETM), formulated in terms of the total concentration in the RVE:

$$\begin{aligned}&\frac{\partial C_t(\varvec{x},t)}{\partial t}= {\varvec{\nabla }\cdot }{\left( D_{eff}(\varvec{x})\nabla {C_t(\varvec{x},t)}\right) } \nonumber \\&+ \frac{K^{\rm{Trans}}}{v_e}(\varvec{x})\left( C_p(t)(v_e(\varvec{x})+v_p(\varvec{x}))-C_t(\varvec{x},t)\right) + v_p(\varvec{x})\frac{\text {d} C_p(t)}{\text {d}t} \end{aligned}$$
(5)
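For reference, the substitution can be written out in one intermediate step (a worked sketch using only Eqs. 1 and 4, with the spatial and temporal arguments omitted for brevity). Solving Eq. 4 for \(C_e\) and inserting the result in the exchange term of Eq. 1 gives:

$$\begin{aligned}&C_e=\frac{C_t-v_pC_p}{v_e} \nonumber \\&K^{\rm{Trans}}\left( C_p-C_e\right) =K^{\rm{Trans}}\,\frac{v_eC_p+v_pC_p-C_t}{v_e}=\frac{K^{\rm{Trans}}}{v_e}\left( C_p(v_e+v_p)-C_t\right) \end{aligned}$$

The diffusive term and the plasma term \(v_p\,\text {d}C_p/\text {d}t\) of Eq. 1 are not affected by the substitution, so they appear unchanged in Eq. 5.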

A table summarizing all model parameters, their definition and units can be found in Appendix A.

2.1.1 Effective diffusivity

Different authors have formulated equations that relate the effective diffusivity to different geometrical characteristics of the porous material [29,30,31]. Given that the geometrical structure of the solid phase (cells and vessels) is unknown, we assume that the equivalent diffusivity is related to the tortuosity [22]. Tortuosity (\(\lambda\)) quantifies the increase in path-length of a diffusing particle due to the existence of obstacles in its way [32]. It is defined as:

$$\begin{aligned} \lambda =\sqrt{\frac{D}{D_{\rm{eff}}}} \end{aligned}$$
(6)

Tao and Nicholson [25] used the Monte Carlo method to simulate the diffusion of particles in different porous structures and cell geometries, and fitted the simulated data to obtain the value of \(D_{\rm{eff}}\) for each case. Applying Eq. 6 to these values, they obtained a value of tortuosity (\(\lambda\)) for each simulated structure and cell geometry and found that tortuosity was independent of the considered cell geometry and depended only on the porosity (\(\varepsilon\)) of the structure:

$$\begin{aligned} \lambda =\sqrt{\frac{3-\varepsilon }{2}} \end{aligned}$$
(7)

Substituting Eq. 7 in Eq. 6, we obtain the definition of an equivalent diffusivity (D\(_{\rm{eff}}\)). Considering that porosity (\(\varepsilon\)) is defined as the volume fraction of "empty" space in the material, its equivalence to v\(_{e}\) is straightforward:

$$\begin{aligned} D_{\rm{eff}}=\frac{2D}{3-v_e} \end{aligned}$$
(8)

Where D is the diffusion coefficient of CA in free medium, which is known. Using this formulation, we can consider different effective diffusion coefficients per element without adding more unknowns to the model.
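As a small numerical illustration of Eqs. 6 to 8, the sketch below evaluates the tortuosity and the effective diffusivity for a given \(v_e\). The function names are hypothetical and chosen only for this example; the free-medium value D = 2.6E-04 mm\(^{2}\)/s is the one used later in the in silico simulations.

```python
import math


def tortuosity(porosity: float) -> float:
    """Eq. 7: lambda = sqrt((3 - eps) / 2), with eps the porosity (here v_e)."""
    if not 0.0 <= porosity <= 1.0:
        raise ValueError("porosity must lie in [0, 1]")
    return math.sqrt((3.0 - porosity) / 2.0)


def effective_diffusivity(d_free: float, v_e: float) -> float:
    """Eq. 8: D_eff = D / lambda**2 = 2 D / (3 - v_e)."""
    return d_free / tortuosity(v_e) ** 2


D_FREE = 2.6e-4  # mm^2/s, free-medium diffusion coefficient of the CA
d_eff = effective_diffusivity(D_FREE, 0.5)  # effective value for v_e = 0.5
```

Because \(D_{\rm{eff}}\) depends only on D and on the nodal \(v_e\), evaluating it per element adds no unknowns to the fit, which is the point made in the paragraph above.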

2.2 Forward FE model

Once the theoretical formulation is defined, the next step is to implement it in the FE software to generate CA concentration-time curves from given sets of parameters. Equation 5 is implemented in the ANSYS diffusion module [33], including the extravasation term and the contribution of the blood plasma fraction as non-linear generation terms.

The numerical formulation obtained for Eq. 5 is defined as:

$$\begin{aligned} \varvec{C^d_e}\varvec{\dot{c_e}}+\varvec{K^d_e} \varvec{c_e} = \varvec{r_e^g} \end{aligned}$$
(9)

Where, for an element e:

  • \(\varvec{c_e}\) is the nodal concentration vector and \(\varvec{\dot{c_e}}\) is its temporal derivative

  • \(\varvec{n}\) are the element shape functions

  • \(\varvec{C^d_e}\) is the element diffusion damping matrix and it is defined as:

    $$\begin{aligned} \varvec{C^d_e} = \int _{\Omega ^e} \varvec{n}\varvec{n}^T\,d(\Omega ^e) \end{aligned}$$
    (10)
  • \(\varvec{K^d_e}\) is the element diffusion conductivity matrix and it is defined as:

    $$\begin{aligned} \varvec{K^d_e}= \int _{\Omega ^e} \left( \nabla {\varvec{n}^T}\right) ^T\varvec{D}\nabla {\varvec{n}^T} \, d(\Omega ^e) \end{aligned}$$
    (11)

    where \(\varvec{D}\) is the diffusion coefficient matrix, defined below

  • \(\varvec{r_e^g}\) is the element diffusing substance generation load vector and it is defined as:

    $$\begin{aligned} \varvec{r_e^g} = \int _{\Omega ^e} \varvec{g}\varvec{n}^T \, d(\Omega ^e) \end{aligned}$$
    (12)

    where \(\varvec{g}\) is the generation load vector. The integration volume \(\Omega ^e\) corresponds to the volume of the finite element e.

The time discretization scheme follows the generalized trapezoidal rule, which reduces to the backward Euler implicit method for \(\theta =1\), and is defined as:

$$\begin{aligned} \varvec{u_{n+1}}=\varvec{u_n}+ \theta \Delta {t} \, \varvec{\dot{u}_{n+1}} + \left( 1-\theta \right) \Delta {t} \, \varvec{\dot{u}_{n}} \end{aligned}$$
(13)

Where:

  • \(\varvec{u_{n}}\) is the nodal degree of freedom (DOF) values at time \(t_n\)

  • \(\theta\) is the transient integration parameter. If \(\theta =0\) an explicit algorithm is used, whereas if \(\theta =1\) an implicit algorithm is employed

  • \(\Delta {t}=t_{n+1}-t_n\) is the time step size

  • \(\varvec{\dot{u}_{n}}\) is the time rate of the nodal DOF values at time \(t_n\), computed at the previous time point
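The role of \(\theta\) in Eq. 13 can be illustrated on a scalar test problem. The sketch below is not part of the FE implementation: it applies Eq. 13 to the decay equation \(\dot{u}=-ku\), which is an assumption made only for illustration, and solves the resulting linear relation for \(u_{n+1}\).

```python
import math


def theta_step(u_n: float, k: float, dt: float, theta: float) -> float:
    """One step of Eq. 13 for du/dt = -k*u.

    Substituting udot = -k*u at both time levels into Eq. 13 gives
    u_{n+1} * (1 + theta*dt*k) = u_n * (1 - (1 - theta)*dt*k).
    """
    return u_n * (1.0 - (1.0 - theta) * dt * k) / (1.0 + theta * dt * k)


def integrate(u0: float, k: float, dt: float, n_steps: int, theta: float) -> float:
    """Advance n_steps with a fixed time step and return the final value."""
    u = u0
    for _ in range(n_steps):
        u = theta_step(u, k, dt, theta)
    return u


# Large time step (k*dt = 5): the explicit scheme diverges, the implicit one decays.
u_explicit = integrate(1.0, 10.0, 0.5, 10, theta=0.0)
u_implicit = integrate(1.0, 10.0, 0.5, 10, theta=1.0)

# Small time step: the implicit scheme approaches the exact solution exp(-k*t).
u_fine = integrate(1.0, 1.0, 0.001, 1000, theta=1.0)
error_fine = abs(u_fine - math.exp(-1.0))
```

With \(\theta =1\) the amplification factor \(1/(1+k\Delta t)\) is below one for any positive time step, so stability does not restrict \(\Delta t\); only accuracy does.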

Implementing this time discretization scheme on Eq. 9 leads to Eq. 14:

$$\begin{aligned}&\varvec{C^d}\frac{\varvec{c^{t+1}}-\varvec{c^{t}}}{\Delta {t}} = -\varvec{K^d}\varvec{c^{t+1}}+\varvec{R}\left( (\varvec{k^{\rm{Trans}}}\oslash {\varvec{v_e}})\odot {(\varvec{v_e}+\varvec{v_p})}c_p^{t+1}\right) \nonumber \\&- \varvec{R}\left( (\varvec{k^{\rm{Trans}}}\oslash {\varvec{v_e}})\odot {\varvec{c^{t+1}}}\right) + \varvec{R}\varvec{v_p}\dot{c_p}^{t+1} \end{aligned}$$
(14)

Where \(\varvec{R}\) is the substance generation matrix and \(\varvec{K^d}\) is the assembled diffusion conductivity matrix. The terms \(\varvec{c^{t+1}}\) and \(\varvec{c^{t}}\) are the nodal concentration vectors resulting from evaluating the nodal concentration matrix (\(\varvec{C}\)) at given time points. \(\varvec{k^{\rm{Trans}}}\), \(\varvec{v_e}\) and \(\varvec{v_p}\) are the vectors of nodal variables. Finally, \(c_p^{t+1}\) and \(\dot{c_p}^{t+1}\) are the scalars resulting from evaluating the arterial input function (AIF) and the derivative of the AIF vectors (\(\varvec{c_p}\) and \(\varvec{\dot{c_p}}\)) at given time points. To distinguish between the regular dot product and the Hadamard operations, we employ a specific notation for the pointwise product (\(\odot\)) and division (\(\oslash\)). The diffusion coefficient matrix \(\varvec{\textit{D}}\) for each element is:

$$\begin{aligned} \begin{bmatrix} \frac{2D_{xx}}{3-v_e^{elem}} &{} 0 &{} 0 \\ 0 &{} \frac{2D_{yy}}{3-v_e^{elem}} &{} 0 \\ 0 &{} 0 &{} \frac{2D_{zz}}{3-v_e^{elem}} \end{bmatrix} \end{aligned}$$

Where \({v_e}^{elem}\) is the element average of the nodal v\(_{e}\) variable. Considering as a first approach isotropic diffusion, \(D_{xx}=D_{yy}=D_{zz}=D\).

Due to the dynamic nature of the physical process, transient effects were included. Although the transient integration parameter was set to \(\theta = 1\), which makes the solution unconditionally stable, the accuracy of the solution still depended on the time step (\(\Delta {t}\)). We conducted several experimental error analyses and concluded that a time step of 1 s ensured the consistency of the solution, while keeping an affordable computational cost.

Initial values were provided to initialize the transient simulations. Both the nodal concentration and its time derivative were set to zero at t = 0 s (\(C(t_0)=\dot{C}(t_0)=0\)).

The AIF is interpolated to match this time resolution. Since both generated curves and fitted curves are obtained using the same model, no interpolation is needed to perform the fitting process.

Regarding the boundary conditions, we assumed that no CA could diffuse across the boundaries of the geometry, considering the tumour as an "isolated" entity with respect to adjacent tissues. Although it may not be biologically correct for tissues that are not surrounded by physical barriers, this condition facilitates the formulation and is consistent with the literature [8, 9, 15, 16].
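To make the time-stepping of Eq. 14 concrete, the following is a minimal one-dimensional sketch in Python, not the ANSYS implementation. It assumes linear two-node elements on a uniform mesh, takes the generation matrix \(\varvec{R}\) equal to the consistent mass matrix \(\varvec{C^d}\) (generation terms interpolated with the element shape functions), uses a hypothetical AIF, and expresses all rates in s\(^{-1}\). The no-flux boundary condition is the natural one, so no boundary term is assembled. Function names are illustrative.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import splu


def assemble_1d(v_e, h, d_free):
    """Assemble C^d (Eq. 10) and K^d (Eq. 11) for linear 1D elements of length h."""
    n = v_e.size
    C = sp.lil_matrix((n, n))
    K = sp.lil_matrix((n, n))
    for e in range(n - 1):
        idx = [e, e + 1]
        d_eff = 2.0 * d_free / (3.0 - v_e[idx].mean())  # Eq. 8 with element-averaged v_e
        m_e = h / 6.0 * np.array([[2.0, 1.0], [1.0, 2.0]])
        k_e = d_eff / h * np.array([[1.0, -1.0], [-1.0, 1.0]])
        for a in range(2):
            for b in range(2):
                C[idx[a], idx[b]] += m_e[a, b]
                K[idx[a], idx[b]] += k_e[a, b]
    return C.tocsc(), K.tocsc()


def simulate_detm(k_trans, v_e, v_p, c_p, dc_p, dt, h, d_free):
    """Backward-Euler (theta = 1) solution of Eq. 14; returns an (n_time, n_nodes) array."""
    C, K = assemble_1d(v_e, h, d_free)
    R = C  # ASSUMPTION: generation matrix equals the consistent mass matrix
    kappa = k_trans / v_e
    # Eq. 14 solved for c^{t+1}: (C + dt*K + dt*R*diag(kappa)) c^{t+1} = rhs
    A = (C + dt * K + dt * (R @ sp.diags([kappa], [0]))).tocsc()
    lu = splu(A)  # A is constant in time, so it is factorized once
    c = np.zeros(v_e.size)  # zero initial concentration
    curves = [c.copy()]
    for cp_next, dcp_next in zip(c_p[1:], dc_p[1:]):
        source = kappa * (v_e + v_p) * cp_next + v_p * dcp_next
        c = lu.solve(dt * (R @ source) + C @ c)
        curves.append(c.copy())
    return np.array(curves)


# Hypothetical AIF (arbitrary units) and its analytical time derivative, sampled every 1 s
t = np.arange(0.0, 361.0)
c_p = 0.05 * t * np.exp(-t / 60.0)
dc_p = 0.05 * np.exp(-t / 60.0) * (1.0 - t / 60.0)

# 1D analogue of a perfused rim around a poorly perfused core (node spacing 0.15 mm)
x = np.linspace(0.0, 6.0, 41)
rim = np.abs(x - 3.0) > 2.0
k_trans = np.where(rim, 0.2, 0.05) / 60.0  # min^-1 converted to s^-1
v_e = np.full(x.size, 0.5)
v_p = np.where(rim, 0.05, 0.005)
curves = simulate_detm(k_trans, v_e, v_p, c_p, dc_p, dt=1.0, h=0.15, d_free=2.6e-4)
```

A useful sanity check of such a sketch is the spatially uniform case: when all nodes share the same parameters, the diffusive term vanishes and every node follows the scalar backward-Euler recursion of the ETM.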

2.3 Solving the inverse problem

Imaging data is usually processed as a voxelized geometry. However, the method proposed here can be applied to any complex geometry, like those obtained from image segmentation. Although the conversion from a voxelized discretization to an FE mesh approximates the data, the error added is small, since the spatial resolution of the mesh is expected to be equal to or higher than that of the voxel grid.

After implementing the D-ETM equation in ANSYS, we can simulate the CA transport for a certain tissue. Therefore, the next step is fitting the model to the concentration-time curves obtained from imaging. This inverse problem of curve fitting is solved using an iterative method based on the non-linear least squares method [34].

Being a gradient-based method, it requires the derivatives of the total concentration C(t) with respect to each nodal parameter, which together form the Jacobian matrix.

Typically, this matrix is obtained numerically using the finite differences method. This option leads to an excessively high computational cost, since the number of simulations (\({\rm{Total}}_{\rm{sim}}\)) needed to compute this matrix increases linearly with the number of time points and nodal variables and quadratically with the number of nodes:

$$\begin{aligned} {\rm{Total}}_{\rm{sim}}={N_{\rm{nodes}}}^{2}{N_{\rm{time}}}N_{\rm{parameters}} \end{aligned}$$
(15)

where \(N_{\rm{nodes}}\) is the number of nodes in the model, \(N_{\rm{time}}\) is the number of time points and \(N_{\rm{parameters}}\) is the number of unknowns to be fitted per node. Even for small cases, this numerical approach entails a prohibitive computational cost.
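To give a sense of scale, Eq. 15 can be evaluated for the two model sizes reported in this work (339 and 955 nodes, 360 time points, three parameters per node). The helper name below is hypothetical.

```python
def total_simulations(n_nodes: int, n_time: int, n_parameters: int) -> int:
    """Eq. 15: simulations needed for a finite-difference Jacobian."""
    return n_nodes ** 2 * n_time * n_parameters


benchmark_cost = total_simulations(339, 360, 3)  # benchmark problem
tumour_cost = total_simulations(955, 360, 3)     # tumour geometry
```

Both counts are of the order of \(10^{8}\) to \(10^{9}\), which motivates the semi-analytical alternative.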

Therefore, we propose an alternative semi-analytical computation of the Jacobian matrix. We can re-write Eq. 14 as:

$$\begin{aligned} \varvec{A}\varvec{c^{t+1}} = \Delta {t} \, \varvec{R} \left( (\varvec{k^{\rm{Trans}}}\oslash {\varvec{v_e}})\odot {(\varvec{v_e}+\varvec{v_p})}c_p^{t+1} + \varvec{v_p}\dot{c_p}^{t+1}\right) + \varvec{C^d}\varvec{c^t} \end{aligned}$$
(16)

where:

$$\begin{aligned} \varvec{A}=\varvec{C^d}+\Delta {t} \, \varvec{K^d}+\Delta {t} \, \varvec{R}(\varvec{k^{\rm{Trans}}}\oslash {\varvec{v_e}}) \end{aligned}$$
(17)

The derivative of \(\varvec{C}\) with respect to each parameter can be obtained by differentiating Eq. 16 and applying the product rule on its left-hand side:

  • For K\(^{\rm{Trans}}\):

    $$\begin{aligned} \varvec{A}\frac{\partial \varvec{{c}^{t+1}}}{\partial \varvec{k^{\rm{Trans}}}} = \Delta {t} \, \varvec{R}\odot {\left( c_{p}^{t+1}(\varvec{v_e}+\varvec{v_p})\oslash {\varvec{v_e}}-\varvec{{c}^{t+1}}\oslash {\varvec{v_e}}\right) }+\varvec{C^d}\frac{\partial \varvec{{c}^t}}{\partial \varvec{k^{\rm{Trans}}}} \end{aligned}$$
    (18)
  • For v\(_{e}\):

    $$\begin{aligned}&\varvec{A}\frac{\partial \varvec{{c}^{t+1}}}{\partial \varvec{v_e}}=-\Delta {t} \, \varvec{K^d(\varvec{v_e})}\odot {\varvec{{c}^{t+1}}} \nonumber \\&+ \Delta {t} \, \varvec{R}\odot {(\varvec{k^{\rm{Trans}}}\oslash {\varvec{v_e}^2})}\odot {\left( \varvec{{c}^{t+1}}-\varvec{v_p}c_{p}^{t+1}\right) } + \varvec{C^d}\frac{\partial \varvec{{c}^t}}{\partial \varvec{v_e}} \end{aligned}$$
    (19)

    where \(\varvec{K^d({v_e})}=\frac{\partial \varvec{K^d}}{\partial {v_e}}\). \(\varvec{K^d}\) is the assembled matrix of elemental diffusion conductivity matrices, which are defined by Eq. 11. To build these matrices, we define \(\varvec{D}\) as:

    $$\begin{aligned} \begin{bmatrix} \frac{2D}{(3-{v_e}^{e})} &{} 0 &{} 0 \\ 0 &{} \frac{2D}{(3-{v_e}^{e})} &{} 0 \\ 0 &{} 0 &{} \frac{2D}{(3-{v_e}^{e})} \end{bmatrix} \end{aligned}$$

    Where, for an element e, \({v_e}^{e} =\frac{1}{N} \sum _{i=1}^{N}{v_e}^i\). N is the number of nodes in the element and \({v_e}^i\) are the nodal values of v\(_{e}\). The derivative of \(\varvec{{K^d_e}}\) with respect to nodal v\(_{e}\) (\(\varvec{{K^d_e}({v_e})}\)) is exported from ANSYS after differentiating the components of \(\varvec{D}\) with respect to the nodal v\(_{e}\) values. Since \(\partial (3-v)^{-1}/\partial v=(3-v)^{-2}\), this gives:

    $$\begin{aligned} \begin{bmatrix} \frac{2D}{N(3-\frac{1}{N} \sum _{i=1}^{N}{v_e}^i)^2} &{} 0 &{} 0 \\ 0 &{} \frac{2D}{N(3-\frac{1}{N} \sum _{i=1}^{N}{v_e}^i)^2} &{} 0 \\ 0 &{} 0 &{} \frac{2D}{N(3-\frac{1}{N} \sum _{i=1}^{N}{v_e}^i)^2} \end{bmatrix} \end{aligned}$$

    Each elemental matrix \(\varvec{{K^d_e}({v_e})}\) is generated introducing this matrix in Eq. 11. Finally, these elemental matrices are assembled to build the global matrix \(\varvec{K^d({v_e})}\).

  • For v\(_{p}\):

    $$\begin{aligned} \varvec{A}\frac{\partial \varvec{c^{t+1}}}{\partial \varvec{v_p}} = \Delta {t} \, \left( \varvec{R}\odot {\left( (\varvec{k^{\rm{Trans}}}\oslash {\varvec{v_e}})c_{p}^{t+1}\right) }+\varvec{R}\varvec{I}\dot{c_p}^{t+1}\right) + \varvec{C^d}\frac{\partial \varvec{c^{t}}}{\partial \varvec{v_p}} \end{aligned}$$
    (20)

To compute Eqs. 18, 19, 20, matrices \(\varvec{C^d}\) and \(\varvec{R}\) are exported from ANSYS along with \(\varvec{K^d({v_e})}\). After solving these equations, the Jacobian matrix is obtained by concatenating the matrices computed in Eqs. 18, 19, 20:

$$\begin{aligned} \varvec{J}= \begin{bmatrix} \frac{\partial {c}_1^1}{\partial k_1^{\rm{Trans}}}&{}\dots &{}\frac{\partial {c}_{nnode}^1}{\partial k_{nnode}^{\rm{Trans}}}&{}\frac{\partial {c}_1^1}{\partial {v_e}_1}&{}\dots &{}\frac{\partial {c}_{nnode}^1}{\partial {v_e}_{nnode}} &{} \frac{\partial {c}_1^1}{\partial {v_p}_1}&{}\dots &{}\frac{\partial {c}_{nnode}^1}{\partial {v_p}_{nnode}}\\ &{} &{} &{} &{} &{} &{} &{} &{} \\ &{} &{} &{} &{} \vdots &{} &{} &{} &{} \\ &{} &{} &{} &{} &{} &{} &{} &{} \\ \frac{\partial {c}_{1}^{ntime}}{\partial k_1^{\rm{Trans}}}&{}\dots &{}\frac{\partial {c}_{nnode}^{ntime}}{\partial k_{nnode}^{\rm{Trans}}}&{}\frac{\partial {c}_{1}^{ntime}}{\partial {v_e}_1}&{}\dots &{}\frac{\partial {c}_{nnode}^{ntime}}{\partial {v_e}_{nnode}}&{}\frac{\partial {c}_{1}^{ntime}}{\partial {v_p}_1}&{}\dots &{}\frac{\partial {c}_{nnode}^{ntime}}{\partial {v_p}_{nnode}} \end{bmatrix} \end{aligned}$$

This matrix is included in the least-squares method to fit the CA concentration-time curves to Eq. 5. We tested this semi-analytical computation in small-sized models, comparing it to the numerically computed Jacobian. Results were promising, as both the Jacobian matrices and the convergence of the fit were similar between the two methods. Further information about these results can be found in Appendix B.
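The kind of check described above can be sketched on a toy problem. The code below is not the ANSYS-based implementation: it assumes a 1D mesh of linear elements with dense matrices, takes \(\varvec{R}\) equal to the mass matrix \(\varvec{C^d}\), uses a hypothetical AIF, and reads the product \(\varvec{R}\odot \varvec{w}\) in Eq. 18 as a scaling of column j of \(\varvec{R}\) by \(w_j\). It propagates the recursion of Eq. 18 for K\(^{\rm{Trans}}\) alongside the forward solution and compares it with central finite differences.

```python
import numpy as np


def assemble(n, h, d_eff):
    """Consistent mass (C^d) and stiffness (K^d) matrices for 1D linear elements."""
    C = np.zeros((n, n))
    K = np.zeros((n, n))
    for e in range(n - 1):
        s = slice(e, e + 2)
        C[s, s] += h / 6.0 * np.array([[2.0, 1.0], [1.0, 2.0]])
        K[s, s] += d_eff / h * np.array([[1.0, -1.0], [-1.0, 1.0]])
    return C, K


def forward(k, v_e, v_p, c_p, dc_p, dt, C, K):
    """Backward-Euler solution (Eq. 16) and dc/dk^Trans from the recursion of Eq. 18."""
    R = C                                  # ASSUMPTION: generation matrix = mass matrix
    A = C + dt * K + dt * R * (k / v_e)    # Eq. 17; R * vec scales column j by vec[j]
    n = k.size
    c, S = np.zeros(n), np.zeros((n, n))   # S[i, j] = d c_i / d k_j
    curves, sens = [], []
    for cp1, dcp1 in zip(c_p[1:], dc_p[1:]):
        rhs = dt * R @ ((k / v_e) * (v_e + v_p) * cp1 + v_p * dcp1) + C @ c
        c = np.linalg.solve(A, rhs)
        g = (cp1 * (v_e + v_p) - c) / v_e  # bracket of Eq. 18, one value per node
        S = np.linalg.solve(A, dt * R * g + C @ S)
        curves.append(c)
        sens.append(S)
    return np.array(curves), np.array(sens)


def fd_jacobian(k, eps, *args):
    """Central finite differences: two forward runs per perturbed nodal k^Trans."""
    cols = []
    for j in range(k.size):
        dk = np.zeros_like(k)
        dk[j] = eps
        c_plus, _ = forward(k + dk, *args)
        c_minus, _ = forward(k - dk, *args)
        cols.append((c_plus - c_minus) / (2.0 * eps))
    return np.stack(cols, axis=-1)         # shape (n_time, n_nodes, n_nodes)


# Hypothetical AIF and a five-node model with a poorly perfused centre
t = np.arange(0.0, 61.0)
c_p = 0.05 * t * np.exp(-t / 60.0)
dc_p = 0.05 * np.exp(-t / 60.0) * (1.0 - t / 60.0)
k = np.array([0.2, 0.2, 0.05, 0.05, 0.2]) / 60.0   # s^-1
v_e = np.full(5, 0.5)
v_p = np.array([0.05, 0.05, 0.005, 0.005, 0.05])
C, K = assemble(5, 0.15, 2.0 * 2.6e-4 / (3.0 - 0.5))
args = (v_e, v_p, c_p, dc_p, 1.0, C, K)

_, S_semi = forward(k, *args)
S_fd = fd_jacobian(k, 1e-7, *args)
max_diff = np.abs(S_semi - S_fd).max()
```

In this sketch the semi-analytical sensitivities come from a single forward pass, whereas the finite-difference version needs two extra forward runs per nodal parameter.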

The optimization algorithm developed takes as inputs an initial set of values of the unknowns, as well as the curves to be fitted. The model proposed defines a set of three parameters (K\(^{\rm{Trans}}\), v\(_{e}\) and v\(_{p}\)) per node. This implies that the algorithm needs to fit simultaneously a number of unknowns equal to three times the number of nodes. Starting from the initial seed provided, a forward simulation of the D-ETM is executed, obtaining a first set of curves that are then used to initialize the cost function. Then, the numerical matrices needed for the Jacobian computation (\(\varvec{C^d}\), \(\varvec{R}\) and \(\varvec{K^d({v_e})}\)) are exported by running several scripts in ANSYS. Once the Jacobian is obtained, the minimization solver computes the updated set of parameters, finishing the first iteration. A schematic pseudocode of this process is presented in Algorithm 1.

The cost function (CF) defined for the method is a standard sum of squared differences (Eq. 21), which has proven to be effective in the simulated cases:

$$\begin{aligned} CF=\sum ^{N_{time}}_{i=1}(y_i-f_i(\varvec{x_j}))^2 \end{aligned}$$
(21)

Where \(N_{time}\) is the total number of time points, \(y_i\) is the value of the reference concentration at time point i and \(f_i(\varvec{x_j})\) is the value at time point i of the curve obtained from running a simulation of D-ETM with given \(\varvec{x_j}\) vector of parameters.

Other cost functions that only consider a fraction of the time points, like those proposed in [15], were tested. Nevertheless, they showed convergence problems, leading to the algorithm getting caught in local minima distant from the global optimum.

The solver chosen to perform the minimization was the Trust Region Reflective (TRF) algorithm [35]. Although the commonly used Levenberg-Marquardt (LM) algorithm [36] is also suitable for our purposes, the TRF algorithm (in its SciPy [37] implementation) handles sparse matrices. Given the nature of our problem, working with matrices in sparse format for the Jacobian computation dramatically reduced the use of system memory (around 98% reduction). We compared the performance of the LM algorithm against the TRF (converting the matrices to dense format for the LM method). Both methods accurately retrieved the reference values, and the number of iterations needed was similar in both cases.
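The use of a sparse Jacobian with the TRF solver can be sketched with `scipy.optimize.least_squares`. The example below is a toy stand-in, not the D-ETM fit: each "node" follows an independent saturation curve \(a(1-e^{-bt})\), an assumption made only to obtain a Jacobian with a known sparsity pattern. Bounds keep the parameters non-negative. Note that SciPy minimizes one half of the sum of squared residuals, which has the same minimizer as Eq. 21.

```python
import numpy as np
import scipy.sparse as sp
from scipy.optimize import least_squares

n, T = 50, 60                          # toy sizes: nodes and time points
t = np.linspace(1.0, 60.0, T)
rng = np.random.default_rng(0)
a_true = rng.uniform(0.5, 1.5, n)
b_true = rng.uniform(0.05, 0.2, n)
y = a_true[:, None] * (1.0 - np.exp(-np.outer(b_true, t)))  # reference curves


def residuals(x):
    """Stacked residuals f_i(x) - y_i for all nodes and time points."""
    a, b = x[:n], x[n:]
    return (a[:, None] * (1.0 - np.exp(-np.outer(b, t))) - y).ravel()


def jacobian(x):
    """Analytical Jacobian in sparse format: each residual depends on one (a, b) pair."""
    a, b = x[:n], x[n:]
    e = np.exp(-np.outer(b, t))                  # shape (n, T)
    d_a = (1.0 - e).ravel()                      # d r / d a_i
    d_b = (a[:, None] * t * e).ravel()           # d r / d b_i
    rows = np.arange(n * T)
    cols = np.repeat(np.arange(n), T)
    return sp.csr_matrix(
        (np.concatenate([d_a, d_b]),
         (np.concatenate([rows, rows]), np.concatenate([cols, cols + n]))),
        shape=(n * T, 2 * n),
    )


x0 = np.concatenate([np.full(n, 1.0), np.full(n, 0.1)])      # initial seed
lower = np.zeros(2 * n)
upper = np.concatenate([np.full(n, np.inf), np.ones(n)])
cost0 = 0.5 * np.sum(residuals(x0) ** 2)

result = least_squares(residuals, x0, jac=jacobian, bounds=(lower, upper), method="trf")
density = jacobian(x0).nnz / (n * T * 2 * n)   # fraction of stored entries
```

When the Jacobian callable returns a sparse matrix, `least_squares` selects an iterative trust-region subproblem solver by default, so the dense matrix is never formed. The `"lm"` method in SciPy accepts neither bounds nor sparse Jacobians, which is consistent with the dense conversion mentioned above.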

The proposed curve generation and curve fitting processes were executed in a cluster composed of 480 CPUs and 1088 GB of RAM. Specifically, the resources employed for the benchmark problem and the real tumour geometry were 16 CPUs and 16 GB of RAM, and 24 CPUs and 32 GB of RAM, respectively. The optimization time was 1.5 h on average for the benchmark problem (fitting 360 time points on 339 nodes) and around 10 h on average for the tumour geometry (fitting 360 time points on 955 nodes). The optimization algorithm was developed in Python and APDL, and the forward simulations were run in ANSYS 2019R2.

Algorithm 1 Schematic pseudocode of the optimization process

3 In silico simulations

To test the performance of the optimization algorithm proposed and to compare the D-ETM to the ETM, two different sets of simulated tissue concentration time courses have been generated from the forward FE model of the D-ETM and some sets of known parameters (K\(^{\rm{Trans}}\), v\(_{e}\), v\(_{p}\) and D).

The first case corresponds to a simple two-dimensional (2D) geometry, where the distribution of parameters generates a CA distribution that is completely dependent on diffusion. The second case, on the other hand, is based on a more complex geometry inspired by clinical data of tumours [38].

Although the proposed method can be applied to both 2D and three-dimensional (3D) cases, given the common resolution of imaging data (slice thickness is usually several times the pixel size), we focused on 2D geometries to keep the computational cost affordable. Due to this size difference, the effects of out-of-plane diffusion are expected to be negligible in comparison with the in-plane diffusion [16]. In terms of computation, it is dramatically faster (up to two orders of magnitude) to parallelize several 2D simulations rather than studying all of them as a 3D case.

The geometries corresponding to both cases were meshed using linear quadrilateral diffusion elements (PLANE238 in the ANSYS manuals) of size 0.15 mm, which generates a mesh of higher density than that of most clinical and experimental imaging data.

As in previous works [10], the diffusion coefficient D was given a value of \(2.6\times 10^{-4}\) mm\(^{2}\)/s, which is consistent with experimental measurements [14, 40]. Considering the formulation of the diffusive term of the model proposed here (Eq. 5), the diffusion coefficient was constant in both cases.

The AIF used in the simulations (Fig. 2) was inspired by the clinical data collected within the European research project PRIMAGE [38].

Fig. 2

AIF used for the simulations

3.1 Benchmark problem

This case is based on the one proposed by Pellerin et al. [15], which was also included in the work of Fluckiger et al. [16]. Given the importance of these contributions to the study of the CA diffusion process, we consider this case as a benchmark problem for PK models that incorporate diffusion.

Hence, we simulated a slice of a circular, radially symmetric, tissue. The definition of both the geometry and the distribution of parameters aims to generate an extreme case of a distribution of CA that is diffusion-limited. To do so, the circle is divided into two different regions: a highly perfused rim and a necrotic core.

Initially, the parameter values chosen were similar to those used by Fluckiger et al. [16]: \(K^{\rm{Trans}} = 0.2\) min\(^{-1}\) in the rim and 0.05 min\(^{-1}\) in the core; a constant value of v\(_{e}\) equal to 0.5 in the whole model; and, finally, \(v_p=0.05\) in the rim and 0.005 in the core.

When running the optimization algorithm on this case, we observed that, although the cost function was reduced to values close to zero, the returned parameters differed from the true ones. Upon closer examination, we noticed the existence of several local minima close to the global optimum. Figure 3 shows two nodes that share the same true parameter values and, consequently, the same generated CA curves. Although each fitted curve is almost identical to the reference one, the two sets of fitted parameters differ from each other and from the true values. Some of the retrieved parameters are also unphysical: \(K^{\rm{Trans}}\) and \(v_p\) are below zero in the second case of Fig. 3. Even when bounds were applied to keep the parameters within physical ranges (\(K^{\rm{Trans}}\) greater than zero, and \(v_e\) and \(v_p\) between zero and one), the optimization algorithm was still trapped in a local minimum.

Fig. 3 Two sets of curves (simulated and fitted) corresponding to nodes within the necrotic region of the first simulation of the benchmark problem. The true values for both nodes are K\(^{\rm{Trans}}\)=0.05 min\(^{-1}\), v\(_{e}\)=0.5 and v\(_{p}\)=0.005. The fitted values are, in the first case, K\(^{\rm{Trans}}\)=0.15 min\(^{-1}\), v\(_{e}\)=0.41 and v\(_{p}\)=0.01; and, in the second case, K\(^{\rm{Trans}}\)=-0.09 min\(^{-1}\), v\(_{e}\)=0.25 and v\(_{p}\)=-0.001. These results demonstrate the convergence of the inverse method to local minima

This meant that the success of the optimization process depended on the initial seed. The curves shown in Fig. 3 were obtained from an initial seed drawn at random between 0 and 0.4 min\(^{-1}\) for K\(^{\rm{Trans}}\), between 0.2 and 0.8 for v\(_{e}\), and between 0 and 0.1 for v\(_{p}\). In this case, the optimization method retrieved accurate results only if the initial seed was very close to the true values.
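The seed ranges quoted above can be reproduced with a few lines of NumPy. This is only a minimal sketch: the text does not state which random distribution was used, so a uniform distribution is assumed here, and the function name `draw_initial_seed`, the dictionary keys and the fixed random seed are hypothetical. The node count of 339 is the one reported for the 0.15 mm benchmark mesh.

```python
import numpy as np

# Seed ranges quoted in the text (K^Trans in min^-1; v_e and v_p dimensionless).
SEED_RANGES = {"k_trans": (0.0, 0.4), "v_e": (0.2, 0.8), "v_p": (0.0, 0.1)}


def draw_initial_seed(n_nodes, rng):
    """Draw one independent starting value per node for each parameter.

    Assumption: values are uniformly distributed within each range.
    """
    return {
        name: rng.uniform(low, high, size=n_nodes)
        for name, (low, high) in SEED_RANGES.items()
    }


rng = np.random.default_rng(0)  # fixed only to make the sketch reproducible
seed = draw_initial_seed(n_nodes=339, rng=rng)
```

Drawing a fresh seed per repetition is one simple way to probe how strongly the fitted parameters depend on the starting point.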

We attribute this problem to the numerical instability produced by the radial symmetry, in both geometry and parameters, which produces sets of identical curves at nodes with similar values. To test this hypothesis, we created two additional simulations. In the first one, the parameters were kept constant within each region, while the geometry was modified to suppress the axial symmetry. The results of this first simulation are detailed in Appendix C. In the second simulation, the geometry was preserved, while the constant parameters were replaced by random distributions of values within a range. Specifically, K\(^{\rm{Trans}}\) was assigned values between 0.15 min\(^{-1}\) and 0.25 min\(^{-1}\) in the rim and between 0 min\(^{-1}\) and 0.05 min\(^{-1}\) in the core. v\(_{p}\), on the other hand, was given values between 0.03 and 0.08 in the rim, and between 0 and 0.01 in the core. v\(_{e}\) maintained its original value. The initial seed was the same as in the previous simulation.
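As an illustration of the second simulation, the following sketch assigns region-dependent random reference values to a set of nodes. It assumes uniform sampling within the quoted ranges. The node radii, the core radius of 0.6 mm and the helper name `sample_region_maps` are hypothetical and are not taken from the paper.

```python
import numpy as np


def sample_region_maps(is_rim, rng):
    """Return (k_trans, v_e, v_p) reference arrays for rim and core nodes.

    Assumption: values are drawn uniformly within the ranges in the text.
    """
    n = is_rim.size
    k_trans = np.where(
        is_rim, rng.uniform(0.15, 0.25, n), rng.uniform(0.0, 0.05, n)
    )
    v_p = np.where(
        is_rim, rng.uniform(0.03, 0.08, n), rng.uniform(0.0, 0.01, n)
    )
    v_e = np.full(n, 0.5)  # v_e keeps its original constant value
    return k_trans, v_e, v_p


rng = np.random.default_rng(1)
# Hypothetical node layout: radial position of each node in mm.
radius = rng.uniform(0.0, 1.0, 339)
is_rim = radius > 0.6  # hypothetical core radius, for illustration only
k_trans, v_e, v_p = sample_region_maps(is_rim, rng)
```

Because every node now carries a slightly different reference value, no two nodes produce exactly the same CA curve, which is the property the second simulation relies on.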

Fig. 4 Second simulation of the benchmark problem. Comparison between the reference values and the parameters returned by the D-ETM and the ETM. Results show that the D-ETM accurately retrieves the distribution of K\(^{\rm{Trans}}\) and v\(_{p}\), while the ETM shows an averaging pattern, especially for K\(^{\rm{Trans}}\). Although not as accurate as the other parameters, the v\(_{e}\) map returned by the D-ETM is within the physiological range [0,1], while the distribution obtained from the ETM reaches values close to infinity in the necrotic core

Results obtained with the ETM on this simulation are consistent with previous works [15, 16]. The K\(^{\rm{Trans}}\) map, mainly, and the v\(_{p}\) map, to a lesser extent, show an averaging pattern with respect to the reference values. In addition, the v\(_{e}\) map contains unphysical values (v\(_{e}\)>1 and v\(_{e}\)<0) in the necrotic region. Even if K\(^{\rm{Trans}}\) and v\(_{p}\) in the necrotic region are not exactly zero, the enhancement curves of these nodes depend completely on diffusion and therefore cannot be accurately fitted with the Tofts formulation. The vascularized regions adjacent to the necrotic ones are also affected, since CA diffuses from the former to the latter. This diffusive process results in a lower CA concentration, which is then fitted with K\(^{\rm{Trans}}\) values below the true ones. v\(_{p}\) is less affected than K\(^{\rm{Trans}}\), resulting in a better fit for this parameter. The distributions obtained with the D-ETM are very accurate for K\(^{\rm{Trans}}\) and v\(_{p}\), while the fitted v\(_{e}\) map shows higher error, especially in the necrotic region. This error is related to the influence of this variable on the global solution. Eq. 19 shows that the value of this derivative depends on K\(^{\rm{Trans}}\) and v\(_{p}\). Therefore, in necrotic regions, where these parameters are close to zero, the derivative depends only on the diffusive term. Due to the definition of this term (Eq. 6), the influence of v\(_{e}\) is limited. Thus, because of its small effect on the global solution, the optimization algorithm cannot retrieve accurate results for v\(_{e}\), especially in necrotic regions.
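To make the limitation of the Tofts formulation in necrotic nodes concrete, the sketch below evaluates the standard extended Tofts expression, \(C_t(t) = v_p C_p(t) + K^{\rm{Trans}} \int_0^t C_p(\tau)\, e^{-(K^{\rm{Trans}}/v_e)(t-\tau)}\, d\tau\), on a uniform time grid. It is not the authors' implementation: the AIF shape, the time grid and the function name `etm_concentration` are hypothetical, and the integral is approximated with a simple rectangle rule.

```python
import numpy as np


def etm_concentration(t_min, c_p, k_trans, v_e, v_p):
    """Extended Tofts tissue concentration on a uniform grid (time in minutes)."""
    dt = t_min[1] - t_min[0]
    k_ep = k_trans / v_e
    kernel = np.exp(-k_ep * t_min)
    # Rectangle-rule approximation of the convolution integral.
    conv = np.convolve(c_p, kernel)[: t_min.size] * dt
    return v_p * c_p + k_trans * conv


t = np.linspace(0.0, 6.0, 361)       # illustrative 6 min acquisition
aif = 5.0 * t * np.exp(-t / 0.5)     # hypothetical AIF, not the one in Fig. 2

rim_curve = etm_concentration(t, aif, k_trans=0.2, v_e=0.5, v_p=0.05)
# With no exchange and no plasma volume, the ETM predicts no enhancement at all.
dead_core_curve = etm_concentration(t, aif, k_trans=0.0, v_e=0.5, v_p=0.0)
```

Any enhancement that diffusion produces in such a node must therefore be absorbed by spurious parameter values when the ETM is fitted voxel by voxel.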

A quantitative comparison of the two models is presented in Table 1. Since both the true and the initial values of the simulations are randomly generated, ten cases with different true and initial sets of parameters were tested to validate the robustness of the method. Because the reference values for K\(^{\rm{Trans}}\) and v\(_{p}\) reached zero in the necrotic region, relative error metrics could not be used. The absolute error, measured as the absolute difference between the fitted and the reference value, was therefore selected to compare the performance of both models.

Table 1 Comparison of error metrics between the D-ETM and the ETM for the benchmark problem

Different error thresholds were defined to compare the performance of both models. The threshold for K\(^{\rm{Trans}}\) was set to 0.01 min\(^{-1}\), which is the maximum precision of the DP model [15]. Similarly, the threshold for v\(_{p}\) was set to 0.001, a value that can be considered sufficient precision for this kind of model. Given the limited influence of v\(_{e}\) on the global solution, its threshold was set to 0.15.
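The two metrics described above (mean absolute error and fraction of nodes below a threshold) can be computed as in the following sketch. The function name `absolute_error_summary` and the three-node example arrays are hypothetical; only the thresholds are taken from the text.

```python
import numpy as np

# Thresholds quoted in the text (K^Trans in min^-1; v_e and v_p dimensionless).
THRESHOLDS = {"k_trans": 0.01, "v_e": 0.15, "v_p": 0.001}


def absolute_error_summary(fitted, reference, threshold):
    """Return (mean absolute error, fraction of nodes with error < threshold)."""
    abs_err = np.abs(np.asarray(fitted, float) - np.asarray(reference, float))
    return abs_err.mean(), np.mean(abs_err < threshold)


# Hypothetical three-node K^Trans example, for illustration only.
fitted_k = np.array([0.050, 0.200, 0.100])
reference_k = np.array([0.055, 0.180, 0.100])
mae_k, frac_k = absolute_error_summary(
    fitted_k, reference_k, THRESHOLDS["k_trans"]
)
```

The fraction-below-threshold metric is insensitive to a few very large errors, which makes it useful when some fitted values are far outside the physiological range.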

The D-ETM clearly outperforms the ETM, especially on K\(^{\rm{Trans}}\) maps. While only 19% of the nodes fitted by the ETM are within 0.01 min\(^{-1}\) of the reference value, 72% of those retrieved by the D-ETM fall into that range. Moreover, the K\(^{\rm{Trans}}\) mean absolute error of the ETM is around four times higher than that of the D-ETM.

Due to the great effect of the unphysiological values of v\(_{e}\) returned by the ETM on the absolute error, this metric will not be used to compare the models. The fraction of nodes whose error is below 0.15, however, is not affected by these values. While only 55% of the values obtained using the ETM are within the error range, almost 78% of the ones retrieved by the D-ETM fall into this range. Despite experiencing difficulties retrieving the correct v\(_{e}\) maps, the D-ETM shows a great improvement with respect to the ETM. Besides, all of the v\(_{e}\) values fitted using the D-ETM were within the physiological range [0,1].

The mean absolute error of D-ETM v\(_{p}\) maps is around half the error obtained by the ETM. Nevertheless, this parameter does not seem to be as affected by diffusion as the other two.

Although the error retrieved by suppressing the homogeneity in the distributions of parameters (Fig. 4) was lower than the error obtained by removing the axial symmetry in geometry (Appendix C), these results demonstrate that the combination of both factors was causing the convergence of the algorithm to local minima.

3.1.1 Analysis of the mesh effect

The influence of mesh size on the convergence of both the forward and the inverse models was tested using the benchmark problem geometry. Two different meshes were generated: the first one discretized the geometry using elements of size 0.15 mm, while the element size of the second one was half that value. The numbers of nodes in the two simulations were 339 and 777, respectively. The same type of element (linear quadrilateral diffusion elements) was used in both models. In both cases, the sets of true and initial values were random distributions within the ranges defined previously. As in the benchmark problem, ten different simulations were generated for the finer mesh to eliminate the influence of randomness on the result.

Table 2 Comparison of error metrics between two different meshes

The results presented in Table 2 show that refining the mesh reduces the error. Nevertheless, the slight increase in accuracy does not justify the greater computational cost associated with the finer mesh: the finer-mesh model needed twice the time of the original model to fit the curves. Therefore, the element size selected for the simulations was 0.15 mm, a trade-off between accuracy and computational cost.

3.2 Real tumour geometry

A second set of simulations was generated to test the performance of the D-ETM in real geometries and vascular properties with a heterogeneous distribution. The geometry corresponded to a tumour slice of around 20 mm\(^{2}\), while the vascular properties were inspired by clinical data [38].

The distribution of vascular properties is divided into three zones: a highly perfused region, an intermediate region and a necrotic region. K\(^{\rm{Trans}}\) was assigned random values between 0.3 and 0.4 min\(^{-1}\), between 0.1 and 0.25 min\(^{-1}\) and between 0 and 0.05 min\(^{-1}\) for the three respective regions. v\(_{e}\) values were randomly selected between 0.75 and 0.85 for the necrotic region and between 0.4 and 0.6 for the other two. Finally, v\(_{p}\) was randomly distributed between 0.03 and 0.08 in the highly perfused nodes, between 0.03 and 0.05 in the intermediate region and between 0.005 and 0.01 in the necrotic one.

The convergence of the inverse method was tested by providing random distributions of parameters as initial values. The ranges for K\(^{\rm{Trans}}\), v\(_{e}\) and v\(_{p}\) were the same as in the previous case: [0,0.4] min\(^{-1}\), [0.2,0.8] and [0,0.1], respectively. Following the procedure described in the previous case, ten simulations with different reference and initial values were generated.

Fig. 5 Real tumour geometry. Reference values and results of the D-ETM and the ETM for each of the parameters. Results show that the D-ETM accurately captures the heterogeneity of the distribution of parameters, while the ETM tends to average the values. The maps of v\(_{p}\) are the least sensitive to this phenomenon. The v\(_{e}\) distribution obtained from the D-ETM shows a more accurate fit than the one obtained in the benchmark case

The heterogeneous reference maps generated clearly expose the limitations of the ETM in accurately capturing the K\(^{\rm{Trans}}\) distributions (Fig. 5). While the D-ETM provides an almost exact distribution for K\(^{\rm{Trans}}\) and v\(_{p}\) and an acceptable v\(_{e}\) map, the ETM tends to homogenize the K\(^{\rm{Trans}}\), failing to depict the highly perfused regions, as well as the necrotic ones. This effect is particularly visible in those zones where two of these regions are adjacent.

Table 3 Error metrics comparison between the D-ETM and the ETM for the second case

The error metric employed in this case was the absolute relative difference (ARD), calculated between true and fitted parameters. Table 3 shows the metrics corresponding to the average of ten different simulations of the real geometry case. In this case, where necrotic zones were not as large as in the previous one, the ETM accurately retrieves the v\(_{e}\) map (mean ARD is 11% and 86% of nodes have an ARD below 20%), depicting the increase in v\(_{e}\) corresponding to these necrotic regions (Fig. 5). The D-ETM, on the other hand, gives a good fit for the v\(_{e}\) map (mean ARD is 16% and the ARD is below 20% in 77% of the nodes), although it is not as accurate as the ETM in fitting the v\(_{e}\) values in necrotic nodes (Fig. 5). The K\(^{\rm{Trans}}\) and v\(_{p}\) distributions obtained from the D-ETM are almost an exact representation of the reference maps (Fig. 5), as evidenced by the metrics obtained (Table 3). The mean ARD is 16% and 9% for K\(^{\rm{Trans}}\) and v\(_{p}\), respectively. The ETM, in contrast, shows higher error for these two parameters (mean ARD of 148% and 195% for K\(^{\rm{Trans}}\) and v\(_{p}\), respectively, and for both maps only 40% of nodes have an ARD below 20%).
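The ARD statistics reported here can be computed as sketched below, assuming the usual definition \(\mathrm{ARD} = |p_{\rm fit} - p_{\rm true}| / |p_{\rm true}|\) (the text does not spell the formula out). The function name `ard_summary` and the example arrays are hypothetical.

```python
import numpy as np


def ard_summary(fitted, true, limit=0.20):
    """Return (mean ARD, fraction of nodes with ARD < limit).

    Assumes ARD = |fitted - true| / |true|; all true values must be non-zero.
    """
    fitted = np.asarray(fitted, dtype=float)
    true = np.asarray(true, dtype=float)
    ard = np.abs(fitted - true) / np.abs(true)
    return ard.mean(), np.mean(ard < limit)


# Hypothetical three-node example, for illustration only.
true_vals = np.array([1.0, 2.0, 4.0])
fitted_vals = np.array([1.1, 3.0, 4.0])
mean_ard, frac_below_20 = ard_summary(fitted_vals, true_vals)
```

Unlike the absolute error used for the benchmark problem, this metric is only defined where the reference value is non-zero.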

In their work, Pellerin et al. [15] tested the performance of their model in a simulated case similar to the real geometry case presented here. The DP model obtained a mean ARD of 16% for K\(^{\rm{Trans}}\) and 17% for v\(_{e}\), with 73% of the K\(^{\rm{Trans}}\) values and 77% of the v\(_{e}\) values within 20% of the true values. The D-ETM obtained nearly identical values for v\(_{e}\) and the same mean ARD for K\(^{\rm{Trans}}\).

3.2.1 Influence of noise

To test the robustness of the D-ETM to noise, several simulations were conducted. Starting from a set of reference values similar to those generated in the previous case, experimental levels of noise were added to the generated curves. These levels were defined using a Gaussian distribution with a standard deviation (SD) equal to a fraction (1%, 2.5% and 5%) of the highest concentration reached in the curve.
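The noise model described above can be sketched as follows. Zero-mean noise is assumed, and the SD is taken per curve as a fraction of that curve's own peak concentration, which is one reading of "the highest concentration reached in the curve". The function name `add_gaussian_noise` and the synthetic curve are hypothetical.

```python
import numpy as np


def add_gaussian_noise(curves, fraction, rng):
    """Add zero-mean Gaussian noise with SD = fraction * peak of each curve.

    `curves` has shape (n_nodes, n_times); each row is one CA curve.
    """
    curves = np.asarray(curves, dtype=float)
    sd = fraction * curves.max(axis=-1, keepdims=True)
    return curves + sd * rng.standard_normal(curves.shape)


rng = np.random.default_rng(2)
t = np.linspace(0.0, 6.0, 360)
# Hypothetical noise-free curves for three nodes, for illustration only.
clean = np.outer([1.0, 0.5, 0.1], 1.0 - np.exp(-t))
noisy = {f: add_gaussian_noise(clean, f, rng) for f in (0.01, 0.025, 0.05)}
```

Scaling the SD by the peak of each curve means that low-enhancement nodes, such as necrotic ones, receive proportionally less absolute noise than highly perfused ones.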

Fig. 6 Influence of noise on the accuracy of the D-ETM (left) and the ETM (right). Results show that noise has greater influence on the D-ETM, particularly on \(v_p\). Although the ETM seems to be unaffected by noise, for noise values below or equal to 2.5% the D-ETM still performs better. Even for the maximum level of noise (5%), the error in the D-ETM solution is similar to the one in the ETM

The results of these simulations, presented in Fig. 6, show the effect of noise on both the D-ETM and the ETM. Despite showing higher error for noise-free simulations, the parameters returned by the ETM seem to be unaffected by noise, since the histograms show almost no difference in the distribution of ARD between the cases with different levels of noise. The added noise has a greater influence on the parameters obtained by fitting the D-ETM. Noise has the greatest effect on v\(_{p}\) and the smallest effect on v\(_{e}\). K\(^{\rm{Trans}}\), for its part, is only slightly disturbed when the noise level is low (1% and 2.5%), with more than 70% of its values within 20% of the true values. When the added noise reaches the maximum level (5%), this percentage drops dramatically to 30%.

These results are consistent with those obtained by Pellerin et al. [15]. In their work, the authors attributed the different effect of noise on each parameter to the part of the CA curve that the parameter influences. Thus, v\(_{e}\) regulates the last part of the curve, where the concentration is close to the maximum, and is therefore less sensitive to noise. K\(^{\rm{Trans}}\), and even more so v\(_{p}\), affect the initial part of the curve. Consequently, the influence of added noise is greater on these parameters.

The great number of variables to be fitted simultaneously makes this model, as well as the model developed by Pellerin et al. [15], more sensitive to noise. Therefore, when fitting the model to experimental data it must be ensured that the signal-to-noise ratio (SNR) of the data meets the model requirements.

4 Discussion

The use of DCE-MRI sequences to assess the efficacy of antiangiogenic therapies in tumours [1,2,3] increases the need for PK models that retrieve vascular properties as accurately as possible. Several authors have pointed out the limitations of the widely used standard and extended Tofts models when the CA reaches the region of interest (ROI) through passive delivery [12,13,14,15,16,17,18]. These models return an inaccurate estimation of K\(^{\rm{Trans}}\) as well as unphysical values for v\(_{e}\) in those regions within the ROI where the active delivery of CA is low or non-existent (necrotic zones). Different models and methods have been developed to assess the effect of diffusion and to develop PK models that account for this process [15,16,17,18]. These works exposed the mentioned limitations and proposed different approaches to include the effects of diffusion in the STM and the ETM. Pellerin et al. [15] were the first to include a diffusive term in the STM. The proposed DP model showed an improvement in parameter accuracy with respect to the original STM in those regions where passive delivery of CA was significant. One of its major limitations was the high computational cost associated with the simulated annealing algorithm, since the model had to fit all voxels simultaneously. Fluckiger et al. [16] added some hypotheses to the DP model (homogeneity in the diffusive coefficient and v\(_{e}\) between neighbouring voxels) that allowed them to compute the effects of diffusion while fitting each voxel separately, thereby substantially reducing the computational cost. However, this homogeneity hypothesis may not be suitable for some kinds of tumours (such as neuroblastoma, a type of cancer characterized by its high heterogeneity [41]). Cantrell et al. [17] based their formulation of the diffusive term on the one proposed by Pellerin et al. [15] and proposed a diffusion-compensated Tofts model (DC-Tofts). Their work consisted of obtaining the ETM parameters from CA curves, then computing the diffusion contribution from these fitted parameters to generate a new set of curves and, finally, fitting these new curves again with the ETM. Although this model was suitable for intracranial aneurysms, the approach may not be appropriate for other types of lesions where necrotic zones are present, since the initial ETM fit would retrieve unphysical values of v\(_{e}\) that would condition the subsequent diffusion computation. All these works follow the same formulation of the diffusive term (revised thoroughly in [15]). This formulation assumes that CA can diffuse freely through each of the voxel faces (that is, no obstacles, such as cells, are present at those faces). Depending on the cellularity level and the voxel size, this assumption may not be valid. The latest contribution, by Sinno et al. [18], proposes a modified formulation of the diffusive term that, like the formulation presented in this work, avoids this simplification. In their work, the diffusivity coefficient is considered unknown but constant throughout the domain. As stated previously, this hypothesis may not be valid for some tissues, especially tumour tissues, which are characterized by their heterogeneity.

The formulation of the diffusion process proposed here avoids this assumption by embracing the concept of effective diffusivity [22]. This concept implies that the diffusion of agents within biological tissues is similar to the diffusion of an agent in a porous medium [26,27,28]. Apart from providing a more accurate description of the diffusive process, this hypothesis links the effective diffusivity to the volume fraction of the extravascular-extracellular space (v\(_{e}\)), avoiding the introduction of a new parameter (D) that needs to be fitted or extracted from data or literature, as is the case in previous works [14,15,16,17,18].

The performance of the D-ETM was tested using two different in silico cases. The first one, a benchmark problem derived from the literature [15, 16], exposed the limitations of the ETM in those regions where passive delivery of CA is the main transport mechanism. Results showed the improved accuracy of the model parameters returned by the D-ETM, which were very close to the reference values (mean absolute errors for K\(^{\rm{Trans}}\), v\(_{e}\) and v\(_{p}\) were 0.008 min\(^{-1}\), 0.095 and 0.0004, respectively). A second case, inspired by real tumour geometries and parameters, was then analysed. Again, the ETM performed poorly, returning an incorrect estimation of K\(^{\rm{Trans}}\). In both simulations, the K\(^{\rm{Trans}}\) distribution obtained by the ETM appeared averaged with respect to the reference maps, adding significant error to the parameters. D-ETM parameters were almost identical to the true values, accurately depicting the heterogeneous distribution of values. In addition, Fig. 6 shows that the error obtained by the D-ETM follows a Pareto-like distribution for noise levels below 2.5%, which means that the smallest errors are obtained for most of the nodes, while the ETM does not follow this pattern (except for \(v_e\)), even for noise-free simulations.

The results obtained from the D-ETM in both cases show lower accuracy in v\(_{e}\) maps with respect to the other variables, especially in necrotic regions, where K\(^{\rm{Trans}}\) values are very close to zero. This is due to the modest influence of v\(_{e}\) on the global solution in these regions. To solve this issue, future work could develop an alternative expression for the effective diffusivity in which v\(_{e}\) has more influence. Nevertheless, the accuracy of the v\(_{e}\) maps obtained from the D-ETM is acceptable, with all of their values within the physiological range. Moreover, the v\(_{e}\) maps retrieved by the D-ETM on the real tumour geometry show an accuracy similar to that of the DP model [15].

Due to the additional variable included in the ETM with respect to the STM (the volume fraction of blood plasma, v\(_{p}\)), the convergence of the optimization algorithm can be affected by the presence of local minima within the solution space. This means that the solution is dependent on the initial seed. Although our model seems to overcome this issue on the simulated cases, the values obtained from the ETM could be used as an initial seed in those cases where the minimization convergence is severely affected by this issue.

This new formulation is limited by the computational cost associated with the optimization algorithm. Although the method benefits from the computational efficiency of the FEM, the optimization time for the two simulated cases was 1.5 h and 10 h on average (fitting 360 time points on 339 nodes and 955 nodes, respectively). Despite being faster than the DP model (average of 72 h to fit 484 voxels), the execution time cannot be compared to that of the ETM, which took an average of 5 s and 12 s, respectively. Future work could migrate the code from Python to more efficient languages, such as C. One of the bottlenecks of this method is the computation of the Jacobian matrix, which involves operations on large sparse matrices. This computation could be parallelized to reduce the execution time.

The described D-ETM is the first diffusion-corrected PK model to be implemented using the FE method. It proposes a new formulation for the diffusive term, based on the concept of effective diffusivity, that simplifies the computation of this term and avoids the inclusion of additional variables in the model. The semi-analytical method formulated to compute the Jacobian matrix opens the door to further gradient-based optimization methods for FE-based PK models. Although previous works [10] have developed an FE implementation of the extended Tofts model, their objective was to expose the effect of intra-voxel CA diffusion on PK analyses. To the best of our knowledge, this is the first FE-based optimization algorithm for the ETM. The results obtained with this model are promising, since it accurately retrieves the reference values, outperforming the ETM. Future work should test this model on real clinical or experimental data.