Introduction

Mathematical modeling of immune processes is an essential part of research in immunology [15, 16, 22]. Despite the emergence of many new high-performance methods for the experimental analysis of immunity, the results of mathematical modeling remain relevant, in particular in clinical practice, where they help to work out optimal, individually customized strategies for treating pathological processes (bacterial/viral infections or tumor growth).

It is well known that physiological parameters vary between individual patients; thus, personalized treatment approaches require the development of robust and efficient parameter estimation methods to assimilate individual data into mathematical models [6]. Parameter identification problems in mathematical models are often non-linear and ill-posed, and therefore challenging to solve numerically [45, 46]. The computational algorithms for parameter identification that we work on are based on an adaptive time-mesh refinement [13] for coefficient inverse problems (CIPs). The main idea of our work is the adaptation of the space-mesh refinement results developed in [13] for the solution of hyperbolic CIPs to the parameter identification problem (PIP) of reconstructing parameters on a time mesh. More precisely, we first determine a candidate parameter on a known initial (coarse) time partition. Then we refine the time mesh locally, only in those time intervals where the a posteriori error indicator is large, and compute new time-dependent control functions on the new time mesh until the error in the computed residual is reduced to the desired accuracy. The adaptive finite element method has been shown to significantly improve the reconstruction of parameters when solving coefficient inverse problems for hyperbolic PDEs [5, 7, 8, 12, 13].

We note that the main goal of our work is to present a mathematical framework of a posteriori error estimation for the solution of PIPs and to show the usefulness of time-adaptive error control for the determination of parameters in a PIP, which we demonstrate on the example of a model of HIV infection. More than 35 years have passed since the discovery of the etiological agent of AIDS, the human immunodeficiency virus (HIV); nevertheless, the problems of the spread of HIV infection, its treatment, and the quality of life of people living with HIV remain relevant: the number of newly infected people does not decrease. The advent of highly active antiretroviral therapy (HAART) in 1996-1997 led to a significant improvement in the quality of life of patients and caused a clear decrease in AIDS-related diseases and mortality. HAART provides treatment protocols using combinations (two or more) of antiviral drugs which affect different stages of viral replication and prevent HIV from entering the host cell.

We take the simplified model of HIV infection proposed in [42] as an illustrative example to show the effectiveness and robustness of identifying the time-dependent drug-efficacy parameter using a local time-mesh refinement algorithm. This work is a continuation of the works by the authors [9, 10], where the time-adaptive finite element method for parameter identification problems was introduced. Compared to other optimal control algorithms for the solution of PIPs, see, for example, the works [1, 2, 26] and references therein, our time-adaptive algorithms are based on a rigorous finite element a posteriori error analysis for the error in the Tikhonov functional or for the error in the reconstructed parameter. The same approach can be applied to the solution of any other PIP and, in particular, to more complicated models of HIV infection which involve more unknown functions and parameters [3, 27, 34,35,36, 40, 41, 48]. However, these models are much more complicated than the model of [42] studied here, and we consider them a topic for future research.

As mentioned above, in [10] the optimal control problem of reconstructing drug efficacy in the model of HIV infection was studied in the case when time measurements of all functions of the model ODE system were used, which is not a realistic setting. Moreover, numerical simulations were not presented in [10]. In the current work we fill this gap and study the more realistic case when, instead of measuring all four functions of the ODE system, we measure only the virus population function. A new a posteriori error estimate between the regularized and computed drug efficacy is derived. Based on this estimate, a time-adaptive algorithm is formulated and numerically tested on the optimal determination of drug efficacy in the time domain from noisy measurements of the virus population function. Extended numerical studies for different noise levels in the virus population function and for different initial observation times of this function are presented in the preprint version of this work [11]. We note that, in contrast to the current work, the proofs of Theorems 1, 2, and 3 are not presented in [11], nor are the study of the stability of the solution of the forward problem and the study of the ill-posedness of the PIP.

The time-adaptive method proposed in this paper can eventually be used by clinicians to determine the drug response of each treated individual. Exact knowledge of the personal drug efficacy can aid in choosing the most suitable drug as well as the optimal dose for an individual, in the long run resulting in a personalized treatment with maximum efficacy and minimum adverse drug reactions.

The outline of the paper is as follows. The biological description of the mathematical model is given in “The Mathematical Model and Its Biological Description”. “Inverse Problems and Ill-Posedness” is based on material from the Master’s thesis [25] and discusses the ill-posedness of parameter identification problems and Tikhonov’s regularization method for their solution. In “The Parameter Identification Problem” the parameter identification problem is formulated. The optimization method used to solve the parameter identification problem is presented in “Optimization Method”. The finite element method is formulated in “Finite Element Discretization”, and a posteriori error estimates are derived in “A Posteriori Error Estimates”. An adaptive algorithm for the solution of the PIP is presented in “Algorithms for Solution of PIP”. Finally, in “Numerical Results”, numerical examples illustrate the effectiveness of the proposed time-adaptive algorithm.

The Mathematical Model and Its Biological Description

The main cellular targets of HIV are the immune system cells that have CD4 receptors on their surface, called CD4+ T-cells. HIV differs from other viruses by a high mutation level: the mutation rate is \(10^{-5}\)–\(10^{-4}\) per nucleotide per replication cycle [30]. High genetic variability allows HIV to skillfully avoid humoral and cellular defense factors and the effects of drugs. The constant presence of large reservoirs of latently infected cells is one of the important features of the pathogenesis of HIV infection. Another paradoxical feature of HIV is that activation of the immune system does not lead to a suppression of virus multiplication, but rather to activation of latently infected cells, which start to produce new viral particles. These factors are the major obstacles to antiviral therapy and the development of efficient vaccines [18, 19]. According to the latest data on HIV (UNAIDS, 2020) [47], there are currently 38 million people living with HIV globally, and 1.7 million people became newly infected with HIV in 2019.

The HIV life cycle starts with the attachment of the viral envelope protein gp120 to the cell surface via interaction with the CD4 receptor. In living organisms, genetic information generally flows from its storage in DNA through messenger RNA (mRNA) to protein synthesis in the ribosomes; the process of converting genetic information from DNA to mRNA is called transcription [25]. In the case of retroviruses, such as HIV, the genetic information is encoded in the form of RNA. Having fused with the cell membrane, HIV releases its genetic material (viral RNA) and enzymes into the CD4+ T-cell. Here the viral RNA is reversely transcribed into HIV DNA, which is compatible with the genetic material of the host cell [reverse transcription (RT)]. To perform the reverse transcription of RNA into DNA, HIV carries its own enzyme called reverse transcriptase. The viral DNA is transported to the cell’s nucleus and incorporated into the DNA of the host cell (integration); this process is made possible by the enzyme integrase. The individual components of HIV are then produced within the CD4+ T-cell and assembled together to make new HIV viruses; this process depends on the enzyme protease. The newly matured HIV particles are released from the CD4+ T-cell and are ready to infect other CD4+ T-cells and begin the replication process all over again. The HIV life cycle described above is illustrated in Fig. 1. Antiretroviral drugs blocking the enzyme reverse transcriptase (called Reverse Transcriptase Inhibitors) are able to prevent the production of new viruses [18, 37].

Fig. 1

The HIV life cycle

Our basic mathematical model in this work is the model proposed in [42], which describes the effect of a Reverse Transcriptase Inhibitor (RTI) on the dynamics of HIV infection. In this model the infected class of CD4+ T-cells is subdivided into two subclasses: the pre-RT class and the post-RT class. The pre-RT class consists of the infected CD4+ T-cells in which reverse transcription is not completed, and the post-RT class consists of those infected CD4+ T-cells in which reverse transcription is completed, so that they are capable of producing virus.

Throughout the paper we denote by \(\Omega _{T}= [0,T]\) the time domain for \(T>0\), where T is the final observation time. The mathematical model which we use in this note is:

$$\begin{aligned} {\left\{ \begin{array}{ll} {\displaystyle \frac{du_1}{dt}} = {f_1(u(t), \eta (t))}=s - k u_1(t) u_4(t) - \mu u_1(t) +(\eta (t) \alpha + b) u_2(t),\\ {\displaystyle \frac{du_2}{dt}} = {f_2(u(t), \eta (t))}=k u_1(t) u_4(t) - (\mu _1+ \alpha + b )u_2(t),\\ {\displaystyle \frac{du_3}{dt}} = {f_3(u(t), \eta (t))}=(1-\eta (t))\alpha u_2(t) - \delta u_3(t),\\ {\displaystyle \frac{du_4}{dt}} = {f_4(u(t), \eta (t))}=N \delta u_3(t) - c u_4(t), \end{array}\right. } \end{aligned}$$
(1)

with initial conditions

$$\begin{aligned} u_1(0)&=u^0_1=300\ {\text {mm}}^{-3},\quad u_2(0)=u_2^0=10\ {\text {mm}}^{-3}, \nonumber \\ u_3(0)&=u^0_3=10\ {\text {mm}}^{-3}, \quad u_4(0)=u^0_4=10\ {\text {mm}}^{-3}. \end{aligned}$$
(2)

In system (1) the functions are defined as follows:

  • \(u_1(t)\) – uninfected target cells population

  • \(u_2(t)\) – infected target cells before Reverse Transcription (pre-RT class)

  • \(u_3(t)\) – infected target cells in which Reverse Transcription is completed, which are capable of producing virus (post-RT class)

  • \(u_4(t)\) – virus population

The parameters used in system (1) are described in Table 1 and are taken from the specialized literature [32, 33]. The initial data (2) are chosen to be compatible with the two steady states which we discuss in “Existence and Lyapunov Stability of Steady State”. Figure 2 illustrates the effect of a Reverse Transcriptase Inhibitor (RTI) on the dynamics of HIV infection for the mathematical model (1).

Fig. 2

The effect of reverse transcriptase inhibitor (RTI) on the dynamics of HIV infection described by the model problem (1)
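As an illustration of the forward problem, system (1) with initial conditions (2) can be integrated by a standard fourth-order Runge–Kutta scheme. The parameter values below are an assumption: Table 1 is not reproduced in this excerpt, so we use values from [42] that are consistent with the quantities reported later in the text (\(\eta _{crit} \approx 0.88375\) and the bounds in (11)).

```python
import numpy as np

# Assumed parameter values (Table 1 is not reproduced here); they are taken
# from [42] and reproduce eta_crit ≈ 0.88375 and the bounds in (11).
s, mu, k, mu1 = 10.0, 0.01, 2.4e-5, 0.015
alpha, b, delta, N, c = 0.4, 0.05, 0.26, 1000.0, 2.4

def f(u, eta):
    """Right-hand side f(u, eta) of system (1)."""
    u1, u2, u3, u4 = u
    return np.array([
        s - k*u1*u4 - mu*u1 + (eta*alpha + b)*u2,
        k*u1*u4 - (mu1 + alpha + b)*u2,
        (1.0 - eta)*alpha*u2 - delta*u3,
        N*delta*u3 - c*u4,
    ])

def solve(eta_fun, u0, T, n_steps=20000):
    """Integrate (3) on [0, T] with the classical RK4 scheme."""
    dt = T / n_steps
    u, t = np.array(u0, dtype=float), 0.0
    for _ in range(n_steps):
        k1 = f(u, eta_fun(t))
        k2 = f(u + 0.5*dt*k1, eta_fun(t + 0.5*dt))
        k3 = f(u + 0.5*dt*k2, eta_fun(t + 0.5*dt))
        k4 = f(u + dt*k3, eta_fun(t + dt))
        u = u + (dt/6.0)*(k1 + 2.0*k2 + 2.0*k3 + k4)
        t += dt
    return u

u0 = [300.0, 10.0, 10.0, 10.0]              # initial conditions (2)
u_final = solve(lambda t: 0.4, u0, T=100.0)  # constant efficacy eta = 0.4
```

A time-dependent efficacy is handled by passing a different `eta_fun`, so the same routine serves as the forward solver inside the parameter identification loop.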

The system (1) can be presented in the following compact form:

$$\begin{aligned} {\left\{ \begin{array}{ll} \frac{du}{dt} &{}= f(u(t),\eta (t)) \quad t \in [0,T], \\ u(0) &{}= u^0, \end{array}\right. } \end{aligned}$$
(3)

where we have denoted all involved functions as

$$\begin{aligned} u&=u(t)=(u_1(t),u_2(t),u_3(t),u_4(t))^T, \nonumber \\ u^0&= (u_1(0),u_2(0),u_3(0),u_4(0))^T, \nonumber \\ \frac{du}{dt}&= \left( \frac{d u_1}{ d t}, \frac{d u_2}{ d t},\frac{d u_3}{ d t}, \frac{d u_4}{ d t} \right) ^T, \nonumber \\ f(u(t),\eta (t))&=(f_1,f_2,f_3,f_4)^T(u(t),\eta (t))\nonumber \\&= ( f_1(u_1,\ldots ,u_4, \eta (t)), \ldots ,f_4(u_1, \ldots , u_4, \eta (t)))^T. \end{aligned}$$
(4)

Here, \((\cdot )^T\) denotes transposition operator.

Table 1 Parameters dataset

In the model (1) we assume that \(f \in C^1(\Omega _T)\) is Lipschitz continuous and that the function \(\eta (t) \in C(\Omega _T)\) represents the unknown drug efficacy, which belongs to the set of admissible functions \(M_{\eta }\):

$$\begin{aligned} M_{\eta } = \{ \eta (t) : \eta (t)\, \in \, \left[ 0 , 1 \right] \,{\text {in}} \,\Omega _T , \eta (t) =0 {\text { outside of }} \Omega _T\}. \end{aligned}$$
(5)

The control parameter \(\eta\) is the dosage of the reverse transcriptase inhibitor; this parameter protects the cells and prevents infection. In this work we assume that all parameters in system (1) are constants except the control parameter \(\eta\), which depends on time, i.e., \(\eta = \eta (t)\), \(t \in [0,T]\). This means that the control parameter \(\eta = \eta (t)\) tells us which dosage of the reverse transcriptase inhibitor should be given to a concrete patient at any time moment t, \(t \in [0,T]\). Thus, the personalized determination of this parameter for every individual is of vital importance for the treatment of HIV.

Existence and Lyapunov Stability of Steady State

Let us now assume that the parameter \(\eta\) in system (1) is constant in time and takes a value in (0, 1). Setting \(\frac{du}{dt}\) in (1) to zero and solving for \(u_1\), \(u_2\), \(u_3\) and \(u_4\), we see that there are two possible steady states: an infected and an uninfected one [42].

The uninfected steady state is given by

$$\begin{aligned} {\left\{ \begin{array}{ll} u_1 = \frac{s}{\mu },\\ u_2 = 0,\\ u_3 = 0,\\ u_4 = 0, \end{array}\right. } \end{aligned}$$
(6)

and the infected steady state is achieved when

$$\begin{aligned} {\left\{ \begin{array}{ll} u_1 = \frac{(\mu _1+\alpha +b)c}{N\alpha k(1-\eta )},\\ u_2 = \frac{s-\mu u_1}{\mu _1 + \alpha (1-\eta )},\\ u_3 = \frac{\alpha (1-\eta )u_2}{\delta },\\ u_4 = \frac{N\delta u_3}{c}. \end{array}\right. } \end{aligned}$$
(7)

In [42] it was shown that the infected steady state can exist only when \(\eta\) is less than the following critical value:

$$\begin{aligned} \eta _{crit} = 1 - \frac{\mu c(\mu _1+\alpha +b)}{N \alpha ks}. \end{aligned}$$
(8)

For our system of parameters, presented in Table 1, this critical value is \(\eta _{crit} \approx 0.88375\). Whenever \(\eta \ge \eta _{crit}\) only the uninfected steady state can exist.

Plugging the values of Table 1 into (7) for \(\eta < \eta _{crit}\), or into (6) if \(\eta \ge \eta _{crit}\), we obtain the numerical values of the solutions \((u_1,u_2,u_3,u_4)^T\) of (1) presented in Table 2.

Table 2 Stable steady states for different values of \(\eta\), while keeping the other parameters fixed
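The steady states (6)–(7) and the critical efficacy (8) can be evaluated directly; in the sketch below the parameter values are an assumption taken from [42], since Table 1 is not reproduced in this excerpt (with these values, \(\eta _{crit} \approx 0.88375\) as reported above).

```python
# Steady states (6)-(7) and critical efficacy (8); parameter values are
# assumed from [42] (Table 1 is not reproduced in this excerpt).
s, mu, k, mu1 = 10.0, 0.01, 2.4e-5, 0.015
alpha, b, delta, N, c = 0.4, 0.05, 0.26, 1000.0, 2.4

def eta_crit():
    """Critical drug efficacy (8)."""
    return 1.0 - mu*c*(mu1 + alpha + b) / (N*alpha*k*s)

def steady_state(eta):
    """Stable steady state of (1) for a constant efficacy eta."""
    if eta >= eta_crit():
        return (s/mu, 0.0, 0.0, 0.0)                        # uninfected state (6)
    u1 = (mu1 + alpha + b)*c / (N*alpha*k*(1.0 - eta))      # infected state (7)
    u2 = (s - mu*u1) / (mu1 + alpha*(1.0 - eta))
    u3 = alpha*(1.0 - eta)*u2 / delta
    u4 = N*delta*u3 / c
    return (u1, u2, u3, u4)
```

For example, `steady_state(0.4)` returns the infected state, while any `eta >= eta_crit()` yields the uninfected state \((s/\mu , 0, 0, 0)\).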

Stability of Solutions

Let us define

$$\begin{aligned} {\left\{ \begin{array}{ll} \mu _m = \min \{\mu ,\mu _1\}, \\ \Xi = \frac{s}{\mu _m}, \\ \Phi := \Phi (\eta ) = \frac{\alpha s (1-\eta )}{\mu _m \delta }, \\ \Psi := \Psi (\eta ) = \frac{N \alpha s (1-\eta )}{\mu _m c}, \end{array}\right. } \end{aligned}$$
(9)

where \(\mu , \mu _1, s\) etc. are the parameters of (1). Consider the set

$$\begin{aligned} \Gamma (\eta ) = \{(u_1,u_2,u_3,u_4) \in {\mathbb {R}}^4: 0 \le u_1 \le \Xi , 0 \le u_2 \le \Xi , 0 \le u_3 \le \Phi , 0 \le u_4 \le \Psi \}. \end{aligned}$$
(10)

It can be proven [42] that if \(u(0) \in \Gamma (\eta )\), then the solution trajectories of (1) will stay inside \(\Gamma (\eta )\) for all \(t \in \Omega _T\).

Remark 1

It is not required that \(\eta\) is constant. As long as \(\eta \in M_\eta\), we may allow \(\eta (t)\) to vary with time.

For our parameters presented in Table 1, these bounds are quantitatively defined as

$$\begin{aligned} {\left\{ \begin{array}{ll} 0 \le u_1 \le 1000,\\ 0 \le u_2 \le 1000,\\ 0 \le u_3 \le 1538.5(1 - \eta ),\\ 0 \le u_4 \le 166667(1 - \eta ). \end{array}\right. } \end{aligned}$$
(11)

Table 3 shows upper limits for \(u_3\) and \(u_4\) for different values of \(\eta\).
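The bounds (9) defining the invariant set \(\Gamma (\eta )\) can be evaluated directly; with the parameter values assumed from [42] (Table 1 is not reproduced in this excerpt), this reproduces the numbers in (11).

```python
# The bounds (9) for the invariant set Gamma(eta) in (10); parameter values
# assumed from [42]. With these values the bounds reduce to (11).
s, mu, mu1, alpha, delta, N, c = 10.0, 0.01, 0.015, 0.4, 0.26, 1000.0, 2.4

def gamma_bounds(eta):
    """Upper bounds (Xi, Xi, Phi(eta), Psi(eta)) of Gamma(eta)."""
    mu_m = min(mu, mu1)
    Xi = s / mu_m
    Phi = alpha*s*(1.0 - eta) / (mu_m*delta)
    Psi = N*alpha*s*(1.0 - eta) / (mu_m*c)
    return Xi, Xi, Phi, Psi
```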

It can furthermore be proven that the uninfected steady state is globally asymptotically Lyapunov stable if and only if \(\eta \ge \eta _{crit}\). On the other hand, if the infected steady state exists, then it is locally asymptotically Lyapunov stable whenever the following condition is satisfied [42]:

$$\begin{aligned} \Delta C - A^2D > 0, \end{aligned}$$
(12)

where

$$\begin{aligned}&A = \mu + ku_4 + \mu _1 + \alpha + b + \delta + c,\nonumber \\&B = (c+\delta )(\alpha +\mu _1+\mu +ku_4+b)+c\delta + \mu (\mu _1 + \alpha + b) + ku_4(\mu _1 + (1-\eta )\alpha ),\nonumber \\&C = c\delta (\mu + ku_4) + (c+\delta )(\mu \mu _1 + \mu \alpha + \mu b +\mu _1 ku_4 + (1-\eta ) \alpha k u_4),\nonumber \\&D = c\delta ku_4(\mu _1 + \alpha (1-\eta )),\nonumber \\&\Delta = AB - C. \end{aligned}$$
(13)

We can calculate that, when \(\eta\) is constant and the other parameter values are chosen as in Table 1, the infected steady state is locally asymptotically stable for all values of \(\eta\) below the critical value \(\eta _{crit} \approx 0.88\). Figure 3 illustrates this statement.

Fig. 3

\(\Delta C - A^2D\) plotted as a function of \(\eta\). Note that \(\Delta C - A^2D > 0\) \(\forall \eta < 0.88\)

Thus, if \(\eta\) is constant and less than the critical value \(\eta _{crit} \approx 0.88\), it suffices to know the solution of (1) at the steady state to deduce \(\eta\).
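The local-stability condition (12)–(13) can be checked numerically at the infected steady state (7). The sketch below, with parameter values assumed from [42] (Table 1 is not reproduced in this excerpt), evaluates \(\Delta C - A^2D\) for a given constant \(\eta < \eta _{crit}\), the quantity plotted in Fig. 3.

```python
# Evaluate Delta*C - A^2*D from (12)-(13) at the infected steady state (7);
# parameter values assumed from [42].
s, mu, k, mu1 = 10.0, 0.01, 2.4e-5, 0.015
alpha, b, delta, N, c = 0.4, 0.05, 0.26, 1000.0, 2.4

def stability_indicator(eta):
    """Return Delta*C - A^2*D of (12); positive means locally stable."""
    # infected steady state (7)
    u1 = (mu1 + alpha + b)*c / (N*alpha*k*(1.0 - eta))
    u2 = (s - mu*u1) / (mu1 + alpha*(1.0 - eta))
    u4 = N*alpha*(1.0 - eta)*u2 / c
    # coefficients (13)
    A = mu + k*u4 + mu1 + alpha + b + delta + c
    B = ((c + delta)*(alpha + mu1 + mu + k*u4 + b) + c*delta
         + mu*(mu1 + alpha + b) + k*u4*(mu1 + (1.0 - eta)*alpha))
    C = (c*delta*(mu + k*u4)
         + (c + delta)*(mu*mu1 + mu*alpha + mu*b + mu1*k*u4
                        + (1.0 - eta)*alpha*k*u4))
    D = c*delta*k*u4*(mu1 + alpha*(1.0 - eta))
    Delta = A*B - C
    return Delta*C - A**2*D
```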

Table 3 Upper limits for the positive invariant set \(\Gamma (\eta )\). The integer parts of fractional numbers are always reported as the upper bounds

Although it is often reasonable to assume that the drug efficacy is constant for a given individual, viruses mutate readily, which can alter the efficacy of an RT-inhibitor. Thus, it is interesting to know how to determine \(\eta (t)\) when it is not constant. For the remainder of this note we therefore consider the case when \(\eta (t)\) is not necessarily constant.

Well-Posedness of the Forward Problem

Let \(D = \Omega _T \times \Gamma (\eta )\). Let the functions u(t) and f(t, u(t)) be defined as in (4). Further, let f(t, u(t)) be continuous in t on \(\Omega _T\) and Lipschitz continuous in u on \(\Gamma (\eta )\). Then f(t, u(t)) is Lipschitz continuous on the compact set D. In other words, there exists a constant L such that for all \(t \in \Omega _T\) and all \(u^1, u^2 \in \Gamma (\eta )\),

$$\begin{aligned} \Vert f(t, u^1) - f(t, u^2) \Vert \le L \Vert u^1(t) - u^2(t) \Vert . \end{aligned}$$
(14)

Thus, using the Picard–Lindelöf theorem (Theorem 2.2 in [43]), one can prove that, for a given initial condition u(0), the model problem (1) has a unique solution. Furthermore, the solution depends continuously on the data of problem (1) in the following sense (Theorem 2.8 in [43]):

Proposition 2

Let \(u^1(t), u^2(t)\) be two solutions of the problems

$$\begin{aligned} \frac{du^i}{dt} = f^i(t, u^i(t)), \,\, u^i(t_0) = {u^i}^0, i=1,2 \end{aligned}$$

with perturbed initial conditions

$$\begin{aligned} \Vert u^1(0) - u^2(0)\Vert \le \delta \end{aligned}$$

and perturbed right-hand sides

$$\begin{aligned} \Vert f^1(t, u^1(t)) - f^2(t, u^2(t)) \Vert \le \epsilon ,\,\,\forall t \in \Omega _T. \end{aligned}$$

Then

$$\begin{aligned} ||u^1(t) - u^2(t)|| \le ||u^1(0) - u^2(0)||e^{Lt} + \frac{||f^1(t, u^1(t)) - f^2(t, u^2(t))||}{L}\left( e^{Lt} - 1\right) , \end{aligned}$$
(15)

where L is the Lipschitz constant. If the initial values are equal then on \(\Omega _T = [0,T]\) we have

$$\begin{aligned} ||u^1(t) - u^2(t)|| \le \frac{||f^1(t, u^1(t)) - f^2(t, u^2(t))||}{L}\left( e^{Lt} - 1\right) . \end{aligned}$$
(16)

Since f is clearly continuous with respect to \(\eta\), it follows that the solution to (1) is continuous with respect to \(\eta\).
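A quick numerical illustration of this continuous dependence, with parameter values assumed from [42] (Table 1 is not reproduced in this excerpt): solving (1) with two nearby constant efficacies yields nearby solutions, and the gap shrinks with the perturbation.

```python
import numpy as np

# Parameter values assumed from [42] (Table 1 is not reproduced here).
s, mu, k, mu1 = 10.0, 0.01, 2.4e-5, 0.015
alpha, b, delta, N, c = 0.4, 0.05, 0.26, 1000.0, 2.4

def f(u, eta):
    """Right-hand side of system (1)."""
    u1, u2, u3, u4 = u
    return np.array([
        s - k*u1*u4 - mu*u1 + (eta*alpha + b)*u2,
        k*u1*u4 - (mu1 + alpha + b)*u2,
        (1.0 - eta)*alpha*u2 - delta*u3,
        N*delta*u3 - c*u4,
    ])

def solve(eta, T=20.0, n=8000):
    """RK4 solution of (1)-(2) at time T for a constant efficacy eta."""
    dt = T / n
    u = np.array([300.0, 10.0, 10.0, 10.0])
    for _ in range(n):
        k1 = f(u, eta)
        k2 = f(u + 0.5*dt*k1, eta)
        k3 = f(u + 0.5*dt*k2, eta)
        k4 = f(u + dt*k3, eta)
        u = u + (dt/6.0)*(k1 + 2.0*k2 + 2.0*k3 + k4)
    return u

def gap(eps):
    """Norm of the solution change caused by perturbing eta by eps."""
    return np.linalg.norm(solve(0.4 + eps) - solve(0.4))
```

Halving the perturbation of \(\eta\) roughly halves the gap between the two trajectories, as expected from a first-order sensitivity.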

Inverse Problems and Ill-Posedness

Since the parameter identification problem can be considered as an ill-posed inverse problem, it is clear that it is difficult to solve properly. In this section we show how the parameter identification problem can be solved accurately via the construction of a proper Tikhonov regularization functional.

Let us consider the following problem: let \(B_1\) and \(B_2\) be Banach spaces, let \(G \subseteq B_1\) be an open set in \(B_1\), and let \(F: G \rightarrow B_2\) be an operator. Let \(y \in B_2\) be given, and suppose we want to find \(x \in G\) such that

$$\begin{aligned} F(x) = y. \end{aligned}$$
(17)

Problems of this sort, where one wants to identify x in (17) given observations y, are called inverse problems. A special class of inverse problems are parameter identification problems (PIPs), where x is some parameter of a differential equation and F(x) is the solution of the differential equation with this parameter.

Definition 1

Problem (17) is said to be well-posed by Hadamard if it satisfies the following conditions [45, 46]:

  1. Existence: For each \(y \in B_2\) there is an \(x = x(y)\) such that \(F(x) = y\).

  2. Uniqueness: For each \(y \in B_2\) there is not more than one \(x = x(y)\) such that \(F(x) = y\).

  3. Stability: For each y such that a unique solution of (17) exists, the solution \(x = x(y)\) is a continuous function of y.Footnote 1

Definition 2

Problem (17) is said to be ill-posed if it is not well-posed.

PIPs and other inverse problems are often ill-posed. Ill-posedness means that it is difficult to solve (17) numerically, since measurement errors, or even errors induced by finite-precision computer arithmetic, can have disastrous consequences. Let \(y^*\) denote noiseless observations, let \(\delta > 0\) be the noise level, and let \(B_\delta [y^*] = \{y:||y-y^*||_{B_2} \le \delta \}\). The solution to the slightly perturbed equation \(F(x) = y_{\delta }\) (with \(y_\delta \in B_\delta [y^*]\)) can be entirely different from the solution to \(F(x) = y^*\); perhaps a solution to \(F(x) = y_{\delta }\) does not even exist, no matter how small \(\delta\) is. A generally ill-posed problem (17) can become well-posed if we restrict F in (17) to certain subsets of its domain. In this case the following definition is introduced.
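The instability can be seen already in a linear toy problem (this example is ours, not the paper's): for an ill-conditioned matrix operator \(F(x) = Ax\), a data perturbation of size \(10^{-6}\) changes the recovered x by orders of magnitude more.

```python
import numpy as np

# Toy illustration (ours, not the paper's) of instability: F(x) = Ax with an
# ill-conditioned A. A data perturbation of size 1e-6 changes the recovered
# solution by roughly 1e2, an amplification of about 1e8.
A = np.array([[1.0, 1.0],
              [1.0, 1.0 + 1e-8]])
x_true = np.array([1.0, 1.0])
y_star = A @ x_true                        # noiseless data y*

y_delta = y_star + np.array([0.0, 1e-6])   # y in B_delta[y*], delta = 1e-6
x_delta = np.linalg.solve(A, y_delta)

data_err = np.linalg.norm(y_delta - y_star)
solution_err = np.linalg.norm(x_delta - x_true)
```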

Conditional well-posedness

Let \(B_1\) and \(B_2\) be Banach spaces. Suppose \(G \subset B_1\) is the closure of an open subset in \(B_1\). Let \(F: G \rightarrow B_2\) be a continuous operator. Assume that \(y^* \in F(G)\) is our ideal noiseless data, and pick a noise level \(\delta > 0\). Suppose we want to solve

$$\begin{aligned} F(x) = y_{\delta }, \end{aligned}$$
(18)

where \(y_{\delta } \in B_\delta [y^*]\). This problem is called conditionally well-posed on G if it satisfies the following conditions [45, 46]:

  1. Existence: It is a priori knownFootnote 2 that there exists an ideal solution \(x^* =x^*(y^*) \in G\) for the ideal noiseless data \(y^*\).

  2. Uniqueness: The operator \(F:G \rightarrow B_2\) is one-to-one.

  3. Stability: The inverse operator \(F^{-1}\) is continuous on F(G).

Definition 3

The set G in the definition of conditional well-posedness above is called the correctness set of problem (18).

Continuity of the inverse operator \(F^{-1}\) can be guaranteed if the domain of F is compact. Hence, any compact set with nonempty interior on which F is one-to-one is a correctness set. This suggests a method for solving (18): choose a suitable correctness set G and then find an \(x \in G\) such that \(||F(x) - y_{\delta }||\) is as small as possible. Tikhonov's theorem offers such a method.

Theorem 3

(Tikhonov [44]) Let \(B_1\) and \(B_2\) be Banach spaces, and \(U \subset B_1\) a compact set. Let \(F:U \rightarrow B_2\) be a continuous one-to-one operator and \(V=F(U)\). Then \(F^{-1}: V \rightarrow B_1\) is a continuous operator.

For a proof of this fundamental theorem see, for example [13, 44].

Quasi-Solution

Let \(H_1\) and \(H_2\) be Hilbert spaces,Footnote 3 and assume that \(F:G \rightarrow H_2\) is a continuous mapping defined on a compact correctness set, \(G \subset H_1\). Let \(\delta > 0\) and assume, as before, that we want to solve

$$\begin{aligned} F(x) = y_{\delta }, \end{aligned}$$
(19)

with \(y_{\delta }\) defined as before. We know that a solution exists for the perfect data \(y^*\), but in general (19) has no solution, since possibly \(y_{\delta } \notin F(G)\) (meaning that we are dealing with an ill-posed problem). Our goal in this and the following subsection is to sketch how to construct a family of approximate solutions \(\{x_{\delta }\}\) in G that converges to \(x^*\) as \(\delta \rightarrow 0\). Let us define

$$\begin{aligned} J_{y_\delta }(x) = ||F(x) - y_\delta ||^2_{H_2}. \end{aligned}$$
(20)

Since F is continuous, it maps compact sets to compact sets; thus F(G) is compact in \(H_2\). And since F(G) is a compact subset of a Hilbert space, and therefore closed, a minimum of (20) exists (and if F(G) also happens to be convex, this minimum is unique). Any \(x \in G\), unique or not, that minimizes \(J_{y_\delta }\) in (20) is called a quasi-solution of (19).

Since the inverse mapping \(F^{-1}\) is continuous by Theorem 3 and is defined on a compact metric space, it admits a modulus of continuity, \(\omega _{F^{-1}}\).Footnote 4 From Theorem 1.5 in [13] it follows that, given \(y_{\delta } \in H_2\), for any quasi-solution \(x_{\delta } \in \arg \min _{x \in G} J_{y_\delta }(x)\) the following error estimate holds:

$$\begin{aligned} ||x_{\delta } - x^*||_{H_1} \le \omega _{F^{-1}}(2\delta ), \end{aligned}$$
(22)

where \(\omega _{F^{-1}}(z)\) is the modulus of continuity of the inverse operator \(F^{-1}\). Thus \(x_{\delta } \rightarrow x^*\) as \(\delta \rightarrow 0\). Hence, we can take a sequence of quasi-solutions as our desired family.

However, sometimes the set of all plausible solutions to (19) does not form a compact set, and in these cases \(F^{-1}\) need not be continuous on the set of all plausible solutions; such problems are called essentially ill-posed. Thus, it is not obvious how to choose a suitable correctness set. And even if all the plausible solutions form a compact correctness set, the minimum of (20) may not be unique: there may be local minima, or regions where the gradient of the functional is very small, in which a minimization algorithm can get trapped. In the next subsection we discuss how a stable solution to essentially ill-posed problems can be obtained in practice.

The Tikhonov Functional

The Tikhonov functional ensures that, when minimizing (20), we stay in a neighborhood of some point \(x_0\) which is a priori known to be close to the true solution \(x^*\). A general Tikhonov functional is given below:

$$\begin{aligned} J_{\gamma }(x) = \frac{1}{2}||F(x) - y||^2_{B_2} + \frac{\gamma }{2}||x-x_0||^2_{B_1}. \end{aligned}$$
(23)

The first term is essentially the same as in (20); the second term is the regularization term, and \(\gamma :=\gamma (\delta )\) is the regularization parameter. The regularization parameter can be chosen, for instance, as

$$\begin{aligned} \gamma (\delta ) = \delta ^{2\mu }, \end{aligned}$$
(24)

where \(\mu \in (0,1)\), see details in [4].

In general, the Tikhonov functional (23) might not actually attain its infimum; we can only guarantee the existence of a minimizing sequence, \(\{x_k\}\). However, without loss of generality, we can assume that G is the closure of an open and bounded set containing the initial guess \(x_0\), the (bounded) minimizing sequence \(\{x_k\}\), and the exact solution \(x^*\). Hence, if we consider finite-dimensional Hilbert spaces, the Tikhonov functional defined on G has a minimum, since closed and bounded sets in finite-dimensional Hilbert spaces are compact, and functionals defined on compact sets attain their infimum according to the Weierstrass extreme value theorem. Suppose now that G is convex and that (23) is Fréchet differentiable,Footnote 5 with a Fréchet derivative that is uniformly bounded and Lipschitz continuous. Then one can prove, see [13, 14], that for a given noise level and regularization parameter, (23) is locally strongly convex in a neighborhood of its minimum, and that \(x^*\) is also contained in this neighborhood if \(||x^*-x_0||\) is small enough.

Assume that we have a single noise level \(\delta\) and that our goal is to minimize (23). If \(\gamma (\delta )\) is chosen as in (24), then it can be proven, see [28], that there exist a \(\delta _0\) and a constant \(\xi \in (0,1)\) such that

$$\begin{aligned} \delta \in (0,\delta _0) \implies ||x_k - x^*|| \le \xi ||x_0 - x^*||, \end{aligned}$$
(26)

In particular, it follows that if (23) attains a minimum, then any \(x_k \in \arg \min _{x \in G} J_{\gamma }(x)\) is a better approximation to \(x^*\) than \(x_0\), provided the noise level is small enough.

Thus, to sum up: under the reasonable assumptions discussed above, a minimum of (23) exists and is a better approximation than the starting guess \(x_0\); moreover, if the initial guess is good enough, there is a unique point that is the global minimum, and we do not need to worry about local minima. These facts explain why Tikhonov regularization is so useful for solving ill-posed problems. To find the zero of the Fréchet derivative, one can use common minimization techniques, such as the conjugate gradient method (CGM) or the method of steepest descent. Obviously, the minimum of the Tikhonov functional will not coincide exactly with the quasi-solution if the noise level and regularization parameter are fixed constants. On the other hand, by letting the regularization parameter \(\gamma\) decrease at each iteration of the minimization algorithm, the minimum of the Tikhonov functional approaches the quasi-solution. In [4] it was suggested that \(\gamma\) can be updated as

$$\begin{aligned} \gamma _k = \frac{\gamma _0}{(k+1)^p}, \end{aligned}$$
(27)

where \(p \in (0,1]\) and \(k = 0, 1, 2, \ldots\).
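A minimal sketch of this strategy on an ill-conditioned linear toy problem (our example, not the paper's PIP): for a linear operator \(F(x) = Ax\), the minimizer of (23) has the closed normal-equations form, and the regularization parameter is decreased according to the schedule (27).

```python
import numpy as np

# Sketch (our toy example, not the paper's PIP) of Tikhonov regularization
# (23) with the decreasing schedule (27). For linear F(x) = Ax the minimizer
# of (23) is the closed-form normal-equations solution.
A = np.array([[1.0, 1.0],
              [1.0, 1.0 + 1e-8]])        # ill-conditioned operator
x_true = np.array([1.0, 1.0])
y = A @ x_true + np.array([0.0, 1e-6])   # noisy data, delta = 1e-6
x0 = np.array([0.9, 0.9])                # a priori guess close to x_true

def tikhonov_min(gamma):
    """Minimizer of 0.5*||Ax - y||^2 + 0.5*gamma*||x - x0||^2."""
    return np.linalg.solve(A.T @ A + gamma*np.eye(2), A.T @ y + gamma*x0)

gamma0, p = 0.1, 0.9                     # schedule (27): gamma_k = gamma0/(k+1)^p
for k_it in range(30):
    x = tikhonov_min(gamma0 / (k_it + 1)**p)

x_unreg = np.linalg.solve(A, y)          # unregularized solution, blown up by noise
```

Even with the crude anchor \(x_0\), the regularized reconstruction stays close to the exact solution, while the unregularized solve is destroyed by the \(10^{-6}\) noise.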

From what has been discussed, it is clear that a good first guess is essential for successful identification of the desired parameter. Of course, in general we do not have any idea what the solution to (17) might be, and therefore we need to devise some kind of globally convergent algorithm to solve ill-posed PIPs.

The Parameter Identification Problem

To formulate the parameter identification problem, we assume that all parameters in system (1) are known except the control parameter \(\eta (t)\), which describes the efficacy of the drug. The typical values of the parameters \(\{s, \mu , k, \mu _1, \alpha , b, \delta , c, N \}\) in (1) are taken from [42] and are described in Table 1.

Parameter Identification Problem (PIP). Assume that condition (5) holds and that the parameters \(\{s\), \(\mu\), k, \(\mu _1\), \(\alpha\), b, \(\delta\), c, \(N \}\) in system (1) are known. Assume further that the function \(\eta (t) \in M_\eta\) is unknown inside the domain \(\Omega _T\). The PIP is: determine \(\eta (t)\) for \(t \in \Omega _T\), under the condition that the virus population function g(t) is known:

$$\begin{aligned} u_4(t) = g(t),\quad t \in [T_1,T_2],\;\; 0 \le T_1 < T_2 \le T. \end{aligned}$$
(28)

Here, the function \(g\left( t\right)\) represents observations of the function \(u_4\left( t\right)\) inside the observation interval \([T_1,T_2]\).

Note that we solve the PIP on the time interval [0, T] and assume that observations g(t) may be available only on the narrower interval \([T_1,T_2] \subset [0,T]\). The “Numerical Results” section shows that the reconstruction of the parameter \(\eta (t)\) is not very good on the time intervals where observations are not available; thus, observations of the virus population function \(u_4(t)\) should be taken as early as possible after the date when the virus started to reproduce in the body of the host.
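For numerical testing, observations (28) can be generated synthetically: solve the forward problem (1)–(2) for a known \(\eta (t)\) and record \(u_4\) on \([T_1,T_2]\) with multiplicative noise. The parameter values, the observation interval, and the noise model in the sketch below are our assumptions.

```python
import numpy as np

# Parameter values, observation interval and the multiplicative noise model
# below are our assumptions (Table 1 is not reproduced in this excerpt).
rng = np.random.default_rng(0)
s, mu, k, mu1 = 10.0, 0.01, 2.4e-5, 0.015
alpha, b, delta, N, c = 0.4, 0.05, 0.26, 1000.0, 2.4

def f(u, eta):
    """Right-hand side of system (1)."""
    u1, u2, u3, u4 = u
    return np.array([
        s - k*u1*u4 - mu*u1 + (eta*alpha + b)*u2,
        k*u1*u4 - (mu1 + alpha + b)*u2,
        (1.0 - eta)*alpha*u2 - delta*u3,
        N*delta*u3 - c*u4,
    ])

def observe_u4(eta_fun, T=20.0, T1=5.0, T2=15.0, n=8000, noise=0.05):
    """RK4-solve (1)-(2) and return noisy samples g(t) = u4(t)*(1 + noise*xi)
    of the virus population on the observation interval [T1, T2]."""
    dt = T / n
    u = np.array([300.0, 10.0, 10.0, 10.0])
    t = 0.0
    times, g = [], []
    for _ in range(n):
        k1 = f(u, eta_fun(t))
        k2 = f(u + 0.5*dt*k1, eta_fun(t + 0.5*dt))
        k3 = f(u + 0.5*dt*k2, eta_fun(t + 0.5*dt))
        k4 = f(u + dt*k3, eta_fun(t + dt))
        u = u + (dt/6.0)*(k1 + 2.0*k2 + 2.0*k3 + k4)
        t += dt
        if T1 <= t <= T2:
            times.append(t)
            g.append(u[3]*(1.0 + noise*rng.uniform(-1.0, 1.0)))
    return np.array(times), np.array(g)

times, g = observe_u4(lambda t: 0.5)   # "exact" efficacy used to generate data
```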

Optimization Method

Let H be a Hilbert space of functions defined in \(\Omega _T\). To determine \(\eta (t)\), \(t\in [0,T]\) in PIP we construct the Tikhonov functional (23) in the following form:

$$\begin{aligned} J(\eta )=\frac{1}{2} \int \limits _{T_1}^{T_2}(u_4(t)-g(t))^{2}z_{\zeta }\left( t\right) \,\mathrm{d}t+\frac{1}{2}\gamma \int \limits _{0}^{T}(\eta -\eta ^{0})^{2}dt. \end{aligned}$$
(29)

Here, \(u_4(t)\) is the solution of system (1) with parameter \(\eta (t)\), g(t) is the observed virus population function, \(\eta ^0\) is the initial guess for the parameter \(\eta (t)\), \(\gamma \in (0,1)\) is the regularization parameter, and \(z_{\zeta }(t), \zeta \in \left( 0,1\right)\) is a smoothness function which can be defined similarly to [10]. Our goal now is to minimize the Tikhonov functional (29) with respect to the function \(\eta (t) \in H\).
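A minimal numerical sketch of evaluating the Tikhonov functional (29) by composite trapezoidal quadrature follows. The grid, the observation \(g\), the weight \(z_\zeta \equiv 1\) and the constant candidates for \(\eta\), \(\eta^0\) are all illustrative placeholders.

```python
import numpy as np

def trapezoid_rule(y, x):
    """Composite trapezoidal quadrature of samples y on the grid x."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def tikhonov_functional(t, u4, g, eta, eta0, z, gamma, T1, T2):
    """Evaluate J(eta) from (29): data misfit on the observation
    interval [T1, T2] plus Tikhonov regularization on all of [0, T]."""
    mask = (t >= T1) & (t <= T2)
    misfit = 0.5 * trapezoid_rule(((u4 - g) ** 2 * z)[mask], t[mask])
    reg = 0.5 * gamma * trapezoid_rule((eta - eta0) ** 2, t)
    return misfit + reg

# With u4 == g the misfit vanishes and only the regularization remains.
t = np.linspace(0.0, 300.0, 301)
g_obs = np.exp(-t / 100.0)
J = tikhonov_functional(t, g_obs, g_obs, np.full_like(t, 0.6),
                        np.full_like(t, 0.5), np.ones_like(t),
                        gamma=1e-2, T1=50.0, T2=300.0)
```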

To find the function \(\eta (t) \in H\) which minimizes the Tikhonov functional (29) we seek a stationary point of (29) with respect to \(\eta\) which satisfies

$$\begin{aligned} J^{\prime }(\eta )(\bar{\eta })=0, \quad \forall \bar{\eta } \in H. \end{aligned}$$
(30)

To find the minimum of (29) we use constrained optimization with the standard Lagrangian approach [1, 38] and introduce the following Lagrangian

$$\begin{aligned} L(v)=J(\eta )+ \sum _{i=1}^4 \int \limits _{0}^{T} \lambda _i \left( \frac{du_i}{dt} - f_i \right) \,dt, \end{aligned}$$
(31)

where \(u(t)=(u_1(t),u_2(t),u_3(t),u_4(t))\) is the solution of the system (1), \(\lambda (t)\) is the vector of Lagrange multipliers \(\lambda (t)=(\lambda _1(t),\lambda _2(t),\lambda _3(t), \lambda _4(t))\) and \(v= (\lambda ,u,\eta )\).

Let us introduce following spaces needed for further analysis

$$\begin{aligned} H_{u}^{1}(\Omega _T)&=\{f\in H^{1}(\Omega _T):f(0)=0\},\nonumber \\ H_{\lambda }^{1}(\Omega _T)&=\{f\in H^{1}(\Omega _T): f(0) = 0, f(T)=0\},\nonumber \\ U&=H_{u}^{1}(\Omega _T)\times H_{\lambda }^{1}(\Omega _T)\times C(\Omega _T), \end{aligned}$$
(32)

where all functions are real valued.

To derive the Fréchet derivative of the Lagrangian (31) we assume that functions \(v=(\lambda ,u,\eta )\) can be varied independently of each other in the sense that

$$\begin{aligned} L'(v)(\bar{v})=0,\quad \forall \bar{v} =(\bar{\lambda },\bar{u},\bar{\eta }) \in U. \end{aligned}$$
(33)

Thus, we consider \(L(v + \bar{v}) - L(v)\), single out the part of the obtained expression which is linear with respect to \(\bar{v}\), and neglect all nonlinear terms. The optimality condition (33) means that for all \(\bar{v} \in U\) we have

$$\begin{aligned} L'(v; \bar{v}) = \frac{\partial L}{\partial \lambda }(v)(\bar{\lambda }) +\frac{\partial L}{\partial u}(v)(\bar{u}) + \frac{\partial L}{\partial \eta }(v)(\bar{\eta }) = 0, \end{aligned}$$
(34)

i.e., every component of (34) should vanish. Thus, the optimality condition (33) yields

$$\begin{aligned} 0&= \frac{\partial L}{\partial \lambda }(v)(\bar{\lambda }) = \int \limits _{0}^{T} ( \dot{u_1} - s + ku_1u_4 + \mu u_1 -(\eta \alpha + b) u_2) {\bar{\lambda }}_{{1}} dt \nonumber \\&\quad + \int \limits _{0}^{T} (\dot{u_2} - ku_1u_4 +(\mu _1+ \alpha + b )u_2) {\bar{\lambda }}_{{2}} dt \nonumber \\&\quad +\int \limits _{0}^{T} (\dot{u_3} - (1-\eta )\alpha u_2 + \delta u_3) {\bar{\lambda }}_{{3}} dt \nonumber \\&\quad + \int \limits _{0}^{T} ( \dot{u_4} - N \delta u_3 + c u_4) {\bar{\lambda }}_{{4}} dt \quad \forall \bar{\lambda } \in H_u^1(\Omega _T), \end{aligned}$$
(35)
$$\begin{aligned} 0&= \frac{\partial L}{\partial u}(v)(\bar{u}) = -\int \limits _{0}^{T}( \dot{\lambda }_{1} - \lambda _1 ku_{4} - \lambda _1\mu + \lambda _2k u_{4}) {\bar{u}}_{{1}} dt \nonumber \\&\quad - \int \limits _{0}^{T}( \dot{\lambda }_{2} - \lambda _2(\mu _1+\alpha +b) + \lambda _1 (\eta \alpha + b) + (1 - \eta )\alpha \lambda _3) {\bar{u}}_{{2}} dt\nonumber \\&\quad -\int \limits _{0}^{T}( \dot{\lambda }_{3} - \lambda _3\delta + \lambda _4N\delta ) {\bar{u}}_{{3}} dt \nonumber \\&\quad - \int \limits _{0}^{T}( \dot{\lambda }_{4} - \lambda _4 c - \lambda _1k u_1 + \lambda _2ku_1 ) {\bar{u}}_{{4}} dt + \int \limits _{T_1}^{T_2} (u_4 - g)z_{\zeta } {\bar{u}}_{{4}} dt \quad \forall {\bar{u}} \in H_{\lambda }^1(\Omega _T), \end{aligned}$$
(36)
$$\begin{aligned} 0&= \frac{\partial L}{\partial \eta }(v)(\bar{\eta }) = \gamma \int \limits _{0}^{T}(\eta -\eta ^{0})\bar{\eta } dt + \alpha \int _0^T u_2(\lambda _3 - \lambda _1)\bar{\eta } dt \quad \forall \bar{\eta } \in C\left( \Omega _T\right) . \end{aligned}$$
(37)

Equation (35) corresponds to the forward problem (1)–(2), and equation (36) corresponds to the following adjoint problem

$$\begin{aligned} {\left\{ \begin{array}{ll} \frac{\partial \lambda _1}{\partial t} = \tilde{f}_1(\lambda (t),\eta (t)) =\lambda _1(t) ku_{4}(t) + \lambda _1(t)\mu - \lambda _2(t)k u_{4}(t), \\ \frac{\partial \lambda _2}{\partial t} = \tilde{f}_2(\lambda (t),\eta (t)) =\lambda _2(t)(\mu _1+\alpha +b) - \lambda _1(t) (\eta (t) \alpha + b) - (1-\eta (t)) \alpha \lambda _3(t), \\ \frac{\partial \lambda _3}{\partial t} = \tilde{f}_3(\lambda (t),\eta (t)) = \lambda _3(t)\delta - \lambda _4(t) N \delta , \\ \frac{\partial \lambda _4}{\partial t} = \tilde{f}_4(\lambda (t),\eta (t)) =\lambda _4(t) c + \lambda _1(t) k u_1(t) - \lambda _2(t) k u_1(t) +(u_4(t) - g)z_{\zeta } , \\ \lambda _i(T) = 0,\quad i=1,\ldots ,4, \end{array}\right. } \end{aligned}$$
(38)

which can be rewritten in the compact form as

$$\begin{aligned} {\left\{ \begin{array}{ll} \frac{\partial \lambda }{\partial t} = \tilde{f}(\lambda (t),\eta (t)),\\ \lambda _i(T) = 0,\quad i=1,\ldots ,4, \end{array}\right. } \end{aligned}$$
(39)

with

$$\begin{aligned} \lambda = \lambda (t)&=(\lambda _1(t), \lambda _2(t), \lambda _3(t), \lambda _4(t))^T, \nonumber \\ \lambda (T)&= (\lambda _1(T), \lambda _2(T), \lambda _3(T), \lambda _4(T))^T = 0, \nonumber \\ \frac{d\lambda }{dt}&= { \left( \frac{\partial \lambda _1}{ \partial t}, \frac{\partial \lambda _2}{ \partial t},\frac{\partial \lambda _3}{ \partial t}, \frac{\partial \lambda _4}{ \partial t} \right) ^T}, \nonumber \\ \tilde{f}(\lambda (t),\eta (t))&=( \tilde{f}_1, \tilde{f}_2, \tilde{f}_3, \tilde{f}_4)( \lambda (t),\eta (t))^T. \end{aligned}$$
(40)

The adjoint system should be solved backwards in time with already known solution u(t) to the forward problem (1)–(2) and a given measurement function g(t).
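The backward-in-time marching can be sketched generically as follows. For brevity this sketch takes an explicit step (the scheme (56) below is implicit), and the toy right-hand side used to exercise it is an assumption for illustration only.

```python
import numpy as np

def solve_adjoint_backward(f_tilde, t, lam_T):
    """March d(lambda)/dt = f_tilde(lambda, t) backwards in time from
    the terminal condition lambda(T) = lam_T on the grid t."""
    lam = np.zeros((len(t), len(lam_T)))
    lam[-1] = lam_T
    for k in range(len(t) - 2, -1, -1):
        tau_k = t[k + 1] - t[k]
        # lambda^k = lambda^{k+1} - tau_k * f_tilde(lambda^{k+1}, t_{k+1})
        lam[k] = lam[k + 1] - tau_k * f_tilde(lam[k + 1], t[k + 1])
    return lam

# Toy right-hand side: f_tilde = 2 gives lambda(0) = lambda(T) - 2T.
t_grid = np.linspace(0.0, 1.0, 11)
lam = solve_adjoint_backward(lambda lam_v, s: np.array([2.0]),
                             t_grid, np.array([0.0]))
```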

For the case when u and \(\lambda\) are exact solutions of the forward (1)–(2) and adjoint (39) problems, respectively, for the known function \(\eta\), we get from (31) that

$$\begin{aligned} L(v(\eta )) = J(\eta ), \end{aligned}$$
(41)

and thus the Fréchet derivative of the Tikhonov functional can be written as

$$\begin{aligned} J'(\eta ) := J_\eta (u(\eta ), \eta ) = \frac{\partial J}{\partial \eta }(u(\eta ), \eta ) = \frac{\partial L}{\partial \eta }(v(\eta )). \end{aligned}$$
(42)

Using (37) in (42), we get the following expression for the Fréchet derivative of the Tikhonov functional

$$\begin{aligned} J'(\eta )(t) =\gamma (\eta -\eta ^{0})(t) + \alpha u_2 (\lambda _3 - \lambda _1)(t). \end{aligned}$$
(43)

Thus, setting \(J'(\eta ) = 0\), we can find the unknown parameter \(\eta\) which minimizes the Tikhonov functional (29) from the following expression

$$\begin{aligned} \eta = \frac{1}{\gamma } \alpha u_2 (\lambda _1-\lambda _3) + \eta ^{0}. \end{aligned}$$
(44)
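The update (44) is a pointwise-in-time formula, so it vectorizes directly. In the sketch below the state and adjoint samples and the scalar parameters are illustrative placeholders.

```python
import numpy as np

def eta_from_optimality(u2, lam1, lam3, eta0, alpha, gamma):
    """Recover eta from formula (44):
    eta = (alpha / gamma) * u2 * (lambda1 - lambda3) + eta0,
    applied pointwise in time to sampled arrays."""
    return (alpha / gamma) * u2 * (lam1 - lam3) + eta0

# Illustrative samples at two time points.
u2 = np.array([1.0, 2.0])
lam1 = np.array([0.3, 0.1])
lam3 = np.array([0.1, 0.1])
eta = eta_from_optimality(u2, lam1, lam3, eta0=0.5, alpha=0.2, gamma=0.1)
# Where lambda1 == lambda3 the update leaves the initial guess unchanged.
```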

Finite Element Discretization

For the solution of (33) we will use the finite element discretization and consider a partition \({\mathcal {J}}_{\tau } = \{J\}\) of the time domain \(\Omega _T = [0,T]\) into time subintervals \(J=(t_{k-1},t_k]\) with time step \(\tau _k = t_k - t_{k-1}\). We also define the piecewise-constant time-mesh function \(\tau\) such that

$$\begin{aligned} \tau (t) = \tau _k \quad \forall t \in J, \,\,\forall J \in {\mathcal {J}}_{\tau }. \end{aligned}$$
(45)
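A minimal sketch of a nonuniform partition \({\mathcal {J}}_{\tau }\) and its mesh function (45); the node values are an illustrative assumption.

```python
import numpy as np

# Nonuniform time partition of [0, T] with T = 3 (illustrative nodes).
t_nodes = np.array([0.0, 1.0, 1.5, 1.75, 3.0])
tau = np.diff(t_nodes)                  # tau_k = t_k - t_{k-1}

def mesh_function(t: float) -> float:
    """Return tau(t) = tau_k for the subinterval (t_{k-1}, t_k]
    containing t, as in (45)."""
    k = np.searchsorted(t_nodes, t, side="left")  # right endpoint index
    return tau[max(k, 1) - 1]
```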

For discretization of the state and adjoint problems we define the finite element spaces \(W_{\tau }^{u}\subset H_{u}^{1}\left( \Omega _T\right)\) and \(W_{\tau }^{\lambda }\) \(\subset H_{\lambda } ^{1}\left( \Omega _T\right)\) for u and \(\lambda\), respectively, as

$$\begin{aligned} W_{\tau }^{u}&=\{f\in H_{u}^{1}: f|_J \in P^1(J)\quad \forall J \in J_{\tau }\},\nonumber \\ W_{\tau }^{\lambda }&=\{f\in H_{\lambda }^{1}: f|_J \in P^1(J)\quad \forall J \in J_{\tau } \}. \end{aligned}$$
(46)

For the function \(\eta (t)\) we also introduce the finite element space \(W_{\tau }^{\eta }\subset L_2\left( \Omega _T\right)\) consisting of piecewise constant functions

$$\begin{aligned} W_{\tau }^{\eta } =\{f\in L_{2}\left( \Omega _T\right) : f|_J \in P^0(J) \quad \forall J \in J_{\tau }\}. \end{aligned}$$
(47)

Although we use different finite element spaces for u, \(\lambda\) and \(\eta\), all of them are finite dimensional, and all norms in finite dimensional spaces are equivalent. Next we denote \(U_{\tau }= W_{\tau }^{u}\times W_{\tau }^{\lambda }\times W_{\tau }^{\eta }\) such that \(U_{\tau }\subset U\).

Now the finite element method for (33) is: find \(v_{\tau }\in U_{\tau }\) such that

$$\begin{aligned} L^{\prime }\left( v_{\tau };\bar{v}\right) =0,\quad \forall \overline{v}\in U_{\tau }. \end{aligned}$$
(48)

Since the forward (1)–(2) and adjoint (39) problems are nonlinear, their solutions can be found by Newton’s method. Using the discretization

$$\begin{aligned} \frac{\partial u}{\partial t} = \frac{u^{k+1} - u^k}{\tau _k} \end{aligned}$$

the variational formulation of the forward problem (1)–(2) for all \(\bar{u} \in H_u^1(\Omega _T)\) is:

$$\begin{aligned} (u^{k+1}, \bar{u}) -(u^k, \bar{u}) - (\tau _k f(u^{k+1}), \bar{u}) = 0. \end{aligned}$$
(49)

The finite element method for (1)–(2) will be: find \(u_\tau ^{k+1} \in W_{\tau }^{u}\) such that for all \(\bar{u} \in W_{\tau }^{u}\)

$$\begin{aligned} (u_\tau ^{k+1}, \bar{u}) -(u_\tau ^k, \bar{u}) - ( \tau _k f(u_\tau ^{k+1}), \bar{u}) = 0. \end{aligned}$$
(50)

Denoting

$$\begin{aligned} \tilde{u}&= u_\tau ^{k+1}, \nonumber \\ V(\tilde{u})&= \tilde{u} - \tau _k f(\tilde{u}) - u_\tau ^k \end{aligned}$$
(51)

we can rewrite (50) as

$$\begin{aligned} (V(\tilde{u}), \bar{u}) = 0. \end{aligned}$$
(52)

To solve \(V(\tilde{u})=0\), Newton’s method can be used for the iterations \(n=1,2,\ldots\) [17]

$$\begin{aligned} \tilde{u}^{n+1} = \tilde{u}^n - [ V'(\tilde{u}^n)]^{-1} \cdot V(\tilde{u}^n). \end{aligned}$$
(53)

Here, we can compute the Jacobian \(V'(\tilde{u}^n)\) via definition of \(V(\tilde{u})\) in (51) as

$$\begin{aligned} V'(\tilde{u}^n) = I - \tau _k f'(\tilde{u}^n), \end{aligned}$$

where I is the identity matrix, \(f'(\tilde{u}^n)\) is the Jacobian of f (the right hand side of the forward problem (1)) at \(\tilde{u}^n\) and n is the iteration number in Newton’s method. The explicit entries in the Jacobian \(f'(\tilde{u}^n)\) for system (1) are computed as

$$\begin{aligned} f'(\tilde{u}^n) =\left[ \begin{array}{cccc} \dfrac{\partial f_1}{\partial u_1} &{}\quad \dfrac{\partial f_1}{\partial u_2} &{} \dfrac{\partial f_1}{\partial u_3} &{}\quad \dfrac{\partial f_1}{\partial u_4} \\ \dfrac{\partial f_2}{\partial u_1} &{}\quad \dfrac{\partial f_2}{\partial u_2} &{} \dfrac{\partial f_2}{\partial u_3} &{}\quad \dfrac{\partial f_2}{\partial u_4} \\ \dfrac{\partial f_3}{\partial u_1} &{}\quad \dfrac{\partial f_3}{\partial u_2} &{} \dfrac{\partial f_3}{\partial u_3} &{}\quad \dfrac{\partial f_3}{\partial u_4}\\ \dfrac{\partial f_4}{\partial u_1} &{}\quad \dfrac{\partial f_4}{\partial u_2} &{} \dfrac{\partial f_4}{\partial u_3} &{}\quad \dfrac{\partial f_4}{\partial u_4} \end{array}\right] (\tilde{u}^n) = \left[ \begin{array}{cccc} -k{u_\tau }_{4}^{n}-\mu &{}\quad (\eta \alpha +b) &{}\quad 0 &{}\quad -k {u_\tau }_{1}^{n} \\ k {u_\tau }_{4}^{n} &{}\quad -(\mu _1+\alpha +b) &{}\quad 0 &{}\quad k {u_{\tau }}_{1}^{n} \\ 0 &{}\quad \quad (1-\eta )\alpha &{}\quad -\delta &{} 0 \\ 0 &{}\quad 0 &{}\quad N\delta &{}\quad -c \end{array}\right] . \end{aligned}$$

We note that the finite element method (48) will work even in this case; see details in [24].
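As a concrete illustration of one implicit Euler step (50) solved by the Newton iteration (53), the sketch below assembles the right-hand side f of system (1) (as read off from equation (35)) and the Jacobian \(f'(\tilde{u}^n)\) exactly as displayed above. The parameter values are illustrative placeholders, not the Table 1 values.

```python
import numpy as np

# Illustrative parameter values (placeholders, not the Table 1 values).
s, mu, k, mu1, alpha, b, delta, c, N, eta = \
    10.0, 0.02, 1e-3, 0.2, 0.1, 0.05, 0.3, 2.0, 100.0, 0.5

def f(u):
    """Right-hand side of system (1), read off from equation (35)."""
    u1, u2, u3, u4 = u
    return np.array([
        s - k * u1 * u4 - mu * u1 + (eta * alpha + b) * u2,
        k * u1 * u4 - (mu1 + alpha + b) * u2,
        (1.0 - eta) * alpha * u2 - delta * u3,
        N * delta * u3 - c * u4,
    ])

def f_jac(u):
    """Jacobian f'(u), matching the displayed matrix entries."""
    u1, _, _, u4 = u
    return np.array([
        [-k * u4 - mu, eta * alpha + b, 0.0, -k * u1],
        [k * u4, -(mu1 + alpha + b), 0.0, k * u1],
        [0.0, (1.0 - eta) * alpha, -delta, 0.0],
        [0.0, 0.0, N * delta, -c],
    ])

def implicit_euler_step(u_prev, tau_k, tol=1e-12, max_iter=20):
    """Solve V(u) = u - tau_k * f(u) - u_prev = 0 by Newton's method (53),
    with V'(u) = I - tau_k * f'(u)."""
    u = u_prev.copy()                      # starting value for Newton
    for _ in range(max_iter):
        V = u - tau_k * f(u) - u_prev
        Vp = np.eye(4) - tau_k * f_jac(u)
        du = np.linalg.solve(Vp, V)
        u = u - du
        if np.linalg.norm(du) < tol:
            break
    return u

u_next = implicit_euler_step(np.array([100.0, 1.0, 1.0, 1.0]), tau_k=0.1)
```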

In a similar way Newton’s method can be derived for the solution of the adjoint problem (39). Since we solve the adjoint problem backwards in time starting from the known \(\lambda (T)=0\), we discretize the time derivative as

$$\begin{aligned} \frac{\partial \lambda }{\partial t} = \frac{\lambda ^{k+1} - \lambda ^{k}}{\tau _k} \end{aligned}$$
(54)

for the already known \(\lambda ^{k+1}\), and write the variational formulation of the adjoint problem for all \(\bar{\lambda } \in H_\lambda ^1(\Omega _T)\) as

$$\begin{aligned} -( - \lambda ^{k+1} + \lambda ^{k} + \tau _k \tilde{f}(\lambda ^k) ,\bar{\lambda }) = 0. \end{aligned}$$
(55)

The finite element method for (39) will be: find \(\lambda _\tau ^k \in W_{\tau }^{\lambda }\) such that for all \(\bar{\lambda } \in W_{\tau }^{\lambda }\)

$$\begin{aligned} ( \lambda _\tau ^{k} - \lambda _\tau ^{k+1} + \tau _k \tilde{f} (\lambda _\tau ^k) ,\bar{\lambda }) = 0. \end{aligned}$$
(56)

Denoting

$$\begin{aligned} \tilde{\lambda }&= \lambda _\tau ^{k}, \nonumber \\ \tilde{V}(\tilde{\lambda })&= \tilde{\lambda } + \tau _k \tilde{f}(\tilde{\lambda }) -\lambda _\tau ^{k+1}, \end{aligned}$$
(57)

we can rewrite (56) for all \(\bar{\lambda } \in H_\lambda ^1(\Omega _T)\) as

$$\begin{aligned} (\tilde{V}(\tilde{\lambda }), \bar{\lambda }) = 0. \end{aligned}$$
(58)

To solve \(\tilde{V}(\tilde{\lambda })=0\) we again use Newton’s method for iterations \(n=1,2,\ldots\)

$$\begin{aligned} \tilde{\lambda }^{n+1} = \tilde{\lambda }^n - [ \tilde{V}' (\tilde{\lambda }^n)]^{-1} \cdot \tilde{V}(\tilde{\lambda }^n). \end{aligned}$$
(59)

We compute \(\tilde{V}'(\tilde{\lambda }^n)\) using the definition of \(\tilde{V}(\tilde{\lambda })\) in (57) as

$$\begin{aligned} \tilde{V}'(\tilde{\lambda }^n) = I + \tau _k \tilde{f}'(\tilde{\lambda }^n), \end{aligned}$$

where I is the identity matrix, \(\tilde{f}'(\tilde{\lambda }^n)\) is the Jacobian of \(\tilde{f}\) (the right hand side of the adjoint problem (39)) at \(\tilde{\lambda }^n\), and n is the iteration number in Newton’s method. The explicit entries in the Jacobian \(\tilde{f}'(\tilde{\lambda }^n)\) for the adjoint system (39) are given by

$$\begin{aligned} \tilde{f}'(\tilde{\lambda }^n) =\left[ \begin{array}{cccc} \dfrac{\partial \tilde{f}_1}{\partial \lambda _1} &{}\quad \dfrac{\partial \tilde{f}_1}{\partial \lambda _2} &{}\quad \dfrac{\partial \tilde{f}_1}{\partial \lambda _3} &{}\quad \dfrac{\partial \tilde{f}_1}{\partial \lambda _4} \\ \dfrac{\partial \tilde{f}_2}{\partial \lambda _1} &{}\quad \dfrac{\partial \tilde{f}_2}{\partial \lambda _2} &{}\quad \dfrac{\partial \tilde{f}_2}{\partial \lambda _3} &{}\quad \dfrac{\partial \tilde{f}_2}{\partial \lambda _4} \\ \dfrac{\partial \tilde{f}_3}{\partial \lambda _1} &{}\quad \dfrac{\partial \tilde{f}_3}{\partial \lambda _2} &{}\quad \dfrac{\partial \tilde{f}_3}{\partial \lambda _3} &{}\quad \dfrac{\partial \tilde{f}_3}{\partial \lambda _4}\\ \dfrac{\partial \tilde{f}_4}{\partial \lambda _1} &{}\quad \dfrac{\partial \tilde{f}_4}{\partial \lambda _2} &{}\quad \dfrac{\partial \tilde{f}_4}{\partial \lambda _3} &{}\quad \dfrac{\partial \tilde{f}_4}{\partial \lambda _4} \end{array}\right] (\tilde{\lambda }^n) =\left[ \begin{array}{cccc} k {u_\tau }_{4}^n +\mu &{}\quad -k {u_\tau }_{4}^n &{}\quad 0 &{}\quad 0 \\ -(\eta \alpha +b) &{} \quad \mu _1+\alpha +b &{}\quad (\eta - 1)\alpha &{}\quad 0 \\ 0 &{}\quad 0 &{}\quad \delta &{}\quad -N\delta \\ k {u_\tau }_{1}^n &{}\quad -k {u_\tau }_{1}^n &{}\quad 0 &{}\quad c \end{array}\right] . \end{aligned}$$

Taking into account the values of parameters given in Table 1, we observe that \(\det f'(\tilde{u}^n) \ne 0\) as well as \(\det \tilde{f}'(\tilde{\lambda }^n) \ne 0\). Thus, schemes (53), (59) will converge given appropriate starting values \(\tilde{u}^1\) and \(\tilde{\lambda }^1\), respectively. For a study of convergence of iterative methods we refer to [4].

A Posteriori Error Estimates

We consider the function \(\eta \in C(\Omega _T)\) as a minimizer of the Lagrangian (31), and \(\eta _{\tau } \in W_{\tau }^\eta\) as its finite element approximation. Let us assume that we know a good approximation to the exact solution \(\eta ^* \in C(\Omega _T)\). Let \(g^{*}(t)\) be the exact data and let the function \(g_{\sigma }(t)\) represent the error in these data. We assume that measurements g(t) in (28) are given with some small noise level \(\sigma\) such that

$$\begin{aligned} g(t)=g^{*}(t)+g_{\sigma }(t);\,\, g^{*},g_{\sigma }\in L_{2} \left( \Omega _{T}\right) ,\, \left\| g_{\sigma }\right\| _{L_{2} \left( \Omega _{T}\right) }\le \sigma . \end{aligned}$$
(60)

Following [14], we assume that

$$\begin{aligned} \gamma = \gamma (\sigma ) = \sigma ^{2 \mu },\,\,\mu \in (0,1/4), \,\,\sigma \in (0,1) \end{aligned}$$
(61)

and

$$\begin{aligned} \Vert \eta _0 - \eta ^* \Vert \le \frac{\sigma ^{3\mu }}{3}, \end{aligned}$$
(62)

where \(\eta ^*\) is the exact solution of PIP with the exact data \(g^*(t)\). Let

$$\begin{aligned} V_\varepsilon (\eta ) = \{ x \in C(\Omega _T): \Vert \eta - x \Vert < \varepsilon \}. \end{aligned}$$
(63)

Assume that for all \(\eta \in V_1(\eta ^*)\) the operator

$$\begin{aligned} F(\eta ) = \frac{1}{2} \int \limits _{T_1}^{T_2}(u_4(\eta , t)-g(t))^{2}z_{\zeta } \left( t\right) \,\mathrm{d}t \end{aligned}$$
(64)

has the Fréchet derivative \(F'(\eta )\) which is bounded and Lipschitz continuous in \(V_1(\eta ^*)\) with constants \(D_1, D_2 > 0\):

$$\begin{aligned} \Vert F'(\eta ) \Vert&\le D_1 \quad \forall \eta \in V_1(\eta ^*), \nonumber \\ \Vert F'(\eta _1) - F'(\eta _2) \Vert&\le D_2 \Vert \eta _1 - \eta _2 \Vert \quad \forall \eta _1, \eta _2 \in V_1(\eta ^*). \end{aligned}$$
(65)

An A Posteriori Error Estimate for the Tikhonov Functional

In Theorem 1 we derive an a posteriori error estimate for the error in the Tikhonov functional (29) on the finite element time partition \({\mathcal {J}}_{\tau }\).

Theorem 1

We assume that there exists a minimizer \(\eta \in C(\Omega _T)\) of the functional \(J(\eta )\) defined by (29). We assume also that there exists a finite element approximation \(\eta _{\tau } \in W_{\tau }^{\eta }\) of this minimizer. Then the following approximate a posteriori error estimate for the error \(e=|| J(\eta ) - J(\eta _{\tau }) ||_{L^2(\Omega _T)}\) in the Tikhonov functional (29) holds true

$$\begin{aligned} e= || J(\eta ) - J(\eta _{\tau }) ||_{L^2(\Omega _T)} \le C_I C \left\| J^{\prime }(\eta _{\tau })\right\| _{L^2(\Omega _T)} || \tau \eta _{\tau } ||_{L_2(\Omega _T)} \end{aligned}$$
(66)

with positive constants \(C_I, C > 0\) and where

$$\begin{aligned} J^{\prime }(\eta _{\tau }) = \gamma (\eta _{\tau } -\eta ^{0}) -\alpha {u_2}_{\tau } ({\lambda _1}_{\tau } - {\lambda _3}_{\tau }). \end{aligned}$$
(67)

Proof

We use the definition of the Fréchet derivative to get

$$\begin{aligned} J(\eta ) -J(\eta _{\tau }) =J'(\eta _{\tau })(\eta - \eta _{\tau }) + R(\eta , \eta _{\tau }), \end{aligned}$$
(68)

where \(R(\eta , \eta _{\tau }) =O((\eta - \eta _{\tau })^2),\,\, (\eta - \eta _{\tau }) \rightarrow 0 \,\,\forall \eta , \eta _{\tau } \in W_{\tau }^{\eta }\). The term \(R(\eta , \eta _{\tau })\) is small because of assumption (62): we assume that \(\eta _{\tau }\) is the minimizer of the Tikhonov functional on the mesh \({\mathcal {J}}_{\tau }\) and this minimizer is located in a small neighborhood of the regularized solution \(\eta\). Because of that we neglect R in (68). Next, we use the splitting

$$\begin{aligned} \eta - \eta _{\tau } = \eta - \eta _\tau ^I + \eta _\tau ^I - \eta _{\tau } \end{aligned}$$
(69)

for \(\eta - \eta _{\tau }\) in (68) together with Galerkin orthogonality

$$\begin{aligned} J'(\eta _{\tau })( \eta _\tau ^I - \eta _{\tau }) = 0, \quad \forall \eta _\tau ^I, \eta _{\tau } \in W_{\tau }^{\eta } \end{aligned}$$
(70)

to get

$$\begin{aligned} J(\eta ) -J(\eta _{\tau }) \le J'(\eta _{\tau })(\eta - \eta _\tau ^I). \end{aligned}$$
(71)

Here, \(\eta _\tau ^I\) is a standard interpolant of \(\eta\) on the mesh \({\mathcal {J}}_{\tau }\) [23]. Taking norms in (71), we obtain

$$\begin{aligned} ||J(\eta ) -J(\eta _{\tau }) ||_{L^2(\Omega _T)} \le ||J' (\eta _{\tau })||_{L^2(\Omega _T)} ||\eta - \eta _\tau ^I||_{L^2(\Omega _T)}, \end{aligned}$$
(72)

where the term \(||\eta - \eta _\tau ^I||_{L^2(\Omega _T)}\) can be estimated via the interpolation estimate with the constant \(C_I\)

$$\begin{aligned} ||\eta - \eta _\tau ^I||_{L^2(\Omega _T)} \le C_I \left\| \tau \eta \right\| _{H^1(\Omega _T)}. \end{aligned}$$
(73)

We can estimate \(||\tau \,\eta ||_{H^1(\Omega _T)}\) in (73) as

$$\begin{aligned} || \tau \,\eta ||_{H^1(\Omega _T)}&\le \sum _J || \tau _k \eta ||_{H^1(J)} =\sum _J \left\| \left( \eta + \frac{ \partial \eta }{ \partial t} \right) \tau _k \right\| _{L_2(J)} \nonumber \\&\le \sum _J \left( || \eta _\tau \tau _k ||_{L_2(J)} +\left\| \frac{[\eta _\tau ]}{\tau _k} \tau _k \right\| _{L_2(J)} \right) \nonumber \\&\le || \tau \eta _\tau ||_{L_2(\Omega _T)} +\sum _J \left\| [\eta _\tau ] \right\| _{L_2(J)}. \end{aligned}$$
(74)

Here, \([\eta _\tau ]\) denotes the jump of the function \(\eta _\tau\) across the node \(t_k\) between the time intervals \([t_{k-1}, t_k]\) and \([t_k, t_{k+1}]\), defined as

$$\begin{aligned} {[}\eta _{\tau }] = \eta _{\tau }^+ - \eta _{\tau }^- \end{aligned}$$

with functions \(\eta _{\tau }^-, \eta _{\tau }^+\) computed on \([t_{k-1}, t_k]\) and \([t_k, t_{k+1}]\), respectively.
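For a piecewise constant \(\eta_\tau\) the jumps \([\eta_\tau]\) at the interior nodes reduce to differences of neighboring interval values, as in this minimal sketch with illustrative values:

```python
import numpy as np

# eta_vals[k] holds the constant value of eta_tau on the k-th subinterval;
# the jump [eta_tau] = eta_tau^+ - eta_tau^- at each interior node t_k is
# then the difference of the values on the two adjacent subintervals.
eta_vals = np.array([0.5, 0.5, 0.8, 0.8, 0.3])   # illustrative values
jumps = eta_vals[1:] - eta_vals[:-1]              # [eta_tau] at each t_k
# Jumps vanish wherever eta_tau is continuous across the node.
```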

Now we substitute above estimate into (72) to get

$$\begin{aligned} ||J(\eta ) -J(\eta _{\tau }) ||_{L^2(\Omega _T)} \le C_I \left\| J^{\prime }(\eta _{\tau })\right\| _{L^2(\Omega _T)} \left( || \tau \eta _\tau ||_{L_2(\Omega _T)} + \sum _J \left\| [\eta _\tau ] \right\| _{L_2(J)} \right) \quad \forall \eta _\tau \in W_\tau ^\eta . \end{aligned}$$
(75)

In the case when \(\eta _\tau \in W_\tau ^\eta\) terms with jumps in time disappear and we have a posteriori error estimate

$$\begin{aligned} ||J(\eta ) -J(\eta _{\tau }) ||_{L^2(\Omega _T)} \le C_I \left\| J^{\prime }(\eta _{\tau })\right\| _{L^2(\Omega _T)} || \tau \eta _\tau ||_{L_2(\Omega _T)} \, \forall \eta _\tau \in W_\tau ^\eta . \end{aligned}$$
(76)

\(\square\)

A Posteriori Error Estimate of the Minimizer on Refined Meshes

Theorems 2 and 3 present two a posteriori error estimates for a minimizer \(\eta\) of the functional (29). The proof of the next theorem follows from the proof of Theorem 5.1 of [29].

Theorem 2

Let \(\eta _\tau \in W_\tau ^\eta\) be a finite element approximation on the finite element mesh \(J_\tau\) of the minimizer \(\eta \in L^2(\Omega _T)\) of the functional (29) with the mesh function \(\tau (t)\). Then there exists a Lipschitz constant \(D > 0\) such that

$$\begin{aligned} \left\| J^{\prime }(\eta _1) - J^{\prime }(\eta _2) \right\| \le D \left\| \eta _1 - \eta _2\right\| ,\forall \eta _1, \eta _2 \in L_2(\Omega _T), \end{aligned}$$
(77)

and an interpolation constant \(C_I\) independent of \(\tau\) such that the following a posteriori error estimate for the minimizer \(\eta\) holds true

$$\begin{aligned} || \eta _\tau - \eta ||_{L_2(\Omega _T)} \le \frac{D}{\gamma } C_I || \tau \eta _\tau ||_{L_2(\Omega _T)} \, \forall \eta _\tau \in W_\tau ^\eta . \end{aligned}$$
(78)

Proof

Let \(\eta _\tau\) be the minimizer of the Tikhonov functional (29). The existence and uniqueness of this minimizer is guaranteed by conditions (62) and follows from Theorem 1.9.1.2 of [13]. By this theorem, the functional (29) is strongly convex on the space \(L_2(\Omega _T)\) with the strong convexity constant \(\gamma\). This implies that

$$\begin{aligned} \gamma \left\| \eta _\tau - \eta \right\| _{L_2(\Omega _T)} ^{2}\le | \left( J^{\prime }\left( \eta _\tau \right) - J^{\prime }\left( \eta \right) , \eta _\tau -\eta \right) |. \end{aligned}$$
(79)

Here, \(J^{\prime }(\eta _\tau ), J^{\prime }\left( \eta \right)\) are the Fréchet derivatives of the functional (29) given by (43) for respective \(\eta\).

Since \(\eta\) is the minimizer of the Tikhonov functional (29) then

$$\begin{aligned} \left( J^{\prime }\left( \eta \right) , \bar{\eta } \right) =0, \quad \forall \bar{\eta } \in L_2(\Omega _T). \end{aligned}$$

Using the splitting

$$\begin{aligned} \eta _\tau - \eta =\left( \eta _\tau - \eta _\tau ^I \right) +\left( \eta _\tau ^I - \eta \right) , \end{aligned}$$
(80)

where \(\eta _\tau ^I\) is an interpolant of \(\eta\), together with the Galerkin orthogonality principle for all \(\eta _\tau , \eta _\tau ^I \in W_\tau ^\eta\)

$$\begin{aligned} \left( J^{\prime }\left( \eta _\tau \right) - J^{\prime }\left( \eta \right) , \eta _\tau - \eta _\tau ^I \right) =0 \end{aligned}$$
(81)

in (79) we obtain

$$\begin{aligned} \gamma \left\| \eta _\tau - \eta \right\| _{L_2(\Omega _T)}^{2}\le |\left( J^{\prime }\left( \eta _\tau \right) - J^{\prime }\left( \eta \right) , \eta _\tau ^I - \eta \right) | . \end{aligned}$$
(82)

We can estimate the right hand side of (82) using (77) as

$$\begin{aligned} | \left( J^{\prime }\left( \eta _\tau \right) - J^{\prime }\left( \eta \right) , \eta _\tau ^I - \eta \right) | \le D || \eta _\tau - \eta ||_{L_2(\Omega _T)} || \eta _\tau ^I - \eta ||_{L_2(\Omega _T)}. \end{aligned}$$

Substituting above equation into (82) we obtain

$$\begin{aligned} || \eta _\tau - \eta ||_{L_2(\Omega _T)} \le \frac{D}{\gamma } || \eta _\tau ^I - \eta ||_{L_2(\Omega _T)} . \end{aligned}$$
(83)

Using the interpolation property

$$\begin{aligned} || \eta _\tau ^I - \eta ||_{L_2(\Omega _T)} \le C_I || \tau \,\eta ||_{H^1(\Omega _T)} \end{aligned}$$
(84)

we obtain a posteriori error estimate for the regularized solution with the interpolation constant \(C_I\):

$$\begin{aligned} || \eta _\tau - \eta ||_{L_2(\Omega _T)} \le \frac{D}{\gamma } || \eta _\tau ^I - \eta ||_{L_2(\Omega _T)} \le \frac{D}{\gamma } C_I || \tau \,\eta ||_{H^1(\Omega _T)}. \end{aligned}$$
(85)


We can estimate \(|| \tau \,\eta ||_{H^1(\Omega _T)}\) in (85) similar to (74). Substituting this estimate into the right hand side of (85) we get

$$\begin{aligned} || \eta _\tau - \eta ||_{L_2(\Omega _T)} \le \frac{D}{\gamma } C_I \left( || \tau \eta _\tau ||_{L_2(\Omega _T)} +\left\| [\eta _\tau ] \right\| _{L_2(\Omega _T)} \right) \, \forall \eta _\tau \in W_\tau ^\eta . \end{aligned}$$

In the case when \(\eta _\tau \in W_\tau ^\eta\) terms with jumps in time \([\eta _\tau ]\) disappear and we have a posteriori error estimate

$$\begin{aligned} || \eta _\tau - \eta ||_{L_2(\Omega _T)} \le \frac{D}{\gamma } C_I || \tau \eta _\tau ||_{L_2(\Omega _T)}. \end{aligned}$$

\(\square\)

Theorem 3

Let \(\eta _\tau \in W_\tau ^\eta\) be a finite element approximation on the finite element mesh \(J_\tau\) of the minimizer \(\eta \in L^2(\Omega _T)\) of the functional (29) with the mesh function \(\tau (t)\). Then there exists an interpolation constant \(C_I\) independent of \(\tau\) such that the following a posteriori error estimate for the minimizer \(\eta\) and the regularization parameter \(\gamma \ne 0\) holds

$$\begin{aligned} || \eta _\tau - \eta ||_{L_2(\Omega _T)} \le \sqrt{ \frac{\Vert R(\eta _\tau )\Vert }{\gamma } C_I || \tau \eta _\tau ||_{L_2(\Omega _T)}} \, \forall \eta _\tau \in W_\tau ^\eta , \end{aligned}$$
(86)

where \(R(\eta _\tau )\) is the residual defined as

$$\begin{aligned} R(\eta _\tau )(t) = \gamma (\eta _\tau - \eta ^{0})(t) + \alpha {u_2}_\tau ({\lambda _{3}}_\tau - {\lambda _1}_\tau )(t). \end{aligned}$$
(87)

Proof

Let again \(\eta _\tau\) be the minimizer of the Tikhonov functional (29). Strong convexity of the functional (29) on the space \(L_2(\Omega _T)\) implies that

$$\begin{aligned} \gamma \left\| \eta _\tau - \eta \right\| _{L_2(\Omega _T)} ^{2}\le |\left( J^{\prime }\left( \eta _\tau \right) - J^{\prime }\left( \eta \right) , \eta _\tau -\eta \right) |. \end{aligned}$$
(88)

Applying splitting (80) to (88) we obtain (82) where the term \(J'(\eta _\tau )\) can be estimated via (43). More precisely, when \(u(t), \lambda (t)\) are exact functions, we have for \(\eta _\tau\):

$$\begin{aligned} L(v(\eta _\tau )) = J(\eta _\tau ), \end{aligned}$$

and thus, for exact functions \(u(t), \lambda (t)\) one can write

$$\begin{aligned} J'(\eta _\tau ) = L'(\eta _\tau ) = \gamma (\eta _\tau - \eta ^{0})(t) + \alpha {u_2}_\tau ({\lambda _{3}}_\tau - {\lambda _1}_\tau )(t). \end{aligned}$$
(89)

From (88) and (89) (noting that \(J'(\eta )=0\)) we get

$$\begin{aligned} || \eta _\tau - \eta ||_{L_2(\Omega _T)} \le \sqrt{\frac{\Vert R(\eta _\tau )\Vert }{\gamma } || \eta _\tau ^I - \eta ||_{L_2(\Omega _T)}}, \end{aligned}$$
(90)

where \(R(\eta _\tau )\) is the residual defined as in (87).

Using the interpolation property (84) and then the estimate (74), we obtain the following a posteriori error estimate for the regularized solution with the interpolation constant \(C_I\):

$$\begin{aligned} || \eta _\tau - \eta ||_{L_2(\Omega _T)} \le \sqrt{\frac{\Vert R(\eta _\tau )\Vert }{\gamma } C_I \left( || \tau \eta _\tau ||_{L_2(\Omega _T)} +\left\| [\eta _\tau ] \right\| _{L_2(\Omega _T)} \right) } \, \forall \eta _\tau \in W_\tau ^\eta . \end{aligned}$$

and since \(\eta _\tau \in W_\tau ^\eta\) the above estimate reduces to

$$\begin{aligned} || \eta _\tau - \eta ||_{L_2(\Omega _T)} \le \sqrt{\frac{\Vert R(\eta _\tau )\Vert }{\gamma } C_I || \tau \eta _\tau ||_{L_2(\Omega _T)}} \, \forall \eta _\tau \in W_\tau ^\eta . \end{aligned}$$

\(\square\)

Algorithms for Solution of PIP

Here we present two algorithms for solution of PIP:

  • CGA—usual conjugate gradient algorithm on a coarse time partition,

  • ACGA—time-adaptive conjugate gradient algorithm which minimizes the Tikhonov functional (29) on locally refined meshes in time.

We denote the nodal value of the gradient at the observation points \(\{t_i\}\) by \(G^{m}(t_i)\) and compute it according to (43) as

$$\begin{aligned} G^{m}(t_i) = \gamma (\eta _\tau ^m(t_i) - \eta _\tau ^0(t_i)) +\alpha {u_2}_\tau ^m(t_i) ({\lambda _3}_\tau ^m(t_i) - {\lambda _1}_\tau ^m(t_i)). \end{aligned}$$
(91)

The approximate solutions \({u_2}_\tau ^{m}\) and \({\lambda _{1,3}}_\tau ^m\) are computed by Newton’s method with \(\eta :=\eta _\tau ^{m}\). A sequence \(\{{\eta _\tau }^{m} \}_{m=1,\ldots ,M}\) of approximations to \(\eta\) is computed as follows

$$\begin{aligned} \eta _\tau ^{m+1}(t_i) = \eta _\tau ^{m}(t_i) + r^m d^m(t_i), \end{aligned}$$
(92)

with

$$\begin{aligned} d^m(t_i)= -G^m(t_i) + \beta ^m d^{m-1}(t_i), \end{aligned}$$

and

$$\begin{aligned} \beta ^m = \frac{|| G^m(t_i)||^2}{|| G^{m-1}(t_i)||^2}, \end{aligned}$$

where \(d^0(t_i)= -G^0(t_i)\) and \(G^{m}(t_i)\) is the gradient vector computed by (91) at the time moments \(t_i\). In (92) the parameter \(r^m\) is the step size in the gradient update at iteration m, which is computed as

$$\begin{aligned} r^m = -\frac{(G^m, d^m)}{\gamma \Vert d^m\Vert ^2}. \end{aligned}$$
(93)
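The conjugate gradient update (92)–(93) can be sketched as below; the gradient values and the regularization parameter are illustrative placeholders.

```python
import numpy as np

def cg_update(eta_m, G_m, d_prev, G_prev, gamma):
    """One conjugate gradient update following (92)-(93):
    beta^m = ||G^m||^2 / ||G^{m-1}||^2,
    d^m    = -G^m + beta^m d^{m-1},
    r^m    = -(G^m, d^m) / (gamma ||d^m||^2),
    eta^{m+1} = eta^m + r^m d^m."""
    beta = np.dot(G_m, G_m) / np.dot(G_prev, G_prev)
    d = -G_m + beta * d_prev
    r = -np.dot(G_m, d) / (gamma * np.dot(d, d))
    return eta_m + r * d, d

# First step: with d_prev = 0 the update reduces to a steepest-descent
# step of size 1/gamma in the direction -G^0.
G0 = np.array([0.2, -0.1])
eta1, d0 = cg_update(np.zeros(2), G0, np.zeros(2), G0, gamma=0.5)
```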
[Figure a: Algorithm 1, the conjugate gradient algorithm (CGA)]

In the adaptive algorithm ACGA we use Theorem 3 for the error \(e = \Vert \eta _\tau - \eta \Vert _{L_2(\Omega _T)}\) on locally refined meshes. More precisely, we first choose a tolerance \(0<\theta < 1\) and run the adaptive algorithm until

$$\begin{aligned} e = \Vert \eta _\tau - \eta \Vert _{L_2(\Omega _T)} \le \theta . \end{aligned}$$

For the time-mesh refinements we propose the following refinement procedure based on Theorem 3.

The Time Mesh Refinements Criterion

Refine the time-mesh \({\mathcal {J}}_{\tau }\) in neighborhoods of those time-mesh points \(t\in {\Omega _T}\) where the residual \(\left| R\left( \eta _{\tau }\right) \left( t\right) \right|\) defined in (87) attains its maximal values. More precisely, let \(\beta _{1}\in \left( 0,1\right)\) be a tolerance number. Refine the time-mesh in those subdomains of \({\Omega _T}\) where

$$\begin{aligned} \left| R(\eta _\tau ) \left( t\right) \right| \ge \beta _{1}\max _{\Omega _T}\left| R(\eta _\tau ) \left( t\right) \right| . \end{aligned}$$
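The criterion can be sketched as follows: mark every subinterval whose residual exceeds the threshold and bisect it. The per-interval residual values below are illustrative placeholders.

```python
import numpy as np

def refine_mesh(t_nodes, R_vals, beta1=0.7):
    """Bisect every subinterval whose residual magnitude satisfies
    |R| >= beta1 * max |R|; R_vals[k] is the residual on the k-th
    subinterval (t_k, t_{k+1}]."""
    threshold = beta1 * np.max(np.abs(R_vals))
    new_nodes = [t_nodes[0]]
    for k in range(len(R_vals)):
        if abs(R_vals[k]) >= threshold:          # marked: insert midpoint
            new_nodes.append(0.5 * (t_nodes[k] + t_nodes[k + 1]))
        new_nodes.append(t_nodes[k + 1])
    return np.array(new_nodes)

t_new = refine_mesh(np.array([0.0, 1.0, 2.0, 3.0]),
                    R_vals=np.array([0.1, 0.9, 0.2]))
# Only the middle interval (1, 2] exceeds the threshold and is refined.
```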

Using the above mesh refinement criterion, we propose the following time-adaptive algorithm for the computations:

Algorithm 2: The adaptive conjugate gradient algorithm (ACGA)
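The outer adaptive loop can be summarized structurally as follows. This is a sketch of the loop's shape only, not the authors' exact listing: `optimize` stands for the conjugate gradient Algorithm 1 on the current mesh and `residual` for the evaluation of (87), both problem-specific callables assumed here.

```python
import numpy as np

def acga(t, eta, optimize, residual, theta=1e-7, beta1=0.1, max_cycles=5):
    """Sketch of the ACGA outer loop: optimize on the current mesh, then
    locally bisect intervals adjacent to nodes with large residual."""
    for _ in range(max_cycles):
        eta = optimize(t, eta)              # Algorithm 1 on the current mesh
        R = np.abs(residual(t, eta))
        if R.max() <= theta:                # desired accuracy reached
            break
        extra = []                          # midpoints of marked intervals
        for i in np.flatnonzero(R >= beta1 * R.max()):
            if i > 0:
                extra.append(0.5 * (t[i - 1] + t[i]))
            if i < len(t) - 1:
                extra.append(0.5 * (t[i] + t[i + 1]))
        t_old, t = t, np.array(sorted(set(t) | set(extra)))
        eta = np.interp(t, t_old, eta)      # transfer eta to the refined mesh
    return t, eta
```

The key design point is the interpolation step: the control computed on the coarse mesh serves as the initial guess for the optimization on the refined mesh.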

Numerical Results

In this section we present several numerical results which show the performance and effectiveness of the time-adaptive reconstruction of the unknown parameter \(\eta (t), t \in [0,T]\) in the PIP using the ACGA algorithm. Numerical tests are performed in Matlab R2019b using the developed code for the solution of the studied problem, available for download at [31]. Numerical results for the reconstruction of the function \(\eta (t)\) using the usual conjugate gradient Algorithm 1 on non-refined time-meshes are presented in [25]. We note that observations of all functions \(u_i, i=1,2,3,4\) in system (1) were used in [25].

The goal of the numerical tests in this note is to determine the unknown function \(\eta (t)\) from observations of the virus population function \(u_4(t)\) in (1) on the interval \([T_1,T_2] \subset [0,T], 0 \le T_1 < T_2 \le T\). In all numerical tests it is assumed that the parameter \(\eta (t)\) satisfies condition (5) and is unknown in system (1), while all other parameters \(\{s, \mu , k, \mu _1, \alpha , b, \delta , c, N \}\) of this system are known; their values are chosen as in Table 1. The observation interval \([T_1, T_2]\) is such that \(T_2 = T = 300\), but \(T_1\) is taken differently in different tests, since observations of the virus population function \(u_4(t)\) can only be taken 3–9 weeks after the virus starts to replicate in the body of the host.

To generate the data \(u_4(t) = g(t)\), problem (1)–(2) was solved numerically with exact values of the test model function \(\eta (t)\), using Newton’s method presented in “The Parameter Identification Problem”. Next, random noise was added to the observed solution \(u_4(t)\) as

$$\begin{aligned} {u_4}_\sigma (t) = u_4(t)( 1 + \sigma \alpha ), \end{aligned}$$
(95)

where \(\sigma \in [0,1]\) is the noise level and \(\alpha \in [-1,1]\) is a random number.
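The noise model (95) can be realized in a few lines; here is a minimal Python sketch (the paper's code is in Matlab), drawing an independent \(\alpha\) at every time point:

```python
import numpy as np

def add_noise(u, sigma, rng=None):
    """Multiplicative noise model (95): u_sigma(t) = u(t) * (1 + sigma * alpha),
    with alpha drawn uniformly from [-1, 1] at each time point."""
    rng = np.random.default_rng(0) if rng is None else rng
    alpha = rng.uniform(-1.0, 1.0, size=np.shape(u))
    return u * (1.0 + sigma * alpha)
```

For example, `add_noise(u4, 0.1)` produces observations with up to 10% pointwise relative noise, matching the noise levels \(\sigma = 5\%, 10\%, 20\%, 40\%\) used in the tests below.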

In Algorithms 1 and 2 it is of vital importance to take an initial guess \(\eta ^0\) satisfying condition (62), which means that \(\eta ^0\) is located in a close neighborhood of the exact solution. This condition is fulfilled in our PIP, since we can compute values of the parameter \(\eta (t)\) explicitly on the initial non-refined time mesh using, for example, the third equation of system (1) as

$$\begin{aligned} \eta (t) = 1 - \frac{\frac{d u_3(t)}{d t} + \delta u_3(t)}{\alpha u_2(t)}. \end{aligned}$$
(96)

We used the following discretized version of this equation to obtain the initial guess \(\eta _\tau ^0\):

$$\begin{aligned} \eta _\tau ^0(t) \approx 1 - \frac{\frac{ {u_3}_\tau ^{k+1} -{u_3}_{\tau }^{k}}{ \tau _k } + \delta {u_3}_\tau ^k}{\alpha {u_2}_\tau ^k}. \end{aligned}$$
(97)

Here, \({u_3}_\tau ^{k+1}, {u_3}_\tau ^k, {u_2}_\tau ^k\) are known computed approximations of the functions \(u_3, u_2\) at time iterations \(k+1\) and k, respectively. In our computations we take the values of \({u_3}_\tau ^{k+1}, {u_3}_\tau ^k, {u_2}_\tau ^k\) from the solution of problem (1)–(2) with exact values of the test model function \(\eta (t)\) and then add noise of level \(\sigma\) as

$$\begin{aligned} {u_i}_\sigma (t) = u_i(t)( 1 + \sigma \alpha ), \quad i=2,3, \end{aligned}$$
(98)

where \(\sigma \in [0,1]\) is the noise level and \(\alpha \in [-1,1]\) is a random number. We note that the denominator in (97) does not approach zero, because \(\alpha = 0.4\) and \({u_2}_\tau (t) > 0 \,\, \forall t \in [0,T]\). To get a reasonable approximation \(\eta _\tau ^0\) for the initial guess \(\eta ^0\) in Algorithm 2, we assume that the noisy functions \({u_3}_{\sigma }, {u_2}_{\sigma }\) are known on the initial non-refined mesh, apply (97), and then use polynomial fitting of the obtained noisy data \(\eta _\tau ^0\) in order to smooth them. Finally, condition (5) was applied to the computed \(\eta _\tau ^0\) in order to ensure that \(\eta ^0(t)\) belongs to the set of admissible parameters \(M_\eta\). A second-order discretization of the first time derivative in (97) is also possible. We note that numerical differentiation of noisy data is an ill-posed problem; it is discussed in detail in Section 4 of [25].
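The three steps above (discretized formula (97), least-squares polynomial smoothing, projection onto the admissible set) can be sketched as follows. This is a Python illustration; the value of \(\delta\), the polynomial degree, and the bounds \([\eta_{\min}, \eta_{\max}]\) standing in for condition (5) are assumptions, not values from the paper.

```python
import numpy as np

def initial_guess(t, u2_noisy, u3_noisy, delta, alpha=0.4, deg=3,
                  eta_min=0.0, eta_max=0.9):
    """Initial guess via the discretized formula (97), then least-squares
    polynomial smoothing and projection onto assumed admissible bounds
    [eta_min, eta_max] (bounds and degree are illustrative)."""
    tau = np.diff(t)
    du3 = np.diff(u3_noisy) / tau                    # forward difference in (97)
    eta0 = 1.0 - (du3 + delta * u3_noisy[:-1]) / (alpha * u2_noisy[:-1])
    coeffs = np.polyfit(t[:-1], eta0, deg)           # smooth the noisy values
    return np.clip(np.polyval(coeffs, t), eta_min, eta_max)
```

The clipping at the end plays the role of projecting \(\eta_\tau^0\) onto the set of admissible parameters \(M_\eta\).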

All tests are performed with tolerance \(\theta = 10^{-7}\) in the ACGA algorithm and \(\beta _1 = 0.1\) in (94). The value of \(\beta _1\) is chosen such that it allows local refinements and avoids refinement of a very large region of the time mesh. All tests are performed for different \(T_1= 25, 50, 100\) on the time interval \([T_1, T_2] = [T_1, 300]\), which corresponds to the fact that the HIV virus can be detected in the first 3–9 weeks after infection.

The relative errors in the reconstructed parameter \(\eta (t)\) presented in the tables are measured in the \(L_2\)-norm and are computed as

$$\begin{aligned} e_{\eta } = \frac{\Vert \eta - \eta _\tau \Vert _{L_2(\Omega _T)}}{\Vert \eta \Vert _{L_2(\Omega _T)}}. \end{aligned}$$
(99)

A complete description of all numerical tests with reconstruction results is presented in the recent work [11].

Test 1

See Table 4.

Table 4 Test 1. Relative errors \(e_{\eta }\) computed for reconstruction of the function \(\eta (t)= 0.7 e^{-t} + 0.05, t \in [0, 300]\) for \(T_1= 25, 50, 100\) on different locally adaptively refined time-meshes
Fig. 4

Test 1. Left figures: simulated \({u_4}_\tau\) vs. noisy \({{u_4}_\tau }_\sigma\) on the different adaptively refined time meshes. Here, noisy observed data are presented by circles. Middle figures: least squares fitting for the noisy \(\eta _\tau\). Right figures: results of ACGA on adaptively refined meshes. Computations are done for noise level \(\sigma =10\%\) in \(u_4\) and for \(T_1 = 50\)

Fig. 5

Test 1. Left figures: simulated \({u_4}_\tau\) vs. noisy \({{u_4}_\tau }_\sigma\) on the different adaptively refined time meshes. Here, noisy observed data are presented by circles. Middle figures: least squares fitting for the noisy data \(\eta _\tau\). Right figures: results of ACGA on adaptively refined meshes. Computations are done for noise level \(\sigma =40\%\) in \(u_4\) and for \(T_1 = 100\)

In this test we present the reconstruction results of the smooth model function

$$\begin{aligned} \eta (t)= 0.7 e^{-t} + 0.05, \quad t \in [0, 300], \end{aligned}$$
(100)

for different starting time points \(T_1= 25, 50, 100\) and for the number of discretization points \(k=15\) on the initial time partition \({\mathcal {J}}_{\tau }^0\), which is generated with the equidistant time step \(\tau = 300/(k-1)\). More precisely, in this test we model the control parameter \(\eta (t)\) as the smooth function given by equation (100), and we want to recover this function on the whole time interval [0, T] using measurements of the noisy virus population function \({{u_4}_\tau }_\sigma\) for different values of the starting observation time \(T_1\).

The left figures of Figs. 4 and 5 show the simulated \({u_4}_\tau\) versus the noisy \({{u_4}_\tau }_\sigma\) on the different adaptively refined time meshes. In these figures the noisy observed data are presented by blue circles and the simulated data without noise are shown by the solid blue line. The middle figures of Figs. 4 and 5 present the least squares fitting for the noisy \(\eta _\tau\). We recall that the initial guess for the noisy \(\eta _\tau\) is computed by applying (97) and is represented by the red circles in the middle figures of Figs. 4 and 5. The least squares fit to the noisy \(\eta _\tau\) is shown by the solid blue line in these figures.

Results of the reconstruction of the model function (100) for noise levels \(\sigma = 5\%, 10\%, 20\%, 40\%\) in the virus population function \(u_4(t)\) are presented in Table 4. The right figures of Fig. 4 show the reconstruction results for the function (100) for a noise level \(\sigma = 10\%\) in the data \(u_4(t)\) and starting observation time \(T_1 = 50\). The right figures of Fig. 5 show the reconstruction results for this function for a noise level \(\sigma = 40\%\) in the data \(u_4(t)\) and \(T_1 = 100\).

Table 4 and Figs. 4 and 5 confirm that with local time-mesh refinements the reconstruction of the drug efficacy function \(\eta _\tau\) is significantly improved compared to the reconstruction of \(\eta _\tau\) obtained on the initial non-refined time-mesh.

Test 2

See Table 5.

Table 5 Test 2. Relative errors \(e_{\eta }\) computed for reconstruction of the function \(\eta (t)= 0.7, t \in [0, 300]\) for \(T_1= 25, 50, 100\) on different locally adaptively refined time-meshes
Fig. 6

Test 2. Left figures: simulated \({u_4}_\tau\) vs. noisy \({{u_4}_\tau }_\sigma\) on the different adaptively refined time meshes. Here, noisy observed data are presented by circles. Middle figures: least squares fitting for the noisy \(\eta _\tau\). Right figures: results of ACGA on adaptively refined meshes. Computations are done for noise level \(\sigma =40\%\) in \(u_4\) and for \(T_1 = 50\)

Fig. 7

Test 2. Left figures: simulated \({u_4}_\tau\) vs. noisy \({{u_4}_\tau }_\sigma\) on different adaptively refined time meshes. Here, noisy observed data are presented by circles. Middle figures: least squares fitting to noisy data for \(\eta _\tau\). Right figures: results of ACGA on adaptively refined meshes. Computations are done for noise level \(\sigma =40\%\) in \(u_4\) and for \(T_1 = 100\)

In this test we present numerical reconstruction results for the constant model function \(\eta (t)= 0.7\) from noisy observations of the virus population function \({u_4(t)}_\sigma\) on the observation interval \([T_1, T_2]\). We again took \(T_1= 25, 50, 100\), but the number of observation points was 20 on the time interval \([T_1, T_2] =[T_1, 300]\). We generate the initial time partition \({\mathcal {J}}_{\tau }\) with the equidistant time step \(\tau = 300/19\). The reconstruction results for the model function \(\eta (t)= 0.7\) for noise levels \(\sigma = 5\%, 10\%, 20\%, 40\%\) in the data \(u_4(t)\) are presented in Table 5. The right figures of Figs. 6 and 7 show the reconstruction results for the function \(\eta (t)= 0.7\) for noise level \(\sigma = 40\%\) in the data \(u_4(t)\) for \(T_1 = 50\) and \(T_1 = 100\), respectively.

We again observe from the results of Table 5 and Figs. 6 and 7 that with local time-mesh refinements the reconstruction of the drug efficacy \(\eta _\tau\) is significantly improved compared to the reconstruction of \(\eta _\tau\) obtained on the initial non-refined time-mesh, even when we add large noise \(\sigma = 40\%\) to the observed data \(u_4(t)\).

Conclusion

A finite element time-adaptive optimization method for the determination of the drug efficacy in a mathematical model of HIV infection with drug therapy is presented. Time-adaptive optimization means that first the time-dependent drug efficacy is determined on a known coarse time partition using several known values of the observed function (usually, we used 15–20 observations). Then the time-mesh is locally refined at points where the computed residual \(|R(\eta _\tau )|\) attains its maximal values, and the drug efficacy is computed on the new refined time-mesh until the relative error in the reconstructed parameter \(\eta\) is reduced to the desired accuracy. Numerical experiments show the efficiency and reliability of the proposed adaptive method for the reconstruction of different model functions \(\eta\) from a noisy observed virus population function.

The proposed new time-adaptive method can eventually be used by clinicians to determine the drug-response for each treated individual. The exact knowledge of the personal drug efficacy can aid in the determination of the most suitable drug as well as the most optimal dose for each person, in the long run resulting in a personalized treatment with maximum efficacy and minimum adverse drug reactions.

The proposed time-adaptive method can be adapted to solve multi-parameter identification problems for a broad class of problems stated as systems of ODEs.