1 Introduction

The spread of infectious diseases within human populations is an important topic of research in mathematical modelling that has become widely studied in recent times. Many different models, such as the SI, SIS, SIR and SIRS types, have been proposed to better understand the underlying transmission mechanism of infections. In these models, the population under study is divided into several classes or subpopulations, such as susceptible (S), infected (I) and recovered (R).

The incidence rate of a disease, which measures how many people become infected by that disease per time unit, plays an important role in the dynamics of epidemic models. Traditionally, the incidence rate has been assumed to be bilinear with respect to the number of susceptible individuals (S) and the number of infected individuals (I). However, many other functions have been proposed to model the transmission of infectious diseases. In general, the incidence function can be written in the form F(S, I), where the function F satisfies some common properties, such as the following:

(A1’):

\(F(S,I) = IF_1(S,I)\) with \(F,F_1\in C^1(\mathbb {R}_+^2)\).

(A2’):

\(F(0,I)=F(S,0)=0\) for all \(S,I\ge 0\).

(A3’):

\(\frac{\partial F}{\partial I}(S,0)>0\) for all \(S>0\); \(\frac{\partial F_1}{\partial S}(S,I)>0\) and \(\frac{\partial F_1}{\partial I}(S,I)\le 0\) for all \(S,I\ge 0\).

(A4’):

There exists a positive constant \(\beta \) such that \(F(S,I)\le \beta SI\) for all \(S,I\ge 0\).

Many specific forms of incidence functions commonly studied in the literature satisfy (A1’)–(A4’). Some examples are:

  • \(F(S,I) = \beta SI\) (bilinear) [1];

  • \(F(S,I) = \beta SI/(1+a_1I)\) (saturated with respect to infectives) [2];

  • \(F(S,I) = \beta SI/(1+a_2S)\) (saturated with respect to susceptibles) [3];

  • \(F(S,I) = \beta SI/(1+k_1I+k_2S)\) (Beddington–DeAngelis) [4];

  • \(F(S,I) = \beta SI/\big [(1+k_1I)(1+k_2S)\big ]\) (Crowley–Martin) [5].
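As an illustration of how these conditions are checked, consider the incidence saturated with respect to infectives, \(F(S,I) = \beta SI/(1+a_1I)\), assuming \(\beta >0\) and \(a_1\ge 0\). The following computation is a sketch that uses only the definitions in (A1’)–(A4’); the other examples in the list can be treated in the same way.

$$\begin{aligned} F_1(S,I)&= \frac{\beta S}{1+a_1I}, \qquad F(0,I)=F(S,0)=0, \\ \frac{\partial F}{\partial I}(S,0)&= \beta S> 0 \quad \text {for }S>0, \\ \frac{\partial F_1}{\partial S}(S,I)&= \frac{\beta }{1+a_1I}> 0, \qquad \frac{\partial F_1}{\partial I}(S,I) = -\frac{a_1\beta S}{(1+a_1I)^2}\le 0, \\ F(S,I)&= \frac{\beta SI}{1+a_1I}\le \beta SI \quad \text {for all }S,I\ge 0. \end{aligned}$$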

It should be noted that the above forms of incidence rate are monotone with respect to S, I and concave with respect to I. On the other hand, non-monotone incidence rates have been used to describe the effect of media coverage: at the beginning of an epidemic, the population has little awareness of the preventive measures, so the contact rate increases rapidly. As the population becomes more aware of risks, they take measures to control the outbreak, so the number of infectious contacts decreases. Some examples of non-monotone incidence functions are the following:

  • \(F(S,I) = \beta e^{-mI}SI\) with \(m>0\) [6];

  • \(F(S,I) = \beta SI/(1+aS+bI^2)\) with \(a,b>0\) [7];

  • \(F(S,I) = \beta SI/(1+\omega _1I+\omega _2I^2)\) with \(\omega _1,\omega _2>0\) [8].
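To see the non-monotonicity concretely, take the first example, \(F(S,I) = \beta e^{-mI}SI\). Differentiating with respect to I gives

$$\begin{aligned} \frac{\partial F}{\partial I}(S,I) = \beta Se^{-mI}(1-mI), \end{aligned}$$

so, for fixed \(S>0\), the incidence increases for \(I<1/m\), attains its maximum \(\beta S/(me)\) at \(I=1/m\), and decreases for \(I>1/m\). Note that \(F_1(S,I)=\beta Se^{-mI}\) is still decreasing in I, so the sign conditions imposed on \(F_1\) in (A3’) remain satisfied for this example.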

Given the wide variety of forms that have been proposed for the incidence rate, some authors have opted to study epidemic models that include a general class of incidence functions. This has the advantage that the results proved for particular models can be generalized to a broader class of models, and the researchers can focus on other features, such as spatial heterogeneity that could yield more complicated dynamics or bifurcations.

An SIRS model with transfer from the infected to the susceptible class was studied in [9] using the system of ordinary differential equations

$$\begin{aligned} \begin{aligned} \dfrac{d S}{d t}&= \Lambda - \mu S - Sf(I) + \gamma _1I + \delta R, \\ \dfrac{d I}{d t}&= Sf(I) - (\mu +\alpha +\gamma _1+\gamma _2)I, \\ \dfrac{d R}{d t}&= \gamma _2I - \mu R - \delta R, \end{aligned} \end{aligned}$$
(1.1)

where S(t), I(t) and R(t) denote the number of susceptible, infected, and recovered individuals at time t, respectively, and the parameters are interpreted as follows:

  • \(\Lambda \): recruitment rate of susceptible population.

  • \(\mu \): natural death rate.

  • \(\alpha \): disease-induced death rate.

  • \(\gamma _1\): transfer rate from the infected class to the susceptible class.

  • \(\gamma _2\): transfer rate from the infected class to the recovered class.

  • \(\delta \): rate of immunity loss.

The authors in [9] considered an incidence function of the form Sf(I) and proved that the threshold dynamics of model (1.1) is completely determined by the basic reproduction number, which is denoted as \(\mathcal {R}_0\). Variants of this model were later studied in [10] and [11] using more general incidence rates, which take the form f(S, I). The authors in [10, 11] showed that the generalized model retains its threshold dynamics under certain conditions on the parameters.
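The threshold behaviour of model (1.1) can be illustrated numerically. The sketch below is not taken from [9]: it assumes the bilinear choice f(I) = βI, uses hypothetical parameter values, and integrates the system with a hand-written classical Runge–Kutta scheme. For this special case, the standard next-generation computation gives \(\mathcal {R}_0 = \beta \Lambda /[\mu (\mu +\alpha +\gamma _1+\gamma _2)]\), which is what `reproduction_number` evaluates.

```python
def sirs_rhs(state, p):
    """Right-hand side of model (1.1) with the assumed bilinear choice f(I) = beta*I."""
    S, I, R = state
    infection = p["beta"] * S * I
    dS = p["Lambda"] - p["mu"] * S - infection + p["gamma1"] * I + p["delta"] * R
    dI = infection - (p["mu"] + p["alpha"] + p["gamma1"] + p["gamma2"]) * I
    dR = p["gamma2"] * I - (p["mu"] + p["delta"]) * R
    return (dS, dI, dR)


def rk4_step(state, p, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(k, c):
        return tuple(s + c * ki for s, ki in zip(state, k))

    k1 = sirs_rhs(state, p)
    k2 = sirs_rhs(shifted(k1, 0.5 * dt), p)
    k3 = sirs_rhs(shifted(k2, 0.5 * dt), p)
    k4 = sirs_rhs(shifted(k3, dt), p)
    return tuple(
        s + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate(state, p, t_end, dt=0.05):
    """Integrate from t = 0 to t = t_end and return the final (S, I, R)."""
    for _ in range(int(round(t_end / dt))):
        state = rk4_step(state, p, dt)
    return state


def reproduction_number(p):
    """R0 = beta * (Lambda/mu) / (mu + alpha + gamma1 + gamma2) for f(I) = beta*I."""
    removal = p["mu"] + p["alpha"] + p["gamma1"] + p["gamma2"]
    return p["beta"] * p["Lambda"] / (p["mu"] * removal)


# Hypothetical parameter values, chosen only for illustration.
base = {"Lambda": 1.0, "mu": 0.1, "alpha": 0.05,
        "gamma1": 0.1, "gamma2": 0.2, "delta": 0.05}
low = dict(base, beta=0.01)   # gives R0 < 1
high = dict(base, beta=0.2)   # gives R0 > 1

for p in (low, high):
    S, I, R = simulate((9.0, 1.0, 0.0), p, t_end=400.0)
    print(f"R0 = {reproduction_number(p):.3f}: S = {S:.4f}, I = {I:.4f}, R = {R:.4f}")
```

With the first parameter set the infected class dies out and S approaches Λ/μ, whereas with the second the infection persists, in line with the threshold result described above.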

On the other hand, we should consider that the individuals of the population under study may move randomly in space. Several studies on modelling of infectious diseases such as influenza [12], cholera [13], dengue [14], brucellosis [15] and COVID-19 [16] have highlighted the importance of individual motility in the dynamics of an outbreak. Thus, it is appropriate to consider epidemic models based on reaction–diffusion equations, where the movement patterns of susceptible, infected and recovered individuals are modelled through the diffusion rate of each subpopulation.

Moreover, many communicable diseases occur in a heterogeneous environment due to differences in environmental conditions such as humidity, temperature or the varying availability of medical resources. This has led some researchers to study epidemic models where some of the parameters depend on a spatial variable x. In particular, the transmission rate of the disease may be represented by a function that depends not only on the number of susceptible and infected individuals, but also on the spatial location. For example, an SIRS-type model with diffusion and heterogeneous parameters was studied in [12] with the incidence function f(x, I)S. Other reaction–diffusion models have been proposed in [17] using the general incidence function \(\beta (x)f(S,I)\) and in [13] using the function f(x, S, I), which take into account both the dependence on the spatial variable and the non-linearity with respect to S and I.

A diffusive epidemic model based on system (1.1) was studied by Yang et al. in [18], where the authors established the global attractiveness of the disease-free equilibrium when \(\mathcal {R}_0<1\) and the persistence of the disease when \(\mathcal {R}_0>1\). Although the authors in [18] included a general incidence rate of the form \(\beta (x)f(S,I)\) and spatially variable parameters, they assumed that the diffusion coefficients for the susceptible, infected and recovered individuals are all equal to a constant D. In reality, there could be differences in the motility patterns of these subpopulations due to the individuals changing their behaviour when they get the disease, and the diffusion rate may also vary with the spatial location.

Based on the above discussion, we propose here a diffusive version of the SIRS model (1.1) that includes three different diffusion coefficients with spatial heterogeneity. The model to be studied is given by the following system of partial differential equations:

$$\begin{aligned} \begin{array}{ll} \partial _tS = \nabla \cdot \big (d_S(x)\nabla S\big ) + \Lambda (x) - \mu (x)S - F(x,S,I) + \gamma _1(x)I + \delta (x)R, &{} \ x\in \Omega ,\ t>0, \\ \partial _tI = \nabla \cdot \big (d_I(x)\nabla I\big ) + F(x,S,I) - (\mu +\alpha +\gamma _1+\gamma _2)(x)I, &{} \ x\in \Omega ,\ t>0, \\ \partial _tR = \nabla \cdot \big (d_R(x)\nabla R\big ) + \gamma _2(x)I - \mu (x)R - \delta (x)R, &{} \ x\in \Omega ,\ t>0, \\ \big [d_S(x)\nabla S(x,t)\big ]\cdot \mathbf{n} = \big [d_I(x)\nabla I(x,t)\big ]\cdot \mathbf{n} = \big [d_R(x)\nabla R(x,t)\big ]\cdot \mathbf{n} = 0, &{} \ x\in \partial \Omega ,\ t>0, \end{array} \end{aligned}$$
(1.2)

subject to the initial conditions

$$\begin{aligned} S(x,0)=S_0(x)\ge 0,\quad I(x,0)=I_0(x)\ge 0,\quad R(x,0)=R_0(x)\ge 0,\quad x\in \Omega . \end{aligned}$$
(1.3)

Here, the variables S(x, t), I(x, t) and R(x, t) represent the number of susceptible, infected, and recovered individuals, respectively, at position x and time t, while \(\nabla \) is the gradient operator. We assume that the domain \(\Omega \) is a connected, bounded subset of \(\mathbb {R}^n\) with smooth boundary \(\partial \Omega \). The positive functions \(d_S(\cdot )\), \(d_I(\cdot )\) and \(d_R(\cdot )\) denote the spatially heterogeneous diffusion coefficients for each subpopulation. The spatially dependent parameters \(\Lambda (\cdot )\), \(\mu (\cdot )\), \(\alpha (\cdot )\), \(\gamma _1(\cdot )\), \(\gamma _2(\cdot )\) and \(\delta (\cdot )\) have the same meaning as for model (1.1); for biological reasons, they are assumed to be strictly positive (except \(\gamma _1\), which is nonnegative) and uniformly bounded on \(\overline{\Omega }\). Furthermore, for the model to be well-posed, we assume that

$$\begin{aligned} \Lambda (\cdot ), \mu (\cdot ), \alpha (\cdot ), \gamma _1(\cdot ), \gamma _2(\cdot ), \delta (\cdot ) \in C^0(\overline{\Omega },\mathbb {R}). \end{aligned}$$

Based on (A1’)–(A4’), we make the following assumptions on the incidence function F(x, S, I) for model (1.2):

(A1):

\(F(x,S,I) = IF_1(x,S,I)\) with \(F,F_1\in C^1(\Omega \times \mathbb {R}_+^2)\).

(A2):

\(F(x,0,I)=F(x,S,0)=0\) for all \(x\in \Omega \), \(S,I\ge 0\).

(A3):

\(\frac{\partial F}{\partial I}(x,S,0)>0\) for all \(x\in \Omega \), \(S>0\); \(\frac{\partial F_1}{\partial S}(x,S,I)>0\) and \(\frac{\partial F_1}{\partial I}(x,S,I)\le 0\) for all \(x\in \Omega \), \(S,I\ge 0\).

(A4):

There exists a Hölder continuous function \(\beta :\Omega \rightarrow \mathbb {R}_+\) such that \(F(x,S,I)\le \beta (x)SI\) for all \(x\in \Omega \), \(S,I\ge 0\).
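To make the structure of model (1.2) concrete, the following is a minimal numerical sketch in one space dimension. Everything specific in it is an assumption made for illustration and is not part of the analysis: the domain \(\Omega =(0,1)\), the coefficient profiles, the saturated incidence \(F(x,S,I)=\beta (x)SI/(1+I)\) (which satisfies (A1)–(A4) with this \(\beta \)), and the explicit Euler time stepping with a conservative finite-volume discretization of \(\nabla \cdot (d\nabla u)\). The zero-flux boundary condition is imposed by setting the flux to zero at both ends of the interval.

```python
import math

N = 20                                   # number of grid cells on Omega = (0, 1)
H = 1.0 / N                              # cell width
X = [(j + 0.5) * H for j in range(N)]    # cell centres

# Hypothetical coefficients, chosen only for illustration.
d_S = [0.02 * (1.0 + 0.5 * math.sin(math.pi * x)) for x in X]
d_I = [0.01 for _ in X]
d_R = [0.015 * (1.0 + 0.5 * x) for x in X]
LAMBDA, MU, ALPHA, GAMMA1, GAMMA2, DELTA = 1.0, 0.1, 0.05, 0.1, 0.2, 0.05
beta = [0.2 * (1.0 + 0.5 * math.cos(math.pi * x)) for x in X]


def incidence(b, S, I):
    """Assumed incidence F(x,S,I) = beta(x)*S*I/(1+I), bounded above by beta(x)*S*I."""
    return b * S * I / (1.0 + I)


def diffusion(u, d):
    """Conservative discretization of div(d grad u) with zero-flux boundaries."""
    flux = [0.0] * (N + 1)               # flux[0] = flux[N] = 0 encodes the Neumann condition
    for j in range(1, N):
        d_face = 0.5 * (d[j - 1] + d[j])             # coefficient at the cell interface
        flux[j] = d_face * (u[j] - u[j - 1]) / H
    return [(flux[j + 1] - flux[j]) / H for j in range(N)]


def euler_step(S, I, R, dt):
    """One explicit Euler step of the semi-discretized system (1.2)."""
    LS, LI, LR = diffusion(S, d_S), diffusion(I, d_I), diffusion(R, d_R)
    S_new, I_new, R_new = [], [], []
    for j in range(N):
        F = incidence(beta[j], S[j], I[j])
        S_new.append(S[j] + dt * (LS[j] + LAMBDA - MU * S[j] - F
                                  + GAMMA1 * I[j] + DELTA * R[j]))
        I_new.append(I[j] + dt * (LI[j] + F - (MU + ALPHA + GAMMA1 + GAMMA2) * I[j]))
        R_new.append(R[j] + dt * (LR[j] + GAMMA2 * I[j] - (MU + DELTA) * R[j]))
    return S_new, I_new, R_new


def total_mass(S, I, R):
    """Discrete analogue of the integral of S + I + R over Omega."""
    return H * sum(s + i + r for s, i, r in zip(S, I, R))


def run(t_end=50.0, dt=0.01):
    """Simulate from a localized outbreak and record the total population over time."""
    S = [5.0] * N
    I = [0.5 * math.exp(-50.0 * (x - 0.5) ** 2) for x in X]
    R = [0.0] * N
    masses = [total_mass(S, I, R)]
    for _ in range(int(round(t_end / dt))):
        S, I, R = euler_step(S, I, R, dt)
        masses.append(total_mass(S, I, R))
    return S, I, R, masses


S, I, R, masses = run()
print("final total population:", masses[-1])
print("max of S, I, R:", max(S), max(I), max(R))
```

The time step is chosen small enough for the explicit scheme to be stable with these coefficients; a different grid or larger diffusion coefficients would require a smaller step or an implicit method.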

The rest of this paper is organized as follows. First, in Sect. 2, we prove the existence of bounded global solutions for our model. Next, in Sect. 3, we compute the basic reproduction number and study the stability of the disease-free steady state. In Sect. 4, we study the persistence of the model and the existence of endemic equilibria. In Sect. 5, we perform a bifurcation analysis for a special case of our model. In Sect. 6, we carry out some numerical simulations. Finally, we provide some concluding remarks in Sect. 7.

2 Basic properties of the model

Let \(\mathbb {X}:=C(\overline{\Omega },\mathbb {R}^3)\) be the Banach space with the supremum norm \(\left\| \cdot \right\| \), and define \(\mathbb {X}^+:=C(\overline{\Omega },\mathbb {R}_+^3)\). Then, \((\mathbb {X},\mathbb {X}^+)\) is a strongly ordered space.

We will now prove the existence of unique global solutions for our model. For that, we introduce the notation

$$\begin{aligned} \psi ^- := \min _{x\in \overline{\Omega }}\psi (x),\qquad \psi ^+ := \max _{x\in \overline{\Omega }}\psi (x), \end{aligned}$$

where \(\psi \) is any of \(\Lambda \), \(\mu \), \(\alpha \), \(\gamma _1\), \(\gamma _2\), \(\delta \) or \(\beta \).

Theorem 2.1

For every initial value function \(\phi :=(\phi _1,\phi _2,\phi _3)\in \mathbb {X}^+\), model (1.2) has a unique solution \(U(\cdot ,t;\phi )=\big (S(\cdot ,t;\phi ),\ I(\cdot ,t;\phi ),\ R(\cdot ,t;\phi )\big )\) with \(U(\cdot ,0;\phi )=\phi \) defined on \([0,\infty )\).

Proof

Define \(d_1(x)=d_S(x)\), \(d_2(x)=d_I(x)\), \(d_3(x)=d_R(x)\), \(\pi _1(x)=\mu (x)\), \(\pi _2(x)=\mu (x) + \alpha (x) + \gamma _1(x) + \gamma _2(x)\) and \(\pi _3(x)=\mu (x) + \delta (x)\). For \(i=1,2,3\), let \(\Gamma _i(t):C(\overline{\Omega },\mathbb {R})\rightarrow C(\overline{\Omega },\mathbb {R})\) be the \(C_0\) semigroup associated with \(\nabla \cdot (d_i(\cdot )\nabla ) - \pi _i(\cdot )\) subject to the Neumann boundary condition. Then,

$$\begin{aligned} \big (\Gamma _i(t)\phi \big )(x) = \int \limits _\Omega T_i(x,y,t)\phi (y)\,d y, \qquad \forall t>0,\ \phi \in C(\overline{\Omega },\mathbb {R}), \end{aligned}$$

where \(T_i(x,y,t)\) represents the Green function associated with \(\nabla \cdot (d_i(\cdot )\nabla ) - \pi _i(\cdot )\) subject to the Neumann boundary condition. By [19, Corollary 7.2.3], it follows that \(\Gamma (t):=\big (\Gamma _1(t),\Gamma _2(t),\Gamma _3(t)\big )\) is strongly positive and compact for each \(t>0\).

Following [20, 21], we can see that there exist constants \(M_i>0\) (\(i=1,2,3\)) such that

$$\begin{aligned} \left\| \Gamma _i(t)\right\| \le M_ie^{\alpha _it} \quad \text {for all }t\ge 0, \end{aligned}$$
(2.1)

where \(\alpha _i<0\) is the principal eigenvalue of \(\nabla \cdot (d_i(\cdot )\nabla ) - \pi _i(\cdot )\) subject to the Neumann boundary condition.

For every initial value function \(\phi =\big (\phi _1(\cdot ),\phi _2(\cdot ),\phi _3(\cdot )\big )\in \mathbb {X}^+\), we define \(G=(G_1,G_2,G_3):\mathbb {X}^+\rightarrow \mathbb {X}\) by

$$\begin{aligned} G_1(\phi )(x)&= \Lambda (x) - F\big (x,\phi _1(x),\phi _2(x)\big ) + \gamma _1(x)\phi _2(x) + \delta (x)\phi _3(x), \\ G_2(\phi )(x)&= F\big (x,\phi _1(x),\phi _2(x)\big ), \\ G_3(\phi )(x)&= \gamma _2(x)\phi _2(x). \end{aligned}$$

Then, model (1.2) can be rewritten as the integral equation

$$\begin{aligned} U(t) = \Gamma (t)\phi + \int \limits _0^t\Gamma (t-s)G(U(s))\,d s, \end{aligned}$$
(2.2)

where \(U(t)=\big (S(t),I(t),R(t)\big )^T\). It is easy to see that the subtangential condition in [22, Corollary 4] is satisfied. Thus, the model has a unique positive solution \(\big (S(\cdot ,t;\phi ),\ I(\cdot ,t;\phi ),\ R(\cdot ,t;\phi )\big )\) on \([0,\tau )\), where \(0<\tau \le +\infty \).

We will now prove that the local solution can be extended to a global one, i.e. \(\tau =+\infty \).

Suppose by contradiction that \(\tau <+\infty \). By the theory of abstract functional differential equations (see [22, Theorem 2]), we know that

$$\begin{aligned} \left\| U(\cdot ,t;\phi )\right\| \rightarrow +\infty \text { as }t\rightarrow \tau . \end{aligned}$$
(2.3)

Hence, it suffices to prove that the solution is bounded on \(\Omega \times [0,\tau )\). To this end, we define

$$\begin{aligned} K(t) = \int \limits _\Omega \big [S(x,t) + I(x,t) + R(x,t)\big ]\,d x. \end{aligned}$$

By the divergence theorem [23, Theorem 3.7] and the homogeneous Neumann boundary conditions, we have

$$\begin{aligned} \int \limits _\Omega \nabla \cdot \big (d_S(x)\nabla S\big )\,d x = 0,\quad \int \limits _\Omega \nabla \cdot \big (d_I(x)\nabla I\big )\,d x = 0,\quad \int \limits _\Omega \nabla \cdot \big (d_R(x)\nabla R\big )\,d x = 0. \end{aligned}$$

Thus,

$$\begin{aligned} \frac{\text {d}}{\text {d}t}K(t)&= \int \limits _\Omega \left[ \Lambda (x) - \mu (x)\big (S(x,t)+I(x,t)+R(x,t)\big ) - \alpha (x)I(x,t)\right] \,d x \\&\le \Lambda ^+|\Omega | - \mu ^-K(t),\qquad \forall t\in [0,\tau ). \end{aligned}$$

By the comparison principle, there exist constants \(N_1>0\) and \(t_1>0\) such that \(K(t)\le N_1\) for all \(t\in [t_1,\tau )\). Consequently,

$$\begin{aligned} \int \limits _\Omega S(x,t)\,d x\le N_1,\quad \int \limits _\Omega I(x,t)\,d x\le N_1,\quad \int \limits _\Omega R(x,t)\,d x\le N_1,\quad \forall t\in [t_1,\tau ). \end{aligned}$$
(2.4)
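For completeness, the bound on K(t) can be made explicit. Comparing K with the solution of the linear equation \(y' = \Lambda ^+|\Omega | - \mu ^-y\), \(y(0)=K(0)\), gives

$$\begin{aligned} K(t) \le K(0)e^{-\mu ^-t} + \frac{\Lambda ^+|\Omega |}{\mu ^-}\left( 1-e^{-\mu ^-t}\right) \le \max \left\{ K(0),\ \frac{\Lambda ^+|\Omega |}{\mu ^-}\right\} , \qquad \forall t\in [0,\tau ), \end{aligned}$$

so \(N_1\) may be taken to be this maximum. When the solution is global, the same estimate shows that any constant larger than \(\Lambda ^+|\Omega |/\mu ^-\) bounds K(t) for all sufficiently large t, independently of the initial data.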

Next, denote by \(\tau ^i_j\) the eigenvalue of \(\nabla \cdot (d_i(\cdot )\nabla ) - \pi _i(\cdot )\) subject to the Neumann boundary condition corresponding to the eigenfunction \(\varphi ^i_j(x)\), such that \(\tau ^i_1 > \tau ^i_2 \ge \tau ^i_3 \ge \cdots \ge \tau ^i_j \ge \cdots \) for \(i=1,2,3\). From [24, Chapter 5], we can write

$$\begin{aligned} T_i(x,y,t) = \sum _{j\ge 1} \exp \left( \tau ^i_jt\right) \varphi ^i_j(x)\varphi ^i_j(y). \end{aligned}$$

Since \(\varphi ^i_j\) is uniformly bounded, there exists \(\omega >0\) such that

$$\begin{aligned} T_i(x,y,t) \le \omega \sum _{j\ge 1} \exp \left( \tau ^i_jt\right) , \qquad \forall t>0,\ x,y\in \overline{\Omega }. \end{aligned}$$

For \(i=1,2,3\), define

$$\begin{aligned} d_i^- := \min _{x\in \overline{\Omega }}d_i(x) \quad \text {and}\quad \pi _i^- := \min _{x\in \overline{\Omega }}\pi _i(x). \end{aligned}$$

Denote by \(\rho ^i_j\) the eigenvalues of \(\nabla \cdot (d_i^-\nabla ) - \pi _i^-\) subject to the Neumann boundary condition, such that \(-\pi _i^- = \rho ^i_1 > \rho ^i_2 \ge \rho ^i_3 \ge \cdots \ge \rho ^i_j \ge \cdots \). Then, \(-\rho ^i_j\) and \(-\tau ^i_j\) are the j-th eigenvalues of \(-\nabla \cdot (d_i^-\nabla ) + \pi _i^-\) and \(-\nabla \cdot (d_i(\cdot )\nabla ) + \pi _i(\cdot )\), respectively, subject to the Neumann boundary condition.

For all \(x\in \Omega \) and \(z\in \mathbb {R}\), we have \(d_i^-z^2 \le d_i(x)z^2\) and \(\pi _i^- \le \pi _i(x)\). Then, by [25, Theorem 2.4.7], we know that \(-\rho ^i_j \le -\tau ^i_j\) for all \(j=1,2,\ldots \), which implies that \(\rho ^i_j \ge \tau ^i_j\) for all \(j=1,2,\ldots \). Since \(\rho ^i_j\) decreases like \(-j^2\), there exists \(w_i>0\) such that

$$\begin{aligned} T_i(x,y,t) \le \omega \sum _{j\ge 1} \exp \left( \rho ^i_jt\right) \le w_i\exp \left( \rho ^i_1t\right) = w_i\exp \left( -\pi _i^-t\right) , \quad \forall t>0,\ x,y\in \overline{\Omega },\ i=1,2,3. \end{aligned}$$

Using (2.2) together with (2.1) and (2.4), we obtain

$$\begin{aligned} S(x,t)&= \Gamma _1(t-t_1)S(x,t_1) \\&\quad + \int \limits _{t_1}^t\Gamma _1(t-s)\big [\Lambda (x) - F(x,S(x,s),I(x,s)) + \gamma _1(x)I(x,s) + \delta (x)R(x,s)\big ]\,d s \\&\le \Gamma _1(t-t_1)S(x,t_1) + \int \limits _{t_1}^t\Gamma _1(t-s)\big [\Lambda (x) + \gamma _1(x)I(x,s) + \delta (x)R(x,s)\big ]\,d s \\&\le M_1e^{\alpha _1(t-t_1)}\left\| S(\cdot ,t_1)\right\| + \int \limits _{t_1}^t\int \limits _\Omega T_1(x,y,t-s)\big [\Lambda (y) + \gamma _1(y)I(y,s) + \delta (y)R(y,s)\big ]\,d y\,d s \\&\le M_1e^{\alpha _1(t-t_1)}\left\| S(\cdot ,t_1)\right\| + \int \limits _{t_1}^tw_1e^{-(t-s)\pi _1^-}\left[ \Lambda ^+ + \gamma _1^+\int \limits _\Omega I(y,s)\,d y + \delta ^+\int \limits _\Omega R(y,s)\,d y\right] \,d s \\&\le M_1e^{\alpha _1(t-t_1)}\left\| S(\cdot ,t_1)\right\| + \int \limits _{t_1}^tw_1e^{-(t-s)\pi _1^-}\left( \Lambda ^+ + \gamma _1^+N_1 + \delta ^+N_1\right) \,d s \\&= M_1e^{\alpha _1(t-t_1)}\left\| S(\cdot ,t_1)\right\| + w_1\left( \Lambda ^+ + \gamma _1^+N_1 + \delta ^+N_1\right) \frac{1-e^{-(t-t_1)\pi _1^-}}{\pi _1^-} \\&\le M_1e^{-\alpha _1t_1}\left\| S(\cdot ,t_1)\right\| + \frac{w_1\left( \Lambda ^+ + \gamma _1^+N_1 + \delta ^+N_1\right) }{\mu ^-},\qquad \forall t\in [t_1,\tau ). \end{aligned}$$

Hence,

$$\begin{aligned} \left\| S(\cdot ,t)\right\| \le M_1e^{-\alpha _1t_1}\left\| S(\cdot ,t_1)\right\| + \frac{w_1\left( \Lambda ^+ + \gamma _1^+N_1 + \delta ^+N_1\right) }{\mu ^-} =:N_2, \qquad \forall t\in [t_1,\tau ). \end{aligned}$$

From the second equation of (1.2) and assumption (A4), we get

$$\begin{aligned} I(x,t)&= \Gamma _2(t-t_1)I(x,t_1) + \int \limits _{t_1}^t\Gamma _2(t-s)F(x,S(x,s),I(x,s))\,d s \\&\le \Gamma _2(t-t_1)I(x,t_1) + \int \limits _{t_1}^t\Gamma _2(t-s)\beta (x)S(x,s)I(x,s)\,d s \\&\le M_2e^{\alpha _2(t-t_1)}\left\| I(\cdot ,t_1)\right\| + \int \limits _{t_1}^t\int \limits _\Omega T_2(x,y,t-s)\beta (y)S(y,s)I(y,s)\,d y\,d s \\&\le M_2e^{\alpha _2(t-t_1)}\left\| I(\cdot ,t_1)\right\| + w_2\beta ^+N_1N_2\frac{1-e^{-(t-t_1)\pi _2^-}}{\pi _2^-} \\&\le M_2e^{-\alpha _2t_1}\left\| I(\cdot ,t_1)\right\| + \frac{w_2\beta ^+N_1N_2}{\mu ^- + \alpha ^- + \gamma _1^- + \gamma _2^-},\qquad \forall t\in [t_1,\tau ). \end{aligned}$$

Hence, we obtain

$$\begin{aligned} \left\| I(\cdot ,t)\right\| \le M_2e^{-\alpha _2t_1}\left\| I(\cdot ,t_1)\right\| + \frac{w_2\beta ^+N_1N_2}{\mu ^- + \alpha ^- + \gamma _1^- + \gamma _2^-} =:N_3, \qquad \forall t\in [t_1,\tau ). \end{aligned}$$

Similarly, the third equation of (1.2) yields

$$\begin{aligned} R(x,t)&= \Gamma _3(t-t_1)R(x,t_1) + \int \limits _{t_1}^t\Gamma _3(t-s)\gamma _2(x)I(x,s)\,d s \\&\le M_3e^{\alpha _3(t-t_1)}\left\| R(\cdot ,t_1)\right\| + \int \limits _{t_1}^t\int \limits _\Omega T_3(x,y,t-s)\gamma _2(y)I(y,s)\,d y\,d s \\&\le M_3e^{\alpha _3(t-t_1)}\left\| R(\cdot ,t_1)\right\| + w_3\gamma _2^+N_1\frac{1-e^{-(t-t_1)\pi _3^-}}{\pi _3^-} \\&\le M_3e^{-\alpha _3t_1}\left\| R(\cdot ,t_1)\right\| + \frac{w_3\gamma _2^+N_1}{\mu ^- + \delta ^-},\qquad \forall t\in [t_1,\tau ), \end{aligned}$$

and thus,

$$\begin{aligned} \left\| R(\cdot ,t)\right\| \le M_3e^{-\alpha _3t_1}\left\| R(\cdot ,t_1)\right\| + \frac{w_3\gamma _2^+N_1}{\mu ^- + \delta ^-}, \qquad \forall t\in [t_1,\tau ). \end{aligned}$$

Hence, we conclude that S, I and R are bounded on \(\Omega \times [0,\tau )\), which contradicts (2.3). This proves that \(\tau =+\infty \). \(\square \)

In a similar way, we can obtain the following corollary on the boundedness of solutions on \([0,\infty )\).

Corollary 2.2

For each solution \(U(\cdot ,t;\phi )=\big (S(\cdot ,t;\phi ),\ I(\cdot ,t;\phi ),\ R(\cdot ,t;\phi )\big )\) of (1.2) with initial value function \(\phi \in \mathbb {X}^+\), there exist positive constants \(M_S,M_I,M_R\) independent of initial data such that

$$\begin{aligned} \limsup _{t\rightarrow \infty }\left\| S(\cdot ,t)\right\| \le M_S,\qquad \limsup _{t\rightarrow \infty }\left\| I(\cdot ,t)\right\| \le M_I,\qquad \limsup _{t\rightarrow \infty }\left\| R(\cdot ,t)\right\| \le M_R. \end{aligned}$$
(2.5)

Furthermore, the solution semiflow \(\Phi _t:\mathbb {X}^+\rightarrow \mathbb {X}^+\) is point dissipative and has a global compact attractor.

Proof

Using the same notation as in the proof of Theorem 2.1 and replacing \(\tau \) by \(+\infty \), we can show that

$$\begin{aligned} M_S = \frac{w_1\left( \Lambda ^+ + \gamma _1^+N_1 + \delta ^+N_1\right) }{\mu ^-},\qquad M_I = \frac{w_2\beta ^+N_1M_S}{\mu ^- + \alpha ^- + \gamma _1^- + \gamma _2^-},\qquad M_R = \frac{w_3\gamma _2^+N_1}{\mu ^- + \delta ^-} \end{aligned}$$

satisfy (2.5). From this, it follows that the system is point dissipative. In addition, we know by [26, Theorem 2.2.6] that the solution semiflow \(\Phi _t\) is compact for any \(t>0\). Then, we have by [27, Theorem 3.4.8] that \(\Phi _t\) has a global compact attractor. Consequently, the proof is complete. \(\square \)

3 Basic reproduction number and stability of the disease-free steady state

In this section, we will study the disease-free dynamics of model (1.2) and determine the basic reproduction number \(\mathcal {R}_0\), which is defined as the average number of secondary infections generated by a single infected individual introduced into a completely susceptible population.

Setting \(I(x,t)=R(x,t)\equiv 0\) in (1.2), we obtain the following equation for the density of the susceptible population in the absence of the disease:

$$\begin{aligned} \begin{array}{ll} \partial _tS = \nabla \cdot \big (d_S(x)\nabla S\big ) + \Lambda (x) - \mu (x)S, &{} \ x\in \Omega ,\ t>0, \\ \big [d_S(x)\nabla S(x,t)\big ]\cdot \mathbf{n} = 0, &{} \ x\in \partial \Omega ,\ t>0, \end{array} \end{aligned}$$
(3.1)

System (3.1) admits a unique positive steady state \(S_0(x)\), which is globally asymptotically stable (see [20, Lemma 1]). We will call \(E_0:=(S_0(\cdot ),0,0)\in \mathbb {X}^+\) the disease-free steady state of model (1.2).

Linearizing the equation for the infected population in (1.2) around \(E_0\), we obtain

$$\begin{aligned} \begin{array}{ll} \partial _tI = \nabla \cdot \big (d_I(x)\nabla I\big ) + \big [\frac{\partial F}{\partial I}(x,S_0,0) - (\mu +\alpha +\gamma _1+\gamma _2)(x)\big ]I, &{} \ x\in \Omega ,\ t>0, \\ \big [d_I(x)\nabla I(x,t)\big ]\cdot \mathbf{n} = 0, &{} \ x\in \partial \Omega ,\ t>0. \end{array} \end{aligned}$$
(3.2)

Substituting \(I(x,t)=e^{\lambda t}\varphi (x)\) in the above equation yields the following eigenvalue problem:

$$\begin{aligned} \begin{array}{ll} \lambda \varphi (x) = \nabla \cdot \big (d_I(x)\nabla \varphi (x)\big ) + \big [\frac{\partial F}{\partial I}(x,S_0,0) - (\mu +\alpha +\gamma _1+\gamma _2)(x)\big ]\varphi (x), &{} \ x\in \Omega , \\ \big [d_I(x)\nabla \varphi (x)\big ]\cdot \mathbf{n} = 0, &{} \ x\in \partial \Omega . \end{array} \end{aligned}$$
(3.3)

It follows from the standard Krein–Rutman theorem that (3.3) has a principal eigenvalue \(\lambda ^*(S_0) = \mathbf {s}\big (\nabla \cdot (d_I\nabla ) + \frac{\partial F}{\partial I}(\cdot ,S_0,0) - (\mu +\alpha +\gamma _1+\gamma _2)\big )\) corresponding to a positive eigenfunction \(\varphi ^*(x)\), where \(\mathbf {s}(A)\) denotes the spectral bound of a closed linear operator A.

It is well-known that \(\lambda ^*(S_0)\) is given by the variational characterization

$$\begin{aligned} \begin{aligned} \lambda ^*(S_0) = -\inf \Bigg \{&\int \limits _\Omega \big [d_I(x)\big |\nabla \varphi (x)\big |^2 + [(\mu +\alpha +\gamma _1+\gamma _2)(x) - \frac{\partial F}{\partial I}(x,S_0,0)]\varphi ^2(x) \big ]\,d x :\\&\varphi \in H^1(\Omega )\text { and }\int \limits _\Omega \varphi ^2(x)\,d x = 1\Bigg \}. \end{aligned} \end{aligned}$$
(3.4)
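In practice, \(\lambda ^*(S_0)\) has to be approximated numerically when the coefficients depend on x. The sketch below is one possible way to do this in one space dimension and is not the method used in this paper: it discretizes the operator in (3.3) on a uniform grid with zero-flux boundaries and applies power iteration to a shifted copy of the resulting symmetric matrix. The function names, the grid, and the coefficient values are hypothetical; the array `c` stands for \(\frac{\partial F}{\partial I}(x,S_0,0) - (\mu +\alpha +\gamma _1+\gamma _2)(x)\) evaluated at the cell centres.

```python
import math


def apply_operator(v, d, c, h):
    """Apply the discretized operator div(d grad .) + c(x) with zero-flux boundaries."""
    n = len(v)
    out = []
    for j in range(n):
        # Fluxes through the left and right cell interfaces (zero at the boundary).
        left = 0.5 * (d[j - 1] + d[j]) * (v[j - 1] - v[j]) if j > 0 else 0.0
        right = 0.5 * (d[j] + d[j + 1]) * (v[j + 1] - v[j]) if j < n - 1 else 0.0
        out.append((left + right) / h ** 2 + c[j] * v[j])
    return out


def principal_eigenvalue(d, c, h, iterations=5000):
    """Estimate the largest eigenvalue by power iteration on A + shift*I."""
    n = len(d)
    # The shift makes every eigenvalue of A + shift*I nonnegative (Gershgorin bound).
    shift = 4.0 * max(d) / h ** 2 + max(abs(x) for x in c)
    v = [1.0] * n
    for _ in range(iterations):
        w = [a + shift * b for a, b in zip(apply_operator(v, d, c, h), v)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    Av = apply_operator(v, d, c, h)
    return sum(a * b for a, b in zip(Av, v))   # Rayleigh quotient, since |v| = 1


# Hypothetical data: S_0 = 10 and dF/dI(x, S_0, 0) = beta(x) * S_0 with a cosine profile.
n, h = 20, 1.0 / 20
xs = [(j + 0.5) * h for j in range(n)]
d_I = [0.01] * n
c = [10.0 * 0.04 * (1.0 + 0.5 * math.cos(math.pi * x)) - 0.45 for x in xs]
lam = principal_eigenvalue(d_I, c, h)
print("estimated principal eigenvalue:", lam)
```

Because the discrete diffusion part is symmetric and nonpositive, the estimate always lies between the spatial average and the maximum of `c`, which mirrors what (3.4) gives when constant test functions are used.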

Let \(\mathcal {T}_2(t)\) be the semigroup on \(C(\overline{\Omega },\mathbb {R})\) associated with \(\nabla \cdot \big (d_I\nabla \big ) - (\mu +\alpha +\gamma _1+\gamma _2)\) subject to the Neumann boundary condition. In order to define the basic reproduction number for our model, we assume that the population is near the disease-free steady state \(E_0\), and introduce infected individuals at time \(t=0\), where the spatial distribution of infected population is described by \(\phi _2(x)\). Then, \(\mathcal {T}_2(t)\phi _2(x)\) is the distribution of infected population as time evolves. It follows that the distribution of new infections at time t is \(\frac{\partial F}{\partial I}(x,S_0,0)\mathcal {T}_2(t)\phi _2(x)\). Thus, the distribution of total new infections becomes

$$\begin{aligned} \mathcal {L}(\phi _2)(x) := \int \limits _0^\infty \frac{\partial F}{\partial I}(x,S_0,0)\mathcal {T}_2(t)\phi _2(x)\,d t = \frac{\partial F}{\partial I}(x,S_0,0)\int \limits _0^\infty \mathcal {T}_2(t)\phi _2(x)\,d t, \end{aligned}$$

where \(\mathcal {L}\) is the next infection operator, which maps the initial distribution \(\phi _2\) of infected individuals to the distribution of total infected individuals produced during the infection period.

Following [28,29,30,31], we define the basic reproduction number for model (1.2) as the spectral radius of \(\mathcal {L}\), that is,

$$\begin{aligned} \mathcal {R}_0 := \rho (\mathcal {L}) = \sup _{0\ne \varphi \in H^1(\Omega )}\left\{ \frac{\int \limits _\Omega \frac{\partial F}{\partial I}(x,S_0,0)\varphi ^2(x)\,d x}{\int \limits _\Omega \left[ d_I(x)\big |\nabla \varphi (x)\big |^2 + (\mu +\alpha +\gamma _1+\gamma _2)(x)\varphi ^2(x)\right] \,d x} \right\} . \end{aligned}$$
(3.5)
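As a consistency check on (3.5), consider the illustrative special case in which F and the parameters \(\Lambda \), \(\mu \), \(\alpha \), \(\gamma _1\), \(\gamma _2\) do not depend on x. Then \(S_0\equiv \Lambda /\mu \) is the steady state of (3.1). The gradient term in the denominator of (3.5) is nonnegative and vanishes for constant \(\varphi \), so the supremum is attained at constant functions and

$$\begin{aligned} \mathcal {R}_0 = \frac{\frac{\partial F}{\partial I}\left( \Lambda /\mu ,0\right) }{\mu +\alpha +\gamma _1+\gamma _2}. \end{aligned}$$

For the bilinear incidence \(F=\beta SI\), this reduces to \(\mathcal {R}_0 = \beta \Lambda /[\mu (\mu +\alpha +\gamma _1+\gamma _2)]\). In this case, the diffusion coefficient \(d_I\) does not affect \(\mathcal {R}_0\).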

Define \(u = (I,S,R)\),

$$\begin{aligned} \mathcal {F}(x,u) = \begin{pmatrix} F(x,S,I) \\ 0 \\ 0 \end{pmatrix},\qquad \mathcal {V}(x,u) = \begin{pmatrix} (\mu +\alpha +\gamma _1+\gamma _2)(x)I \\ \mu (x)S + F(x,S,I) - \Lambda (x) - \gamma _1(x)I - \delta (x)R \\ \mu (x)R + \delta (x)R - \gamma _2(x)I \end{pmatrix}. \end{aligned}$$

Thus, model (1.2) can be rewritten as

$$\begin{aligned} \begin{array}{ll} \partial _tu_i = \nabla \cdot \big (d_i(x)\nabla u_i\big ) + \mathcal {F}_i(x,u) - \mathcal {V}_i(x,u), &{} \ 1\le i\le 3,\ x\in \Omega ,\ t>0, \\ \big [d_i(x)\nabla u_i(x,t)\big ]\cdot \mathbf{n} = 0, &{} \ 1\le i\le 3,\ x\in \partial \Omega ,\ t>0, \end{array} \end{aligned}$$
(3.6)

where \(d_1=d_I\), \(d_2=d_S\) and \(d_3=d_R\). The disease-free steady state is given by \(u_0(x) = \big (0,S_0(x),0\big )\), with the variables ordered as (I, S, R).

We can immediately verify that assumptions (A1)–(A4) in [31] hold. Moreover, if we define

$$\begin{aligned} M^0(x) = \begin{pmatrix} - \mu (x) &{} \delta (x) \\ 0 &{} - \mu (x) - \delta (x) \end{pmatrix},\quad F_0(x) = \frac{\partial F}{\partial I}(x,S_0,0)\quad \text {and}\quad V(x) = (\mu +\alpha +\gamma _1+\gamma _2)(x), \end{aligned}$$

then \(M^0(x)\) and \(-V(x)\) are cooperative matrices for all \(x\in \overline{\Omega }\), i.e. their off-diagonal elements are nonnegative. Hence, assumptions (A5) and (A6) in [31] also hold, so we can conclude the following result by [31, Theorem 3.1].

Lemma 3.1

\(\mathcal {R}_0-1\) has the same sign as \(\lambda ^*\). Furthermore, if \(\mathcal {R}_0<1\), then \(E_0\) is locally asymptotically stable for system (1.2).

Before proving the main result of this section, we give the following lemma.

Lemma 3.2

Suppose that \(U(\cdot ,t;\phi )=\big (S(\cdot ,t;\phi ),\ I(\cdot ,t;\phi ),\ R(\cdot ,t;\phi )\big )\) is the solution of system (1.2) with \(U(\cdot ,0;\phi )=\phi \in \mathbb {X}^+\). Then:

  (i)

    For any \(\phi \in \mathbb {X}^+\), we always have \(S(x,t;\phi )>0\) for all \(t>0\), \(x\in \overline{\Omega }\), and

    $$\begin{aligned} \liminf _{t\rightarrow \infty }S(x,t;\phi ) \ge \frac{\Lambda ^-}{\mu ^+ + \beta ^+M_I}, \qquad \text {uniformly for }x\in \overline{\Omega }. \end{aligned}$$
  (ii)

    If there exists \(t_0\ge 0\) such that \(I(\cdot ,t_0;\phi )\not \equiv 0\) (respectively, \(R(\cdot ,t_0;\phi )\not \equiv 0\)), then, for \(t>t_0\), we have \(I(\cdot ,t;\phi )>0\) (respectively, \(R(\cdot ,t;\phi )>0\)).

Proof

By assumption (A4) and Corollary 2.2, we have \(F(x,S,I) \le \beta ^+SI \le \beta ^+M_IS\) for all \(x\in \Omega \), \(S,I\ge 0\). Then, from the first equation of (1.2), we get

$$\begin{aligned} \begin{array}{ll} \partial _tS \ge \nabla \cdot \big (d_S(x)\nabla S\big ) + \Lambda ^- - \left( \mu ^+ + \beta ^+M_I\right) S, &{} \ x\in \Omega ,\ t>0, \\ \big [d_S(x)\nabla S(x,t)\big ]\cdot \mathbf{n} = 0, &{} \ x\in \partial \Omega ,\ t>0. \end{array} \end{aligned}$$
(3.7)

By the comparison principle, this shows that \(\liminf _{t\rightarrow \infty }S(x,t;\phi ) \ge \Lambda ^-/\left( \mu ^+ + \beta ^+M_I\right) \) uniformly for \(x\in \overline{\Omega }\).
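The comparison step can be made explicit as follows; the constant c and the function w are auxiliary notation introduced only for this sketch. Let \(c:=\mu ^+ + \beta ^+M_I\), let \(t_0\) be a time from which (3.7) holds, and let w be the spatially homogeneous solution of the equality version of (3.7) with \(w(t_0)=\min _{x\in \overline{\Omega }}S(x,t_0)\). Then

$$\begin{aligned} w(t) = \frac{\Lambda ^-}{c} + \left( w(t_0)-\frac{\Lambda ^-}{c}\right) e^{-c(t-t_0)} \longrightarrow \frac{\Lambda ^-}{c} \quad \text {as }t\rightarrow \infty , \end{aligned}$$

and \(S(x,t)\ge w(t)\) for all \(x\in \overline{\Omega }\) and \(t\ge t_0\), which gives the stated lower bound.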

From system (1.2), we can also get

$$\begin{aligned} \begin{array}{ll} \partial _tI \ge \nabla \cdot \big (d_I(x)\nabla I\big ) - (\mu +\alpha +\gamma _1+\gamma _2)(x)I, &{} \ x\in \Omega ,\ t>0, \\ \big [d_I(x)\nabla I(x,t)\big ]\cdot \mathbf{n} = 0, &{} \ x\in \partial \Omega ,\ t>0. \end{array} \end{aligned}$$
(3.8)

and

$$\begin{aligned} \begin{array}{ll} \partial _tR \ge \nabla \cdot \big (d_R(x)\nabla R\big ) - (\mu +\delta )(x)R, &{} \ x\in \Omega ,\ t>0, \\ \big [d_R(x)\nabla R(x,t)\big ]\cdot \mathbf{n} = 0, &{} \ x\in \partial \Omega ,\ t>0. \end{array} \end{aligned}$$
(3.9)

Applying the strong maximum principle (see [32, Proposition 13.1]) to (3.8) and (3.9), we obtain part (ii) of the lemma. \(\square \)

Next, we give the main result concerning the stability of \(E_0\) in terms of the basic reproduction number.

Theorem 3.3

Assume that \(U(\cdot ,t;\phi )\) is the solution of system (1.2) with \(U(\cdot ,0;\phi )=\phi \in \mathbb {X}^+\). Then, the following statements are valid.

  (i)

    If \(\mathcal {R}_0<1\), then the disease-free steady state \(E_0=(S_0,0,0)\) is globally asymptotically stable.

  (ii)

    If \(\mathcal {R}_0>1\), then there exists a constant \(\varepsilon _0>0\) such that any positive solution of (1.2) satisfies

    $$\begin{aligned} \limsup _{t\rightarrow \infty }\left\| \big (S(\cdot ,t),I(\cdot ,t),R(\cdot ,t)\big ) - (S_0,0,0)\right\| > \varepsilon _0. \end{aligned}$$

Proof

  (i)

    By Lemma 3.1, we have \(\lambda ^*(S_0)<0\) when \(\mathcal {R}_0<1\). Thus, there exists a sufficiently small \(\varepsilon \) such that \(\lambda ^*(S_0+\varepsilon )<0\). According to Corollary 2.2, there exists a \(\tau _1>0\) such that

    $$\begin{aligned} S(x,t) \le S_0(x)+\varepsilon , \qquad \text {for all }t\ge \tau _1,\ x\in \Omega . \end{aligned}$$

    It follows from (A1)–(A3) that

    $$\begin{aligned} F(x,S,I)&= IF_1(x,S,I) \le IF_1(x,S_0(x)+\varepsilon ,I) \le I\lim _{h\rightarrow 0}F_1(x,\,S_0(x)+\varepsilon ,\,h) \\&= I\lim _{h\rightarrow 0}\frac{F(x,\,S_0(x)+\varepsilon ,\,h)-F(x,\,S_0(x)+\varepsilon ,\,0)}{h} \\&= I\frac{\partial F}{\partial I}(x,\,S_0(x)+\varepsilon ,\,0) \end{aligned}$$

    for \(t\ge \tau _1\). Hence, by the second equation of (1.2), we have

    $$\begin{aligned} \begin{array}{ll} \partial _tI \le \nabla \cdot \big (d_I(x)\nabla I\big ) + \big [\frac{\partial F}{\partial I}(x,\,S_0(x)+\varepsilon ,\,0) - (\mu +\alpha +\gamma _1+\gamma _2)(x)\big ]I, &{} \ x\in \Omega ,\ t\ge \tau _1, \\ \big [d_I(x)\nabla I(x,t)\big ]\cdot \mathbf{n} = 0, &{} \ x\in \partial \Omega ,\ t\ge \tau _1. \end{array} \end{aligned}$$
    (3.10)

    Let \(\psi (x)\) be the eigenfunction corresponding to the principal eigenvalue \(\lambda ^*(S_0+\varepsilon )<0\). Let \(\xi _1>0\) be such that \(I(x,\tau _1)\le \xi _1\psi (x)\). By the comparison principle, we get

    $$\begin{aligned} I(x,t) \le \xi _1\psi (x)e^{\lambda ^*(S_0(x)+\varepsilon )(t-\tau _1)}, \qquad \text {for all }t\ge \tau _1,\ x\in \Omega . \end{aligned}$$

    This yields that \(\lim _{t\rightarrow \infty }I(x,t) = 0\) uniformly for \(x\in \overline{\Omega }\). Then, the third equation of (1.2) is asymptotic to

    $$\begin{aligned} \partial _tR = \nabla \cdot \big (d_R(x)\nabla R\big ) - (\mu +\delta )(x)R, \end{aligned}$$

    so \(\lim _{t\rightarrow \infty }R(x,t) = 0\) uniformly for \(x\in \overline{\Omega }\). Moreover, the first equation of (1.2) is asymptotic to

    $$\begin{aligned} \partial _tS = \nabla \cdot \big (d_S(x)\nabla S\big ) + \Lambda (x) - \mu (x)S, \end{aligned}$$

    which implies that \(\lim _{t\rightarrow \infty }S(x,t) = S_0(x)\) uniformly for \(x\in \overline{\Omega }\). This completes the proof.

  2. (ii)

    Assume by contradiction that there exists a positive solution of (1.2) such that

    $$\begin{aligned} \limsup _{t\rightarrow \infty }\left\| \big (S(\cdot ,t),I(\cdot ,t),R(\cdot ,t)\big ) - (S_0,0,0)\right\| < \varepsilon _0. \end{aligned}$$

    Then, there exists \(t_1>0\) such that \(S_0-\varepsilon _0< S(x,t) < S_0+\varepsilon _0\) and \(0< I(x,t) < \varepsilon _0\) for \(t\ge t_1\). It follows from (A1)–(A3) that

    $$\begin{aligned} \frac{F(x,S,I)}{I} = F_1(x,S,I) \ge F_1(x,\,S_0-\varepsilon _0,\,\varepsilon _0) \ge \frac{\partial F}{\partial I}(x,\,S_0-\varepsilon _0,\,\varepsilon _0). \end{aligned}$$

    Therefore, \(I(x,t;\phi )\) satisfies

    $$\begin{aligned} \begin{array}{ll} \partial _tI \ge \nabla \cdot \big (d_I(x)\nabla I\big ) + \big [\frac{\partial F}{\partial I}(x,\,S_0-\varepsilon _0,\,\varepsilon _0) - (\mu +\alpha +\gamma _1+\gamma _2)(x)\big ]I, &{} \ x\in \Omega ,\ t\ge t_1, \\ \big [d_I(x)\nabla I(x,t)\big ]\cdot \mathbf{n} = 0, &{} \ x\in \partial \Omega ,\ t\ge t_1. \end{array} \end{aligned}$$
    (3.11)

    For any \(\varepsilon \in \left( 0,\min _{x\in \overline{\Omega }}S_0(x)\right) \), we consider the eigenvalue problem

    $$\begin{aligned} \begin{array}{ll} \lambda \varphi (x) = \nabla \cdot \big (d_I(x)\nabla \varphi (x)\big ) + \big [\frac{\partial F}{\partial I}(x,\,S_0-\varepsilon ,\,\varepsilon ) - (\mu +\alpha +\gamma _1+\gamma _2)(x)\big ]\varphi (x), &{} \ x\in \Omega , \\ \big [d_I(x)\nabla \varphi (x)\big ]\cdot \mathbf{n} = 0, &{} \ x\in \partial \Omega . \end{array} \end{aligned}$$

    Define \(\mathcal {R}_\varepsilon \) as the spectral radius of the operator

    $$\begin{aligned} \mathcal {L}_\varepsilon : \phi \rightarrow \frac{\partial F}{\partial I}(\cdot ,\,S_0-\varepsilon ,\,\varepsilon )\int \limits _0^\infty \mathcal {T}_2(t)\phi \,d t. \end{aligned}$$

    Since \(\lim _{\varepsilon \rightarrow 0}\mathcal {R}_\varepsilon = \mathcal {R}_0>1\), we restrict \(\varepsilon \) to be small enough such that \(\mathcal {R}_\varepsilon >1\). Hence \(\lambda ^*_\varepsilon := \mathbf {s}\big (\nabla \cdot (d_I\nabla ) + \frac{\partial F}{\partial I}(\cdot ,\,S_0-\varepsilon ,\,\varepsilon ) - (\mu +\alpha +\gamma _1+\gamma _2)\big )>0\). As a consequence, we can fix a small \(\varepsilon _0\in \left( 0,\min _{x\in \overline{\Omega }}S_0(x)\right) \) such that \(\lambda ^*_{\varepsilon _0}>0\).

    By assumption, \(I(x,t)>0\) for all \(x\in \overline{\Omega }\) and \(t>0\). Then, by Lemma 3.2(ii), we can choose a sufficiently small number \(\eta >0\) such that \(I(\cdot ,t) \ge \eta \phi ^*_{\varepsilon _0}(\cdot )\), where \(\phi ^*_{\varepsilon _0}(\cdot )\) is a strongly positive eigenfunction corresponding to \(\lambda ^*_{\varepsilon _0}\). Notice that \(u_1(x,t) := \eta \exp \big (\lambda ^*_{\varepsilon _0}(t-t_1)\big )\phi ^*_{\varepsilon _0}(x)\) is a solution to the linear system

    $$\begin{aligned} \begin{array}{ll} \partial _tI = \nabla \cdot \big (d_I(x)\nabla I\big ) + \big [\frac{\partial F}{\partial I}(x,\,S_0-\varepsilon _0,\,\varepsilon _0) - (\mu +\alpha +\gamma _1+\gamma _2)(x)\big ]I, &{} \ x\in \Omega ,\ t\ge t_1, \\ \big [d_I(x)\nabla I(x,t)\big ]\cdot \mathbf{n} = 0, &{} \ x\in \partial \Omega ,\ t\ge t_1. \end{array} \end{aligned}$$
    (3.12)

    It then follows from (3.11) and the comparison principle that

    $$\begin{aligned} I(x,t) \ge \eta \exp \big (\lambda ^*_{\varepsilon _0}(t-t_1)\big )\phi ^*_{\varepsilon _0}(x), \qquad \text {for all }x\in \Omega ,\ t\ge t_1. \end{aligned}$$

    Since \(\lambda ^*_{\varepsilon _0}>0\), this implies that \(I(x,t)\) is unbounded, which is a contradiction.

\(\square \)

4 Uniform persistence

We will now study the uniform persistence of model (1.2) and the existence of an endemic equilibrium.

Theorem 4.1

Suppose that \(\mathcal {R}_0>1\). Then, (1.2) is uniformly persistent in the sense that there exists a constant \(\varepsilon >0\) such that for any \(\phi \in \mathbb {X}^+\) with \(\phi _2(\cdot )\not \equiv 0\), we have

$$\begin{aligned} \liminf _{t\rightarrow \infty }S(x,t;\phi ) \ge \varepsilon ,\quad \liminf _{t\rightarrow \infty }I(x,t;\phi ) \ge \varepsilon ,\quad \liminf _{t\rightarrow \infty }R(x,t;\phi ) \ge \varepsilon ,\quad \text {uniformly for }x\in \overline{\Omega }. \end{aligned}$$
(4.1)

Moreover, (1.2) admits at least one endemic equilibrium \((S^*,I^*,R^*)\).

Proof

Let

$$\begin{aligned} \mathbb {W}_0 = \{(\phi _1,\phi _2,\phi _3)\in \mathbb {X}^+ : \phi _2\not \equiv 0\text { and }\phi _3\not \equiv 0\} \end{aligned}$$

and

$$\begin{aligned} \partial \mathbb {W}_0 = \mathbb {X}^+\setminus \mathbb {W}_0 = \{(\phi _1,\phi _2,\phi _3)\in \mathbb {X}^+ : \phi _2\equiv 0\text { or }\phi _3\equiv 0\}. \end{aligned}$$

By Lemma 3.2(ii), it follows that for any \(\phi \in \mathbb {W}_0\), we have \(I(x,t;\phi )>0\) and \(R(x,t;\phi )>0\) for all \(x\in \overline{\Omega }\), \(t>0\). Hence, \(\Phi _t\mathbb {W}_0 \subseteq \mathbb {W}_0\) for all \(t\ge 0\).

Define

$$\begin{aligned} \mathbb {M}_\partial = \{\phi \in \partial \mathbb {W}_0 : \Phi _t(\phi )\in \partial \mathbb {W}_0\text { for all }t\ge 0\} \end{aligned}$$

and let \(\omega (\phi )\) be the omega limit set of the orbit \(O^+(\phi ) := \{\Phi _t(\phi ) : t\ge 0\}\). We will now show that \(\omega (\phi ) = \{(S_0(\cdot ),0,0)\}\) for all \(\phi \in \mathbb {M}_\partial \).

For any given \(\phi \in \mathbb {M}_\partial \), we have \(\Phi _t(\phi )\in \partial \mathbb {W}_0\). It then follows that for each \(t\ge 0\), either \(I(\cdot ,t;\phi )\equiv 0\) or \(R(\cdot ,t;\phi )\equiv 0\). In the case where \(I(\cdot ,t;\phi )\equiv 0\), we can see that \(R(x,t;\phi )\) satisfies

$$\begin{aligned} \begin{array}{ll} \partial _tR = \nabla \cdot \big (d_R(x)\nabla R\big ) - (\mu +\delta )(x)R, &{} \ x\in \Omega ,\ t>0, \\ \big [d_R(x)\nabla R(x,t)\big ]\cdot \mathbf{n} = 0, &{} \ x\in \partial \Omega ,\ t>0. \end{array} \end{aligned}$$
(4.2)

which implies that \(\lim _{t\rightarrow \infty }R(\cdot ,t;\phi ) = 0\) uniformly for \(x\in \overline{\Omega }\). Thus, for any sufficiently small \(\varepsilon >0\), there is a \(t_2>0\) such that \(R(x,t;\phi )<\varepsilon \) for \(t\ge t_2\). Then, from the first equation of (1.2), we can get

$$\begin{aligned} \begin{array}{ll} \partial _tS \le \nabla \cdot \big (d_S(x)\nabla S\big ) + \Lambda (x) - \mu (x)S + \varepsilon \delta (x), &{} \ x\in \Omega ,\ t>t_2, \\ \big [d_S(x)\nabla S(x,t)\big ]\cdot \mathbf{n} = 0, &{} \ x\in \partial \Omega ,\ t>t_2. \end{array} \end{aligned}$$
(4.3)

Since \(\varepsilon \) is arbitrary, this shows that \(\lim _{t\rightarrow \infty }S(x,t;\phi ) = S_0(x)\). On the other hand, when \(I(\cdot ,t_3;\phi )\not \equiv 0\) for some \(t_3\ge 0\), we obtain by Lemma 3.2(ii) that \(I(x,t;\phi )>0\) for all \(x\in \Omega \), \(t>t_3\). Hence, \(R(\cdot ,t;\phi )\equiv 0\) for all \(t>t_3\), but the last equation of (1.2) implies that \(I(\cdot ,t;\phi )\equiv 0\) for \(t>t_3\), which is a contradiction. Thus, we have proved that \(\omega (\phi ) = \{(S_0(\cdot ),0,0)\}\) for all \(\phi \in \mathbb {M}_\partial \). Moreover, since \(\mathcal {R}_0>1\), it follows from Theorem 3.3(ii) that \(E_0=(S_0,0,0)\) is a uniform weak repeller for \(\mathbb {W}_0\).

Define a continuous function \(p:\mathbb {X}^+\rightarrow [0,\infty )\) by

$$\begin{aligned} p(\phi ) := \min \left\{ \min _{x\in \overline{\Omega }}\phi _2(x),\ \min _{x\in \overline{\Omega }}\phi _3(x)\right\} \quad \text {for }\phi =(\phi _1,\phi _2,\phi _3)\in \mathbb {X}^+. \end{aligned}$$

It follows from Lemma 3.2(ii) that \(p^{-1}(0,\infty )\subseteq \mathbb {W}_0\), and p has the property that if \(p(\phi )>0\) or \(\phi \in \mathbb {W}_0\) with \(p(\phi )=0\), then \(p\big (\Phi _t(\phi )\big )>0\) for all \(t>0\). Thus, p is a generalized distance function for the semiflow \(\Phi _t:\mathbb {X}^+\rightarrow \mathbb {X}^+\) (see [33]).

Note that any forward orbit of \(\Phi _t\) in \(\mathbb {M}_\partial \) converges to \(E_0\). Furthermore, the claim above implies that \(E_0\) is isolated in \(\mathbb {X}^+\) and \(W^S(E_0)\cap \mathbb {W}_0 = \emptyset \), where \(W^S(E_0)\) is the stable set of \(E_0\). Moreover, it is clear that there are no cycles in \(\mathbb {M}_\partial \) from \(E_0\) to \(E_0\). By [33, Theorem 3], it follows that there exists a constant \(\eta _1>0\) such that \(\min \{p(\psi ) : \psi \in \omega (\phi )\}>\eta _1\) for any \(\phi \in \mathbb {W}_0\). Hence,

$$\begin{aligned} \liminf _{t\rightarrow \infty }I(\cdot ,t;\phi ), \liminf _{t\rightarrow \infty }R(\cdot ,t;\phi ) \ge \eta _1 \quad \text {for all }\phi \in \mathbb {W}_0. \end{aligned}$$

Further, it follows from Lemma 3.2(i) that

$$\begin{aligned} \liminf _{t\rightarrow \infty }S(\cdot ,t;\phi ) \ge \frac{\Lambda ^-}{\mu ^+ + \beta ^+M_I} =: \eta _2 \quad \text {for all }\phi \in \mathbb {W}_0. \end{aligned}$$

Thus, (4.1) holds by taking \(\varepsilon =\min \{\eta _1,\eta _2\}\), which proves the uniform persistence result. By Theorem 3.7 and Remark 3.10 in [34], it follows that \(\Phi _t:\mathbb {W}_0\rightarrow \mathbb {W}_0\) has a global attractor. It then follows from [34, Theorem 4.7] that \(\Phi _t\) has a steady state \((S^*,I^*,R^*)\in \mathbb {W}_0\), which is positive by Lemma 3.2(ii). Therefore, \((S^*,I^*,R^*)\) is an endemic equilibrium of model (1.2). \(\square \)

5 Bifurcation analysis

In this section, we will use \(\gamma _1\) (the transfer rate from the infected class to the susceptible class) as the main bifurcation parameter to perform the bifurcation analysis of model (1.2). To apply the local and global bifurcation theory by Crandall and Rabinowitz [35], we need to assume that the diffusion coefficient \(d_I\) and the parameter \(\gamma _1\) are constant, where \(d_I>0\) and \(\gamma _1\ge 0\). However, we still allow all other parameters and the diffusion rates \(d_S(x)\) and \(d_R(x)\) to be spatially variable.

A steady state of model (1.2) is a solution of the elliptic problem

$$\begin{aligned} \begin{array}{ll} \nabla \cdot \big (d_S(x)\nabla \tilde{S}(x)\big ) + \Lambda (x) - \mu (x)\tilde{S}(x) - F\big (x,\tilde{S}(x),\tilde{I}(x)\big ) + \gamma _1(x)\tilde{I}(x) &{} \\ \quad + \delta (x)\tilde{R}(x) = 0, &{} \ x\in \Omega , \\ \nabla \cdot \big (d_I(x)\nabla \tilde{I}(x)\big ) + F\big (x,\tilde{S}(x),\tilde{I}(x)\big ) - (\mu +\alpha +\gamma _1+\gamma _2)(x)\tilde{I}(x) = 0, &{} \ x\in \Omega , \\ \nabla \cdot \big (d_R(x)\nabla \tilde{R}(x)\big ) + \gamma _2(x)\tilde{I}(x) - \mu (x)\tilde{R}(x) - \delta (x)\tilde{R}(x) = 0, &{} \ x\in \Omega , \\ \big [d_S(x)\nabla \tilde{S}(x)\big ]\cdot \mathbf {n} = \big [d_I(x)\nabla \tilde{I}(x)\big ]\cdot \mathbf {n} = \big [d_R(x)\nabla \tilde{R}(x)\big ]\cdot \mathbf {n} = 0, &{} \ x\in \partial \Omega . \end{array} \end{aligned}$$
(5.1)

When \(d_I(x)=d_I\) and \(\gamma _1(x)=\gamma _1\) are constant, system (5.1) becomes

$$\begin{aligned} \begin{array}{ll} \nabla \cdot \big (d_S(x)\nabla \tilde{S}(x)\big ) + \Lambda (x) - \mu (x)\tilde{S}(x) - F\big (x,\tilde{S}(x),\tilde{I}(x)\big ) + \gamma _1\tilde{I}(x) &{} \\ \quad + \delta (x)\tilde{R}(x) = 0, &{} \ x\in \Omega , \\ d_I\Delta \tilde{I}(x) + F\big (x,\tilde{S}(x),\tilde{I}(x)\big ) - (\mu +\alpha +\gamma _2)(x)\tilde{I}(x) - \gamma _1\tilde{I}(x)= 0, &{} \ x\in \Omega , \\ \nabla \cdot \big (d_R(x)\nabla \tilde{R}(x)\big ) + \gamma _2(x)\tilde{I}(x) - \mu (x)\tilde{R}(x) - \delta (x)\tilde{R}(x) = 0, &{} \ x\in \Omega , \\ \big [d_S(x)\nabla \tilde{S}(x)\big ]\cdot \mathbf {n} = \big [d_I\nabla \tilde{I}(x)\big ]\cdot \mathbf {n} = \big [d_R(x)\nabla \tilde{R}(x)\big ]\cdot \mathbf {n} = 0, &{} \ x\in \partial \Omega . \end{array} \end{aligned}$$
(5.2)

It is easy to see that \((S_0(\cdot ),0,0)\) is a semi-trivial steady state solution of (5.2). Denote by \(\gamma _1^*\) the principal eigenvalue of the eigenvalue problem

$$\begin{aligned} \begin{array}{ll} d_I\Delta \psi (x) + \big [\frac{\partial F}{\partial I}\big (x,S_0(x),0\big ) - (\mu +\alpha +\gamma _2)(x)\big ]\psi (x) = \gamma _1\psi (x), &{} \ x\in \Omega , \\ \big [d_I\nabla \psi (x)\big ]\cdot \mathbf {n} = 0, &{} \ x\in \partial \Omega , \end{array} \end{aligned}$$
(5.3)

associated with a positive eigenfunction \(\psi _0(x)\), which is uniquely determined by the normalization \(\max _{x\in \overline{\Omega }}\psi _0(x) = 1\). By expression (3.5), we know that \(\mathcal {R}_0\) is decreasing with respect to \(\gamma _1\). Furthermore, (5.3) with \(\gamma _1=\gamma _1^*\) is equivalent to system (3.3) with \(\gamma _1(x)=\gamma _1^*\), \(d_I(x)=d_I\) and \(\lambda =0\), and by Lemma 3.1, we have that \(\lambda ^*=0\) if and only if \(\mathcal {R}_0=1\). This shows that the condition \(\gamma _1=\gamma _1^*\) is equivalent to \(\mathcal {R}_0=1\).

Next, we define a function

$$\begin{aligned} H(x) = \frac{\partial F}{\partial I}\big (x,S_0(x),0\big ) - (\mu +\alpha +\gamma _2)(x). \end{aligned}$$
(5.4)

Then, it follows that \(\gamma _1^* = H\) when H is a constant.
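As a rough illustration of the constant case, the sketch below evaluates \(H\), \(\gamma _1^*\) and \(\mathcal {R}_0\) for the parameter values (6.1) and the saturated incidence (6.2) used later in the numerical section. It assumes that every coefficient is spatially constant, so that \(S_0=\Lambda /\mu \) and \(\mathcal {R}_0\) reduces to the quotient of \(\frac{\partial F}{\partial I}(S_0,0)\) by the total removal rate. The variable and function names are illustrative only.

```python
# Constant-coefficient illustration of H, gamma_1^* and R_0(gamma_1).
# Values are taken from (6.1)-(6.2); the closed forms below are assumed
# to hold only when every coefficient is spatially constant.
Lambda, mu, alpha, gamma2, delta = 10.0, 0.001, 0.01, 0.07, 0.02
beta = 0.0001

S0 = Lambda / mu                    # constant solution of Lambda - mu*S = 0
dF_dI = beta * S0                   # dF/dI(S0, 0) for F = beta*S*I/(1 + a1*I)
H = dF_dI - (mu + alpha + gamma2)   # definition (5.4)
gamma1_star = H                     # principal eigenvalue of (5.3) for constant H


def basic_reproduction_number(gamma1):
    """R_0 for constant coefficients: infection rate over total removal rate."""
    return dF_dI / (mu + alpha + gamma1 + gamma2)


print("gamma_1^* =", gamma1_star)
print("R_0 at gamma_1 = 0.02:", basic_reproduction_number(0.02))
print("R_0 at gamma_1 = gamma_1^*:", basic_reproduction_number(gamma1_star))
```

Under these assumptions the value \(\gamma _1=0.02\) from (6.1) lies below \(\gamma _1^*\), and \(\mathcal {R}_0\) equals one exactly at \(\gamma _1=\gamma _1^*\), in line with the equivalence stated above.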

We will now study the case where H is not constant and it could change sign in \(\Omega \). Consider the eigenvalue problem with indefinite weight

$$\begin{aligned} \begin{array}{ll} \Delta \varphi (x) + \Lambda H(x)\varphi (x) = 0, &{} \ x\in \Omega , \\ \big [\nabla \varphi (x)\big ]\cdot \mathbf {n} = 0, &{} \ x\in \partial \Omega . \end{array} \end{aligned}$$
(5.5)

It follows from [36, Theorem 4.2] that the problem (5.5) has a nonzero principal eigenvalue \(\Lambda _0=\Lambda _0(H)\) if and only if H(x) changes sign in \(\Omega \) and \(\int \limits _\Omega H(x)\,d x\ne 0\).

It then follows from [36, Proposition 4.4] that the sign of the principal eigenvalue \(\gamma _1^*\) of the problem (5.3) is described by the following result.

Lemma 5.1

The following statements hold.

  1. (i)

    If \(\int \limits _\Omega H(x)\,d x \ge 0\), then \(\gamma _1^*>0\) for all \(d_I>0\).

  2. (ii)

    If \(\int \limits _\Omega H(x)\,d x < 0\), then

    $$\begin{aligned} {\left\{ \begin{array}{ll} \gamma _1^*>0 &{}{} \mathrm {{for\; all\;}} d_I<\frac{1}{\Lambda _0(H)}, \\ \gamma _1^*<0 &{}{} \mathrm {{for\; all\;}} d_I>\frac{1}{\Lambda _0(H)}. \end{array}\right. } \end{aligned}$$
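The sign change described in Lemma 5.1(ii) can be observed numerically. The following sketch approximates \(\gamma _1^*\) for problem (5.3) on the one-dimensional domain \((0,20)\) by a cell-centred finite-difference discretization with zero-flux boundaries, and finds the largest eigenvalue of the resulting symmetric tridiagonal matrix by Sturm-sequence bisection. The weight \(H(x)=\cos (\pi x/20)-0.3\), the grid size and the function names are hypothetical choices made only for illustration; they are not taken from the model.

```python
import math


def principal_eigenvalue(d_I, H, h):
    """Largest eigenvalue of the cell-centred discretization of
    d_I * psi'' + H(x) * psi = gamma * psi with zero-flux boundaries."""
    n = len(H)
    off = d_I / h**2                       # coupling between neighbouring cells
    # Boundary cells have one neighbour, interior cells have two.
    diag = [H[i] - (1 if i in (0, n - 1) else 2) * off for i in range(n)]

    def count_below(x):
        """Sturm count: number of eigenvalues strictly below x."""
        count, q = 0, 1.0
        for i in range(n):
            q = diag[i] - x - (off * off / q if i > 0 else 0.0)
            if q == 0.0:
                q = 1e-200                 # avoid dividing by an exact zero pivot
            if q < 0.0:
                count += 1
        return count

    # Gershgorin discs place every eigenvalue in [min(H) - 4*off, max(H)].
    lo, hi = min(H) - 4.0 * off - 1.0, max(H) + 1.0
    for _ in range(200):                   # bisection on the top eigenvalue
        mid = 0.5 * (lo + hi)
        if count_below(mid) == n:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


# Hypothetical sign-changing weight on (0, 20) with negative mean.
n_cells, length = 50, 20.0
h = length / n_cells
H = [math.cos(math.pi * (i + 0.5) * h / length) - 0.3 for i in range(n_cells)]

for d_I in (0.01, 1.0, 100.0, 1000.0):
    print(d_I, principal_eigenvalue(d_I, H, h))
```

For this weight the integral of \(H\) is negative while \(H\) is positive near \(x=0\), so the computed eigenvalue is positive for slow diffusion and negative for fast diffusion. Bisection is used here only because it needs no external library; any symmetric eigenvalue solver would serve equally well.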

We next regard \(\gamma _1\) as a bifurcation parameter and investigate a local branch of positive solutions of (5.2) that bifurcates from the branch of semi-trivial solutions \(\big \{(S_0(\cdot ),0,0,\gamma _1) :\gamma _1\ge 0\big \}\). We note that \(S_0(\cdot )\) is independent of the parameter \(\gamma _1\).

Theorem 5.2

Let

$$\begin{aligned} \mathcal {S} = \big \{(S,I,R,\gamma _1)\in X\times \mathbb {R}_+ : (S,I,R,\gamma _1) = (S_0(\cdot ),0,0,\gamma _1)\big \}, \end{aligned}$$

be the set of semi-trivial solutions of (5.2), where

$$\begin{aligned} X = \big \{&(u,v,w)\in W^{2,p}(\Omega )\times W^{2,p}(\Omega )\times W^{2,p}(\Omega ) :\\&\big [d_S(x)\nabla u(x)\big ]\cdot \mathbf {n}=\big [d_I\nabla v(x)\big ]\cdot \mathbf {n}=\big [d_R(x)\nabla w(x)\big ]\cdot \mathbf {n}=0,\ \forall x\in \partial \Omega \big \} \end{aligned}$$

for \(p>n\). Then, a branch of positive solutions of (5.2) bifurcates from \(\mathcal {S}\) if and only if \(\gamma _1=\gamma _1^*\). More precisely, all positive solutions of (5.2) near \((S,I,R,\gamma _1)=(S_0,0,0,\gamma _1^*)\) can be parametrized as a smooth curve

$$\begin{aligned} \begin{aligned} \Gamma _1 = \Big \{&(S,I,R,\gamma _1) = \Big (S_0 + \big (\phi _0+\overline{S}(s)\big )s,\ \big (\psi _0+\overline{I}(s)\big )s,\ \big (\chi _0+\overline{R}(s)\big )s,\ \gamma _1(s)\Big ) \in X\times \mathbb {R}_+ : \\&0<s\le \epsilon _0\Big \} \end{aligned} \end{aligned}$$
(5.6)

for some \(\epsilon _0>0\). Here, \(\big (\overline{S}(s),\overline{I}(s),\overline{R}(s),\gamma _1(s)\big )\) is a smooth function of s that satisfies

$$\begin{aligned} \big (\overline{S}(0),\overline{I}(0),\overline{R}(0),\gamma _1(0)\big )=(0,0,0,\gamma _1^*) \text { and } \int \limits _\Omega \overline{I}(s)\psi _0(x)\,d x=0, \end{aligned}$$
(5.7)

\(\psi _0>0\) is the principal eigenfunction of (5.3), and \(\phi _0\), \(\chi _0\) satisfy

$$\begin{aligned} \begin{array}{ll} -\nabla \cdot \big (d_S(x)\nabla \phi _0(x)\big ) + \mu (x)\phi _0(x) = \left[ \gamma _1^* - \frac{\partial F}{\partial I}(x,S_0,0)\right] \psi _0(x) + \delta (x)\chi _0(x), &{} \ x\in \Omega , \\ -\nabla \cdot \big (d_R(x)\nabla \chi _0(x)\big ) + [\mu (x) + \delta (x)]\chi _0(x) = \gamma _2(x)\psi _0(x) &{} \ x\in \Omega , \\ \big [d_S(x)\nabla \phi _0(x)\big ]\cdot \mathbf {n} = \big [d_R(x)\nabla \chi _0(x)\big ]\cdot \mathbf {n} = 0, &{} \ x\in \partial \Omega . \end{array} \end{aligned}$$
(5.8)

Moreover, \(\gamma _1'(0)\) can be calculated by

$$\begin{aligned} \gamma _1'(0) = \frac{L}{\int \limits _\Omega \psi _0^2(x)\,d x}, \end{aligned}$$

where

$$\begin{aligned} L = \int \limits _\Omega \left[ \frac{1}{2}\cdot \frac{\partial ^2F}{\partial I^2}\big (x,S_0(x),0\big )\psi _0^3(x) + \frac{\partial ^2F}{\partial S\partial I}\big (x,S_0(x),0\big )\phi _0(x)\psi _0^2(x)\right] \,d x. \end{aligned}$$
(5.9)

Proof

We begin by making the change of variables \(u=\tilde{S}-S_0\), \(v=\tilde{I}\), \(w=\tilde{R}\), so the disease-free steady state \((\tilde{S},\tilde{I},\tilde{R})=\big (S_0(\cdot ),0,0\big )\) is shifted to the origin \((u,v,w)=(0,0,0)\).

Let \(Y=L^p(\Omega )\). Define a mapping \(\mathcal {G}:X\times \mathbb {R}_+\rightarrow Y\times Y\times Y\) by

$$\begin{aligned} \begin{aligned} \mathcal {G}(u,v,w,\gamma _1)&= \begin{pmatrix} \nabla \cdot (d_S(x)\nabla u) + f(u+S_0,v,w,\gamma _1) \\ d_I\Delta v + g(u+S_0,v,w,\gamma _1) \\ \nabla \cdot (d_R(x)\nabla w) + h(u+S_0,v,w,\gamma _1) \end{pmatrix} \\&= \begin{pmatrix} \nabla \cdot (d_S(x)\nabla u) + \Lambda (x) - \mu (x)(u+S_0) - F(x,u+S_0,v) + \gamma _1v + \delta (x)w \\ d_I\Delta v + F(x,u+S_0,v) - (\mu +\alpha +\gamma _2)(x)v - \gamma _1v \\ \nabla \cdot (d_R(x)\nabla w) + \gamma _2(x)v - \mu (x)w - \delta (x)w \end{pmatrix}. \end{aligned} \end{aligned}$$

Let \(\tilde{f}(u,v,w,\gamma _1) = f(u+S_0,v,w,\gamma _1)\), \(\tilde{g}(u,v,w,\gamma _1) = g(u+S_0,v,w,\gamma _1)\) and \(\tilde{h}(u,v,w,\gamma _1) = h(u+S_0,v,w,\gamma _1)\). The Fréchet derivative of \(\mathcal {G}\) with respect to \((u,v,w)\) at \((u,v,w,\gamma _1)=(0,0,0,\gamma _1)\) is given by

$$\begin{aligned} \begin{aligned}&\mathcal {G}_{(u,v,w)}(0,0,0,\gamma _1)[\phi ,\psi ,\chi ] \\&\quad = \begin{pmatrix} \nabla \cdot (d_S(x)\nabla \phi ) + \tilde{f}_u(0,0,0,\gamma _1)\phi + \tilde{f}_v(0,0,0,\gamma _1)\psi + \tilde{f}_w(0,0,0,\gamma _1)\chi \\ d_I\Delta \psi + \tilde{g}_u(0,0,0,\gamma _1)\phi + \tilde{g}_v(0,0,0,\gamma _1)\psi + \tilde{g}_w(0,0,0,\gamma _1)\chi \\ \nabla \cdot (d_R(x)\nabla \chi ) + \tilde{h}_u(0,0,0,\gamma _1)\phi + \tilde{h}_v(0,0,0,\gamma _1)\psi + \tilde{h}_w(0,0,0,\gamma _1)\chi \end{pmatrix} \\&\quad = \begin{pmatrix} \nabla \cdot (d_S(x)\nabla \phi ) - \mu (x)\phi + \left[ \gamma _1 - \frac{\partial F}{\partial I}(x,S_0,0)\right] \psi + \delta (x)\chi \\ d_I\Delta \psi + \left[ \frac{\partial F}{\partial I}(x,S_0,0) - (\mu +\alpha +\gamma _2)(x) - \gamma _1\right] \psi \\ \nabla \cdot (d_R(x)\nabla \chi ) + \gamma _2(x)\psi - [\mu (x) + \delta (x)]\chi \end{pmatrix}. \end{aligned} \end{aligned}$$

In particular,

$$\begin{aligned} \mathcal {G}_{(u,v,w)}(0,0,0,\gamma _1^*)[\phi ,\psi ,\chi ] = \begin{pmatrix} \nabla \cdot (d_S(x)\nabla \phi ) - \mu (x)\phi - \left[ \frac{\partial F}{\partial I}(x,S_0,0) - \gamma _1^*\right] \psi + \delta (x)\chi \\ d_I\Delta \psi - \gamma _1^*\psi + H(x)\psi \\ \nabla \cdot (d_R(x)\nabla \chi ) + \gamma _2(x)\psi - [\mu (x) + \delta (x)]\chi \end{pmatrix}. \end{aligned}$$

We recall that the positive eigenfunction \(\psi _0\) of (5.3) satisfies

$$\begin{aligned} d_I\Delta \psi _0 - \gamma _1^*\psi _0 + H(x)\psi _0 = 0. \end{aligned}$$

If we define

$$\begin{aligned} \chi _0 = -\big [\nabla \cdot (d_R(x)\nabla ) - \mu (x) - \delta (x)\big ]^{-1}\gamma _2(x)\psi _0 \end{aligned}$$
(5.10)

and

$$\begin{aligned} \phi _0 = \big [\nabla \cdot (d_S(x)\nabla ) - \mu (x)\big ]^{-1}\Big [\left( \tfrac{\partial F}{\partial I}(x,S_0,0) - \gamma _1^*\right) \psi _0 - \delta (x)\chi _0\Big ], \end{aligned}$$
(5.11)

we can see that \((\phi _0,\psi _0,\chi _0)\) satisfies

$$\begin{aligned} \mathcal {G}_{(u,v,w)}(0,0,0,\gamma _1^*)[\phi _0,\psi _0,\chi _0]=0. \end{aligned}$$

Hence, \((\phi _0,\psi _0,\chi _0)\in \ker \mathcal {G}_{(u,v,w)}(0,0,0,\gamma _1^*)\setminus \{0\}\), so \(\ker \mathcal {G}_{(u,v,w)}(0,0,0,\gamma _1^*)\) is non-trivial.

On the other hand, it follows from the definition of \(\gamma _1^*\) that the principal eigenvalue \(\lambda ^*\) of (3.3) equals zero when \(\gamma _1=\gamma _1^*\). Since \(\lambda ^*=0\) is a simple eigenvalue, the eigenfunction \(\psi _0\) is unique up to constant multiples. Moreover, \(\chi _0\) and \(\phi _0\) are uniquely determined by (5.10) and (5.11) for a fixed \(\psi _0\). This implies that \(\ker \mathcal {G}_{(u,v,w)}(0,0,0,\gamma _1^*)={\text {span}}\left\{ \left( \phi _0,\psi _0,\chi _0\right) \right\} \).

It is easy to see that \((h_1,h_2,h_3)\in {\text {range}}\mathcal {G}_{(u,v,w)}(0,0,0,\gamma _1^*)\) if and only if there exists \((\phi ,\psi ,\chi )\in X\) such that

$$\begin{aligned} \begin{aligned} h_1&= \nabla \cdot (d_S(x)\nabla \phi ) - \mu (x)\phi - \left[ \tfrac{\partial F}{\partial I}(x,S_0,0) - \gamma _1^*\right] \psi + \delta (x)\chi , \\ h_2&= d_I\Delta \psi - \gamma _1^*\psi + H(x)\psi , \\ h_3&= \nabla \cdot (d_R(x)\nabla \chi ) + \gamma _2(x)\psi - [\mu (x) + \delta (x)]\chi . \end{aligned} \end{aligned}$$
(5.12)

It follows from the Fredholm alternative theorem that the second equation of the above system is solvable if and only if \(\int \limits _\Omega h_2(x)\psi _0(x)\,d x = 0\). Then, for the obtained solution \(\psi \), the third equation of (5.12) has a unique solution

$$\begin{aligned} \chi = \big [\nabla \cdot (d_R(x)\nabla ) - \mu (x) - \delta (x)\big ]^{-1}\big [h_3 - \gamma _2(x)\psi \big ]. \end{aligned}$$

Then, for such \(\psi \) and \(\chi \), the first equation of (5.12) has a unique solution

$$\begin{aligned} \phi = \big [\nabla \cdot (d_S(x)\nabla ) - \mu (x)\big ]^{-1}\Big [h_1 + \left[ \tfrac{\partial F}{\partial I}(x,S_0,0) - \gamma _1^*\right] \psi - \delta (x)\chi \Big ]. \end{aligned}$$

From this, it follows that the range of \(\mathcal {G}_{(u,v,w)}(0,0,0,\gamma _1^*)\) is given by

$$\begin{aligned} {\text {range}}\mathcal {G}_{(u,v,w)}(0,0,0,\gamma _1^*) = \left\{ (h_1,h_2,h_3)\in Y^3 : \int \limits _\Omega h_2(x)\psi _0(x)\,d x=0\right\} . \end{aligned}$$

The above equation places a single linear constraint on \(h_2\) and no constraint on \(h_1\) or \(h_3\). In other words, the range of \(\mathcal {G}_{(u,v,w)}(0,0,0,\gamma _1^*)\) is the kernel of the nonzero bounded linear functional \((h_1,h_2,h_3)\mapsto \int \limits _\Omega h_2(x)\psi _0(x)\,d x\) on \(Y^3\), so it is a closed subspace of codimension one; that is, \({\text {codim}}{\text {range}}\mathcal {G}_{(u,v,w)}(0,0,0,\gamma _1^*)=1\).

Furthermore, we can calculate

$$\begin{aligned} \begin{aligned}&\mathcal {G}_{(u,v,w),\gamma _1}(0,0,0,\gamma _1^*)[\phi _0,\psi _0,\chi _0] \\&\quad = \begin{pmatrix} \tilde{f}_{u,\gamma _1}(0,0,0,\gamma _1^*)\phi _0 + \tilde{f}_{v,\gamma _1}(0,0,0,\gamma _1^*)\psi _0 + \tilde{f}_{w,\gamma _1}(0,0,0,\gamma _1^*)\chi _0 \\ \tilde{g}_{u,\gamma _1}(0,0,0,\gamma _1^*)\phi _0 + \tilde{g}_{v,\gamma _1}(0,0,0,\gamma _1^*)\psi _0 + \tilde{g}_{w,\gamma _1}(0,0,0,\gamma _1^*)\chi _0 \\ \tilde{h}_{u,\gamma _1}(0,0,0,\gamma _1^*)\phi _0 + \tilde{h}_{v,\gamma _1}(0,0,0,\gamma _1^*)\psi _0 + \tilde{h}_{w,\gamma _1}(0,0,0,\gamma _1^*)\chi _0 \end{pmatrix} \\&\quad = \begin{pmatrix} \psi _0 \\ -\psi _0 \\ 0 \end{pmatrix}, \end{aligned} \end{aligned}$$

where \(\mathcal {G}_{(u,v,w),\gamma _1}\) denotes the second partial Fréchet derivative of \(\mathcal {G}\) with respect to the variables \((u,v,w)\) and \(\gamma _1\). Similarly, we denote by \(\tilde{f}_{u,\gamma _1}\) the second partial derivative of \(\tilde{f}\) with respect to u and \(\gamma _1\), and so on.

Since \(\int \limits _\Omega \big [-\psi _0(x)\big ]\psi _0(x)\,d x=-\int \limits _\Omega \big [\psi _0(x)\big ]^2\,d x < 0\), we obtain

$$\begin{aligned} \mathcal {G}_{(u,v,w),\gamma _1}(0,0,0,\gamma _1^*)[\phi _0,\psi _0,\chi _0] \notin {\text {range}}\mathcal {G}_{(u,v,w)}(0,0,0,\gamma _1^*). \end{aligned}$$

This allows us to apply the theorem of bifurcation from a simple eigenvalue [35] to conclude that the set of positive solutions to (5.2) near \((S,I,R,\gamma _1) = \big (S_0(\cdot ),0,0,\gamma _1^*\big )\) is a curve of the form (5.6). We also note that the possibility of other bifurcation points different from \(\gamma _1=\gamma _1^*\) is excluded by virtue of the Krein–Rutman theorem. Furthermore, according to the direction of bifurcation formula by Shi [37], we have

$$\begin{aligned} \gamma _1'(0) = -\frac{\left\langle l,\; \mathcal {G}_{(u,v,w),(u,v,w)}(0,0,0,\gamma _1^*)[\phi _0,\psi _0,\chi _0]^2\right\rangle }{2\left\langle l,\; \mathcal {G}_{(u,v,w),\gamma _1}(0,0,0,\gamma _1^*)[\phi _0,\psi _0,\chi _0]\right\rangle }, \end{aligned}$$

where the linear functional \(l:Y^3\rightarrow \mathbb {R}\) is defined as

$$\begin{aligned} \langle l,\; [h_1,h_2,h_3]\rangle = \int \limits _\Omega h_2(x)\psi _0(x)\,d x, \end{aligned}$$

while \(\mathcal {G}_{(u,v,w),(u,v,w)}\) and \(\mathcal {G}_{(u,v,w),\gamma _1}\) denote the second partial Fréchet derivatives of \(\mathcal {G}\).

By direct calculation, we can see that the second component of

$$\begin{aligned} \mathcal {G}_{(u,v,w),(u,v,w)}(0,0,0,\gamma _1^*)[\phi _0,\psi _0,\chi _0]^2 \end{aligned}$$

takes the form

$$\begin{aligned} G_0(x)&:= \tilde{g}_{uu}(0,0,0,\gamma _1^*)\phi _0^2 + \tilde{g}_{vv}(0,0,0,\gamma _1^*)\psi _0^2 + \tilde{g}_{ww}(0,0,0,\gamma _1^*)\chi _0^2 \\&\quad + 2\tilde{g}_{uv}(0,0,0,\gamma _1^*)\phi _0\psi _0 + 2\tilde{g}_{vw}(0,0,0,\gamma _1^*)\psi _0\chi _0 + 2\tilde{g}_{wu}(0,0,0,\gamma _1^*)\chi _0\phi _0 \\&= \frac{\partial ^2F}{\partial I^2}\big (x,S_0(x),0\big )\psi _0^2 + 2\frac{\partial ^2F}{\partial S\partial I}\big (x,S_0(x),0\big )\phi _0\psi _0. \end{aligned}$$

Thus,

$$\begin{aligned} \gamma _1'(0)&= -\frac{\int \limits _\Omega G_0(x)\psi _0(x)\,d x}{2\left[ -\int \limits _\Omega \psi _0^2(x)\,d x\right] } = \frac{L}{\int \limits _\Omega \psi _0^2(x)\,d x}, \end{aligned}$$

where L is given by (5.9). This completes the proof. \(\square \)
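To make the direction formula concrete, the sketch below evaluates \(\phi _0\), \(\chi _0\) and \(\gamma _1'(0)\) from (5.8) and (5.9) in the simplest setting. It assumes that all coefficients are constant, so that \(\psi _0\equiv 1\) and (5.8) becomes algebraic, and it uses the saturated incidence \(F=\beta SI/(1+a_1I)\) with the values in (6.1) and \(a_1=0.5\). For this incidence the second derivatives of \(F\) at \((S_0,0)\) are \(-2a_1\beta S_0\) with respect to \(I\) twice and \(\beta \) for the mixed derivative. Variable names are illustrative.

```python
# Sign of gamma_1'(0) from (5.8)-(5.9) when all coefficients are constant.
Lambda, mu, alpha, gamma2, delta = 10.0, 0.001, 0.01, 0.07, 0.02
beta, a1 = 0.0001, 0.5

S0 = Lambda / mu
F_I = beta * S0                  # dF/dI at (S0, 0)
F_II = -2.0 * a1 * beta * S0     # d^2F/dI^2 at (S0, 0)
F_SI = beta                      # d^2F/dSdI at (S0, 0)

gamma1_star = F_I - (mu + alpha + gamma2)
psi0 = 1.0                                   # constant eigenfunction with max = 1
chi0 = gamma2 * psi0 / (mu + delta)          # second equation of (5.8)
phi0 = ((gamma1_star - F_I) * psi0 + delta * chi0) / mu   # first equation of (5.8)

# The integrands are constant, so |Omega| cancels in L / int(psi0^2).
L_density = 0.5 * F_II * psi0**3 + F_SI * phi0 * psi0**2
gamma1_prime = L_density / psi0**2

print("phi_0 =", phi0, " chi_0 =", chi0)
print("gamma_1'(0) =", gamma1_prime)
```

In this constant setting \(\phi _0<0\) and \(\gamma _1'(0)<0\), so the local branch leaves the bifurcation point towards \(\gamma _1<\gamma _1^*\), that is, towards \(\mathcal {R}_0>1\).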

Next, we use the global bifurcation theory to give an extension of the local bifurcation branch (5.6). We define the set of positive solutions of (5.2) as

$$\begin{aligned} \Sigma = \big \{(S,I,R,\gamma _1)\in X\times \mathbb {R}_+ : (S,I,R,\gamma _1) \text { is a solution of (5.2) with }S,I,R,\gamma _1>0\big \}. \end{aligned}$$

We can thus obtain the following result.

Theorem 5.3

The set of positive solutions of (5.2) with bifurcation parameter \(\gamma _1\) forms a continuum \(\Sigma _1\subset X\times \mathbb {R}_+\) that bifurcates from the disease-free steady state \((S_0,0,0)\) when \(\mathcal {R}_0=1\).

Moreover, \(\Sigma _1\) is a connected component of \(\overline{\Sigma }\) containing the curve \(\Gamma _1\). The projection \({\text {proj}}_{\gamma _1}\Sigma _1\) of \(\Sigma _1\) into the \(\gamma _1\)-axis satisfies

$$\begin{aligned} {\text {proj}}_{\gamma _1}\Sigma _1 = [0,\gamma _1^*], \end{aligned}$$

and \(\Sigma _1\) contains a point of the form \((\hat{S},\hat{I},\hat{R},0)\), where \((\hat{S},\hat{I},\hat{R})\) is a positive solution of (5.2) with \(\gamma _1=0\).

Proof

For the local bifurcation branch obtained in Theorem 5.2, let \(\Sigma _1\) be any maximal extension in \(X\times \mathbb {R}_+\) as a connected set of solutions of (5.2). From the remarks in [38], we can verify that all conditions in [38, Theorem 4.4] are satisfied. Therefore, \(\Sigma _1\) must satisfy one of the following:

  1. (i)

    it is not compact; or

  2. (ii)

    it contains a point \((S_0,0,0,\hat{\gamma }_1)\) with \(\hat{\gamma }_1\ne \gamma _1^*\); or

  3. (iii)

    it contains a point \((S_0+U,V,W,\gamma _1)\) where \((U,V,W)\ne 0\) and \((U,V,W)\) is in the complement of \({\text {span}}\left\{ (\phi _0,\psi _0,\chi _0)\right\} \).

Since the eigenvalue of (5.3) with positive eigenfunction is unique, it is clear that (ii) cannot occur.

Now, by Theorem 3.3(i), system (5.2) has no positive steady state solutions for \(\mathcal {R}_0<1\), that is, when \(\gamma _1 > \gamma _1^*\). This implies that \({\text {proj}}_{\gamma _1}\Sigma _1 \subset [0,\gamma _1^*]\). On the other hand, due to Theorem 4.1, there exists a positive steady state for all \(\gamma _1 < \gamma _1^*\) (which corresponds to \(\mathcal {R}_0>1\)). Therefore, we can conclude that \({\text {proj}}_{\gamma _1}\Sigma _1 = [0,\gamma _1^*]\). This result, together with the boundedness of solutions, shows that \(\Sigma _1\) must be compact.

The above discussion implies that case (iii) holds. Then, by a standard argument with the global bifurcation theorem, we can find a certain \((\hat{S},\hat{I},\hat{R},\hat{\gamma }_1)\in \Sigma _1\) such that \((\hat{S},\hat{I},\hat{R},\hat{\gamma }_1)\) is in the boundary of the set of positive solutions \(\Sigma \). Since there are no equilibrium states with \(\hat{S}=0\), \(\hat{I}=0\) or \(\hat{R}=0\) other than the disease-free steady state, the only possibility is \(\hat{\gamma }_1=0\) and \(\hat{S},\hat{I},\hat{R}>0\). Hence, \((\hat{S},\hat{I},\hat{R})\) is a positive solution of system (5.2) with \(\gamma _1=0\), which completes the proof. \(\square \)

6 Numerical simulations

In this section, we will perform some numerical simulations for system (1.2) to investigate the dynamics of the solutions as some of the parameters are varied. For simplicity, we consider the one-dimensional domain \(\Omega =(0,20)\subset \mathbb {R}\). We will use the parameter values

$$\begin{aligned} \Lambda =10,\quad \mu =0.001,\quad \alpha =0.01,\quad \gamma _1=0.02, \quad \gamma _2=0.07,\quad \delta =0.02. \end{aligned}$$
(6.1)

Furthermore, we consider the saturated incidence function

$$\begin{aligned} F(x,S,I) = \frac{\beta SI}{1 + a_1(x)I}, \end{aligned}$$
(6.2)

where \(\beta =0.0001\) and the parameter \(a_1\) is a positive function of the spatial variable. This allows us to study the case when the incidence function saturates more rapidly in some locations, due to the different measures taken by the population to eradicate the epidemic outbreak. It is readily shown that this form of incidence function satisfies (A1)–(A4).
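The verification can be sketched as follows, under the assumption that the hypotheses on \(F(x,S,I)\) are the spatially dependent counterparts of (A1')–(A4') stated in the Introduction. Writing \(F=IF_1\) with \(a_1(x)\ge 0\), one finds

$$\begin{aligned} F_1(x,S,I)&= \frac{\beta S}{1+a_1(x)I}, \qquad F(x,0,I)=F(x,S,0)=0, \\ \frac{\partial F}{\partial I}(x,S,I)&= \frac{\beta S}{\big (1+a_1(x)I\big )^2}, \qquad \text {so}\quad \frac{\partial F}{\partial I}(x,S,0)=\beta S>0\ \text { for }S>0, \\ \frac{\partial F_1}{\partial S}(x,S,I)&= \frac{\beta }{1+a_1(x)I}>0, \qquad \frac{\partial F_1}{\partial I}(x,S,I) = -\frac{a_1(x)\beta S}{\big (1+a_1(x)I\big )^2}\le 0, \\ F(x,S,I)&\le \beta SI \qquad \text {for all }S,I\ge 0. \end{aligned}$$

In particular, \(\frac{\partial F}{\partial I}(x,S_0(x),0)=\beta S_0(x)\) does not depend on \(a_1\).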

Fig. 1 Time variation of \(S(x,t)\), \(I(x,t)\) and \(R(x,t)\) for system (1.2) with the parameters given in (6.1), \(d_S=d_I=d_R=1\) and \(a_1(x) \equiv 0.5\)

Fig. 2 Time variation of \(S(x,t)\), \(I(x,t)\) and \(R(x,t)\) for system (1.2) with the parameters given in (6.1), \(d_S=d_I=d_R=1\) and \(a_1(x) = 0.5[1 + 0.95\sin (2\pi x/20)]\)

Throughout this section, we will use the initial conditions

$$\begin{aligned} S_0(x) = 9700,\quad I_0(x) = 10\exp (-x),\quad R_0(x) = 0\quad \text {for }x\in [0,20]. \end{aligned}$$
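A minimal way to reproduce simulations of this kind is sketched below. It assumes that system (1.2) is the time-dependent version of the steady-state problem (5.1) with zero-flux boundary conditions, and it uses constant diffusion rates, the parameters (6.1), the incidence (6.2) with \(a_1(x)=0.5[1+k\sin (2\pi x/20)]\) and the initial conditions above. The forward-Euler scheme, the grid of 40 cells, the time step and all function names are illustrative choices; the numerical method actually used for the figures is not stated in the text.

```python
import math

# Parameters from (6.1)-(6.2); the grid and time step are illustrative choices.
LAMBDA, MU, ALPHA, GAMMA1, GAMMA2, DELTA = 10.0, 0.001, 0.01, 0.02, 0.07, 0.02
BETA, LENGTH, N = 0.0001, 20.0, 40
DX = LENGTH / N
X = [(i + 0.5) * DX for i in range(N)]       # cell centres


def neumann_laplacian(u):
    """Second difference with zero-flux ends (ghost cell equals boundary cell)."""
    n = len(u)
    return [(u[max(i - 1, 0)] - 2.0 * u[i] + u[min(i + 1, n - 1)]) / DX**2
            for i in range(n)]


def step(S, I, R, a1, d=(1.0, 1.0, 1.0), dt=0.01):
    """One forward-Euler step of the reaction-diffusion system."""
    lapS, lapI, lapR = neumann_laplacian(S), neumann_laplacian(I), neumann_laplacian(R)
    S_new, I_new, R_new = [], [], []
    for i in range(N):
        F = BETA * S[i] * I[i] / (1.0 + a1[i] * I[i])   # saturated incidence (6.2)
        S_new.append(S[i] + dt * (d[0] * lapS[i] + LAMBDA - MU * S[i] - F
                                  + GAMMA1 * I[i] + DELTA * R[i]))
        I_new.append(I[i] + dt * (d[1] * lapI[i] + F
                                  - (MU + ALPHA + GAMMA1 + GAMMA2) * I[i]))
        R_new.append(R[i] + dt * (d[2] * lapR[i] + GAMMA2 * I[i]
                                  - (MU + DELTA) * R[i]))
    return S_new, I_new, R_new


def simulate(k, t_end, dt=0.01):
    """Integrate from the initial data of this section up to time t_end."""
    a1 = [0.5 * (1.0 + k * math.sin(2.0 * math.pi * x / LENGTH)) for x in X]
    S = [9700.0] * N
    I = [10.0 * math.exp(-x) for x in X]
    R = [0.0] * N
    for _ in range(int(round(t_end / dt))):
        S, I, R = step(S, I, R, a1, dt=dt)
    return S, I, R


S, I, R = simulate(k=0.95, t_end=2.0)
print(min(I), max(I))
```

The time step must satisfy the usual explicit stability restriction \(dt\le dx^2/(2\max d)\); for long runs or small grid spacing an implicit treatment of diffusion would be a more economical choice.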

6.1 Effect of heterogeneous saturation rate

We will first study the effect that heterogeneity in the saturation parameter has on the model dynamics. We define the function \(a_1:\Omega \rightarrow \mathbb {R}_+\) by \(a_1(x) = 0.5[1 + k\sin (2\pi x/20)]\), where \(k\in [0,1]\) is a number that represents the magnitude of spatial heterogeneity. The case \(k=0\) corresponds to \(a_1(x) \equiv 0.5\).
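The heterogeneous saturation coefficient and the incidence function (6.2) can be evaluated directly. The following Python sketch is illustrative only: the grid size and the names `a1`, `incidence`, `S_init`, `I_init` and `R_init` are our own choices, not part of the original simulations. It tabulates the range of \(a_1\) for the values of k used in this section and evaluates the incidence on the initial data.

```python
import numpy as np

BETA = 1e-4  # transmission coefficient beta, as stated after (6.2)
L = 20.0     # length of the domain Omega = (0, 20)


def a1(x, k):
    """Saturation coefficient a_1(x) = 0.5 * [1 + k * sin(2*pi*x/20)]."""
    return 0.5 * (1.0 + k * np.sin(2.0 * np.pi * x / L))


def incidence(x, S, I, k):
    """Saturated incidence (6.2): F = beta*S*I / (1 + a_1(x)*I)."""
    return BETA * S * I / (1.0 + a1(x, k) * I)


# Hypothetical uniform grid; the number of nodes is an illustrative choice.
x = np.linspace(0.0, L, 201)

# Initial conditions used throughout Section 6.
S_init = np.full_like(x, 9700.0)
I_init = 10.0 * np.exp(-x)
R_init = np.zeros_like(x)

for k in (0.0, 0.5, 0.8, 0.95):
    a = a1(x, k)
    F0 = incidence(x, S_init, I_init, k)
    # The bilinear bound F <= beta*S*I must hold because 1 + a_1*I >= 1.
    bound_ok = bool(np.all(F0 <= BETA * S_init * I_init + 1e-15))
    print(f"k={k:4.2f}  min a1={a.min():.4f}  max a1={a.max():.4f}  "
          f"max F at t=0: {F0.max():.4f}  bound holds: {bound_ok}")
```

For \(k<1\) the coefficient stays in \([0.5(1-k),\,0.5(1+k)]\) and is therefore strictly positive, whereas for \(k=1\) it vanishes at \(x=15\).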

Now, we fix the values of the diffusion rates as \(d_S=d_I=d_R=1\) and plot in Fig. 1 the solutions to system (1.2) when \(k=0\), that is, when the incidence rate \(F(x,S,I) = \beta SI/(1 + a_1I)\) is independent of x. In this case, the densities of susceptible, infected and recovered populations converge to positive constants.

If we increase the heterogeneity by choosing \(k=0.95\) while keeping all other parameter values as before, we obtain the solutions plotted in Fig. 2. Here, the values of \(S(x,t)\), \(I(x,t)\) and \(R(x,t)\) tend to a non-homogeneous endemic equilibrium as \(t\rightarrow \infty \).

Lastly, we plot the spatial distribution of susceptible, infected and recovered populations at \(t=1000\) for \(k=0,0.5,0.8,0.95\) in Fig. 3. We can see that greater heterogeneity in the saturation parameter results in higher values of infected and recovered populations.

Fig. 3 Spatial distribution of \(S(x,t)\), \(I(x,t)\) and \(R(x,t)\) at \(t=1000\) for system (1.2) with \(d_S=d_I=d_R=1\) and \(a_1(x) = 0.5[1 + k\sin (2\pi x/20)]\) for several values of k. The other parameter values are given in (6.1)

Fig. 4 Spatial distribution of \(S(x,t)\), \(I(x,t)\) and \(R(x,t)\) at \(t=1000\) for system (1.2) with the parameters given in (6.1), \(a_1(x) = 0.5[1 + 0.95\sin (2\pi x/20)]\) and several values for the diffusion rates. (a) \(d_S=D\), \(d_I=1\), \(d_R=D\), where D varies from 1 to 0.01. (b) \(d_S=d_I=d_R=D\), where D varies from 1 to 0.01

6.2 Effect of varying the diffusion rates

We now consider the effects of using different diffusion rates for system (1.2) while keeping all other parameter values fixed. Here, we assume that \(a_1(x) = 0.5[1 + 0.95\sin (2\pi x/20)]\) and that the other parameters are taken as in (6.1).

Suppose first that the diffusion rate of the infected population is fixed as \(d_I=1\), while the diffusion rates of the susceptible and recovered populations are both equal to some constant D. Figure 4a depicts the distribution of S, I and R at \(t=1000\) for several values of D. The figure shows that reducing the diffusion rates \(d_S=d_R=D\) lowers the infected population levels.

Next, we run simulations assuming that the diffusion rates \(d_S\), \(d_I\) and \(d_R\) are all equal to D. In this case, the spatial distribution of solutions is shown in Fig. 4b. Our simulations show that reducing the diffusion rate in all three subpopulations results in greater heterogeneity in the distribution of the infected; however, the density of the infected population actually increases at certain locations for lower values of D. Hence, restricting the mobility of all individuals does not necessarily reduce the total number of infections.
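The distinction between local density and total infections can be made concrete by integrating the infected density over \(\Omega \). The following Python sketch approximates this integral with the trapezoidal rule. The function name `total_population`, the grid and the two profiles are hypothetical choices for illustration; they are not the simulated solutions shown in Fig. 4.

```python
import numpy as np


def total_population(x, density):
    """Approximate the integral of `density` over the grid `x` (trapezoidal rule)."""
    x = np.asarray(x, dtype=float)
    density = np.asarray(density, dtype=float)
    return float(np.sum(0.5 * (density[1:] + density[:-1]) * np.diff(x)))


# Hypothetical profiles on Omega = (0, 20), for illustration only.
x = np.linspace(0.0, 20.0, 401)
flat_profile = np.full_like(x, 3.0)                              # homogeneous density
peaked_profile = 3.0 + 2.0 * np.sin(2.0 * np.pi * x / 20.0)      # heterogeneous density

print("total (flat):   ", total_population(x, flat_profile))
print("total (peaked): ", total_population(x, peaked_profile))
print("peak density higher in heterogeneous profile:",
      bool(peaked_profile.max() > flat_profile.max()))
```

In this artificial example the heterogeneous profile has a higher peak density but the same integral, which is why a comparison of total infections requires the integrated quantity rather than pointwise values alone.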

6.3 Simulations of bifurcation when varying the transfer rate from infected to susceptible

We will now simulate the bifurcation dynamics of our model when the parameter \(\gamma _1\) is varied. Here, we fix the parameter values

$$\begin{aligned} \begin{aligned}&\Lambda =10,\quad \mu =0.001,\quad \alpha =0.01, \quad \gamma _2=0.07,\quad \delta =0.02,\quad d_S=d_I=d_R=1, \\&F(x,S,I) = \frac{\beta SI}{1 + a_1(x)I},\quad a_1(x) = 0.5\left[ 1 + 0.95\sin \left( \frac{2\pi x}{20}\right) \right] \end{aligned} \end{aligned}$$
(6.3)

and let \(\gamma _1\) vary from 0 to 0.8.

Using the same initial conditions as before, we obtain the distribution of susceptible, infected and recovered individuals at \(t=4000\) shown in Fig. 5. We can see that for \(\gamma _1\in [0,0.6]\), the solution tends to a positive endemic steady state, which corresponds to the case when \(\mathcal {R}_0>1\). As we increase the value of \(\gamma _1\), the infected population decreases. The basic reproduction number crosses unity at a certain value \(\gamma _1^*\in (0.6,0.8)\), and there is no endemic equilibrium for \(\gamma _1>\gamma _1^*\). Figure 5 shows that the solutions of the model tend to the disease-free steady state \(E_0=(10^4,0,0)\) when \(\gamma _1=0.8\).

Fig. 5 Spatial distribution of \(S(x,t)\), \(I(x,t)\) and \(R(x,t)\) at \(t=4000\) for system (1.2) with the parameter values given in (6.3) and several values for \(\gamma _1\)

7 Conclusions

We proposed a diffusive SIRS epidemic model that generalizes several previously studied models in the literature, such as [9, 10, 11, 18]. In particular, we extended the results obtained by Yang et al. in [18] by considering a model with three different diffusion rates, \(d_S(x)\), \(d_I(x)\) and \(d_R(x)\), which may vary depending on the location of individuals. Moreover, our model assumes that the incidence function takes the general form \(F(x,S,I)\), which includes many different types of nonlinear or non-monotone functions and allows for spatial heterogeneity.

We established the threshold dynamics for model (1.2) with respect to the basic reproduction number: the disease-free steady state is globally asymptotically stable when \(\mathcal {R}_0<1\), while the disease persists and an endemic equilibrium appears when \(\mathcal {R}_0>1\).

Furthermore, we performed a bifurcation analysis for model (1.2) in the case when the diffusion coefficient of the infected population and the rate of transfer from the infected to the susceptible class are constant. We used the theory of Crandall and Rabinowitz [35, 37, 38] to determine the type of bifurcation that occurs when the basic reproduction number crosses unity. From Theorems 5.2 and 5.3, we can infer that the model always undergoes a forward bifurcation at \(\mathcal {R}_0=1\), and the existence of endemic equilibria is completely ruled out when \(\mathcal {R}_0\) is less than one.

We also carried out some simulations to illustrate the effects of heterogeneity on the dynamics of model (1.2). We used, as an example, the incidence rate of the form \(F(x,S,I) = \beta SI/[1 + a_1(x)I]\), which does not meet the assumptions considered in [18] but is covered by our general model. Our examples show that a larger spatial variation in the saturation parameter of the incidence rate may increase the number of infections in the population.