1 Introduction

Since the pioneering work of Kermack and McKendrick (1927), the SIR model has been very popular in epidemiology as the basic model for infectious diseases with direct transmission (e.g., Anderson and May 1991). It has regained great importance with the recent coronavirus pandemic. While early models were not spatialized, the importance of accounting for spatial heterogeneity has often been reported in the literature (see, e.g., Angulo et al. 1979; Sattenspiel and Dietz 1995; Keeling et al. 2004; Keeling and Rohani 2007; Kelly et al. 2016; Li et al. 2021). However, different mechanisms come into play to explain the spatial spread of a disease. Although diffusion is a natural process for describing the local propagation of an infectious agent in a population, which leads to models based on partial differential equations (Murray 2003), it is not well suited to describing long-distance spread. In particular, transportation between cities is a major source of rapid spread among nonhomogeneous populations (Arino and van den Driessche 2003; Arino et al. 2007; Takeuchi et al. 2007; Liu and Stechlinski 2013; Mpolya et al. 2014; Chen et al. 2014; Yin et al. 2020; Tocto-Erazo et al. 2021; Lipshtat et al. 2021). Metapopulation or multi-patch models are then more appropriate for describing the spatial characteristics of the propagation (Wang and Mulone 2003; Wang and Zhao 2004; Arino and van den Driessche 2006; Gao 2007; Arino 2009), as is already well established in ecology (Hanski 1999; MacArthur 2001). These models require a precise description of the movements between patches, which are most of the time assumed to be linear and thus encoded in a connection matrix (Arino and van den Driessche 2006; Arino 2009). Typically, one obtains a system of ordinary differential equations on a graph, which couples the movement dynamics with the epidemiological dynamics.

For diseases spreading among human populations living in different cities, commuters (individuals living in one city who travel regularly for short periods to a neighboring city and then return home) play a crucial role in the propagation of the disease among territories (Keeling and Rohani 2002; Keeling et al. 2004; Keeling and Rohani 2007; Mpolya et al. 2014; Yin et al. 2020). Such coupling between patches has already been considered in the literature, distinguishing, among the population \(N_i\) attached to a city i, the subpopulation \(N_{ii}\) present in its permanent housing from the subpopulations \(N_{ij}\) temporarily present in another city \(j \ne i\) [these can also be seen as multi-group models, as in Clancy (1996), Guo et al. (2006), Iggidr et al. (2012)]. However, such models explicitly assume that the whole population living in a given city can potentially commute to another one. We believe that this is not always fully realistic and that a subpopulation that never (or very rarely) moves to another city should be distinguished from the subpopulation that visits another city on a regular basis. Therefore, we consider an extension of such models, which explicitly takes into consideration two kinds of movement: an Eulerian one, which describes the flow between patches that mixes populations, and a Lagrangian one, which assigns home locations to individuals, as described in the more general framework of Citron et al. (2021). The study of this extension, which to our knowledge has not yet been analyzed analytically in the literature, and of how it impacts the spread of the disease, is the primary objective of the present work. For this purpose, we establish an analytical expression of the reproduction number [the epidemic threshold previously introduced and analyzed in Diekmann et al. (1990), van den Driessche and Watmough (2002), Diekmann et al. (2007), Dhirasakdanon et al. (2007)] for the two-patch case (this expression is also valid in the particular case where the whole populations travel, for which the exact expression of the reproduction number has not yet been provided in the literature).

We also aim to account for heterogeneity among territories, when disease transmission differs from one city to another. Typically, non-pharmaceutical interventions (such as physical distancing) may be applied with different strengths in each city, leading to distinct transmission rates. When one territory, taken in isolation, has a higher reproduction number than the other, it can be considered a core group in the epidemiological terminology (Hadeler and Castillo-Chavez 1995; Brunham 1997), and commuters then contribute to spreading the epidemic in both territories. We aim to analyze more precisely how the proportions of commuters in each city can increase or decrease the overall reproduction number. Intuitively, one may believe that the best way to reduce the spread is to encourage commuters from the city with the lowest transmission rate not to travel to the other city and, conversely, to encourage commuters from the other city to spend as much time as possible in the safer city. We shall see that this is not always true. The second objective of the present work is thus to study the minimization of the epidemic threshold of the two-patch model with respect to these proportions, depending on the commuting rates. This analysis can potentially support decision making to prevent epidemic outbreaks [as in Knipl (2016), for instance].

The paper is organized as follows. In the next section, we present the complete model in dimension 18 and give some preliminaries. Sect. 3 is devoted to the analysis of the asymptotic behavior of the solutions of the model. We state and prove an explicit expression of the reproduction number, introducing four relevant quantities \(q_{ij}\) (\(i, j = 1,2\)). In a corollary, we also give an alternative way of computing it, which is useful in the sequel. In Sect. 4, we study the minimization of the reproduction number with respect to the proportions of commuters in each patch. Finally, Sect. 5 gives a numerical illustration of the results, considering two territories with intrinsic basic reproduction numbers lower and higher than one. We depict the relative sizes of the permanently resident populations that can avoid the outbreak of the epidemic depending on the commuting rates, and discuss the various cases. We end with a conclusion.

2 The Model

We follow the modeling of commuters proposed in Keeling and Rohani (2002) between two patches (such as cities or territories), but here we consider in addition that a part of the population in each patch does not commute (the permanently resident subpopulation). We consider populations of size \(N_i\) whose home belongs to a patch \(i \in \{1,2\}\), structured in three groups:

  1. permanently resident, being all the time in patch i, whose population size is denoted \(N_{ir}\),

  2. commuters to patch j, but located in patch i at time t, of population size denoted \(N_{ii}\),

  3. commuters to patch j and located in patch j at time t, of population size denoted \(N_{ij}\).

We denote by \(N_{ic}=N_{ii}+N_{ij}\) the size of the total population of commuters with their home in patch i. These individuals commute to patch j at a rate \(\lambda _i\), with a return rate \(\mu _i\). For each group \(g \in \{ir,ii,ij\}\), we denote by \(S_g\), \(I_g\), \(R_g\) the sizes of the susceptible, infected and recovered subpopulations. This modeling implicitly assumes that at any time no individual is outside the territories, that is, that traveling time is negligible. This assumption is therefore only valid for adjoining territories with short transportation times (by train, road, etc.). It would not be valid between distant territories connected, for example, by boat or plane with non-negligible crossing times. In this case, it would be necessary to consider additional nodes of in-transit populations, as has been done for example in Colizza et al. (2006), Patil et al. (2021) or Ruan et al. (2015), where distances between nodes are explicitly taken into consideration. This would of course complicate the model and its study.

We consider the SIR model assuming that the recovery parameter \(\gamma \) is identical everywhere, while the transmission rate \(\beta _i\) depends on the patch i but is identical across groups. Typically, lifestyle and hygienic measures may differ between two cities, implying different values of \(\beta \). Moreover, if two cities are on both sides of the border between two countries, the strength of non-pharmaceutical interventions is likely to be different, as was the case, for instance, between European countries during the SARS-CoV-2 outbreak. The model is written as follows (with \(i \ne j\) in \(\{1,2\}\)).

$$\begin{aligned} \dot{S}_{ir}&=-\beta _i S_{ir}\frac{I_{ir}+I_{ii}+I_{ji}}{N_{ir}+N_{ii}+N_{ji}},\\ \dot{I}_{ir}&=\beta _i S_{ir}\frac{I_{ir}+I_{ii}+I_{ji}}{N_{ir}+N_{ii}+N_{ji}}-\gamma I_{ir},\\ \dot{R}_{ir}&=\gamma I_{ir},\\ \dot{S}_{ii}&=-\beta _i S_{ii}\frac{I_{ir}+I_{ii}+I_{ji}}{N_{ir}+N_{ii}+N_{ji}}-\lambda _i S_{ii}+\mu _i S_{ij},\\ \dot{I}_{ii}&=\beta _i S_{ii}\frac{I_{ir}+I_{ii}+I_{ji}}{N_{ir}+N_{ii}+N_{ji}}-\gamma I_{ii}-\lambda _i I_{ii}+\mu _i I_{ij},\\ \dot{R}_{ii}&=\gamma I_{ii}-\lambda _i R_{ii}+\mu _i R_{ij},\\ \dot{S}_{ij}&=-\beta _j S_{ij}\frac{I_{jr}+I_{jj}+I_{ij}}{N_{jr}+N_{jj}+N_{ij}}+\lambda _i S_{ii}-\mu _i S_{ij},\\ \dot{I}_{ij}&=\beta _j S_{ij}\frac{I_{jr}+I_{jj}+I_{ij}}{N_{jr}+N_{jj}+N_{ij}}-\gamma I_{ij}+\lambda _i I_{ii}-\mu _i I_{ij},\\ \dot{R}_{ij}&=\gamma I_{ij}+\lambda _i R_{ii}-\mu _i R_{ij} \end{aligned}$$

Parameters \(\lambda _i\), \(\mu _i\) represent switching rates of populations i, leaving home and returning. This modeling implicitly assumes that movements between territories are not synchronized, as often considered in multi-city models (see, e.g.,  Sattenspiel and Dietz 1995; Keeling and Rohani 2002; Arino and van den Driessche 2003; Wang and Mulone 2003; Wang and Zhao 2004; Keeling et al. 2004; Arino and van den Driessche 2006; Takeuchi et al. 2007; Keeling and Rohani 2007; Liu and Stechlinski 2013; Chen et al. 2014). Note that we also consider, in all generality, that commuting is asymmetrical (i.e., \(\lambda _1\) and \(\lambda _2\) may be different, as well as \(\mu _1\), \(\mu _2\)). Typically, each territory may offer different activities that attract commuters from the other territory and thus different mean sojourn times. One can check that the population sizes \(N_{ir}\) and \(N_{ic}\) are constant. Moreover \(N_{ii}\), \(N_{ij}\) fulfill the system of equations

$$\begin{aligned} \left\{ \begin{array}{l} \dot{N}_{ii} = -\lambda _i N_{ii} + \mu _i N_{ij},\\ \dot{N}_{ij} = \lambda _i N_{ii} - \mu _i N_{ij} \end{array}\right. \end{aligned}$$

whose solutions verify

$$\begin{aligned} \lim _{t \rightarrow +\infty } N_{ii}(t)=\bar{N}_{ii}:=\frac{\mu _i}{\lambda _i+\mu _i}N_{ic}, \quad \lim _{t \rightarrow +\infty } N_{ij}(t)=\bar{N}_{ij}:=\frac{\lambda _i}{\lambda _i+\mu _i}N_{ic} \end{aligned}$$
(1)

We shall assume that populations are already balanced at initial time, i.e., that one has \(N_{ii}={\bar{N}}_{ii}\), \(N_{ij}={\bar{N}}_{ij}\) (constant). For simplicity, we shall drop the notation \(\bar{ }\,\) in the following and denote

$$\begin{aligned} N_{ip}:=N_{ir}+N_{ii}+N_{ji} \end{aligned}$$

which represents the (constant) size of the total population present in patch i.
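To make the structure of the 18 equations concrete, here is a minimal simulation sketch in plain Python. It is only an illustration under stated assumptions: the function names (`initial_state`, `rhs`, `euler`), the group labels `"r"`, `"h"`, `"a"` (resident, commuter at home, commuter away), the explicit Euler scheme and the seeding of one infected resident in patch 1 are all choices made here, not taken from the paper. Patches are indexed 0 and 1 in the code.

```python
GROUPS = ("r", "h", "a")  # permanent residents, commuters at home, commuters away


def initial_state(n_r, n_c, lam, mu, seed=1.0):
    """Disease-free state with balanced commuters (equation (1)) plus a small seed."""
    state = {}
    for i in (0, 1):  # index 0 is patch 1, index 1 is patch 2
        n_home = n_c[i] * mu[i] / (lam[i] + mu[i])
        n_away = n_c[i] * lam[i] / (lam[i] + mu[i])
        for g, size in zip(GROUPS, (n_r[i], n_home, n_away)):
            state[i, g] = [size, 0.0, 0.0]  # [S, I, R]
    state[0, "r"][0] -= seed  # move `seed` residents of patch 1 from S to I
    state[0, "r"][1] += seed
    return state


def rhs(state, beta, gamma, lam, mu):
    """Right-hand side of the 18 equations of the model."""
    foi = []  # force of infection beta_p * I_present / N_present in each patch p
    for p in (0, 1):
        here = (state[p, "r"], state[p, "h"], state[1 - p, "a"])
        foi.append(beta[p] * sum(g[1] for g in here) / sum(sum(g) for g in here))
    d = {}
    for i in (0, 1):
        j = 1 - i
        s, x, r = state[i, "r"]
        d[i, "r"] = [-foi[i] * s, foi[i] * s - gamma * x, gamma * x]
        sh, xh, rh = state[i, "h"]
        sa, xa, ra = state[i, "a"]
        d[i, "h"] = [-foi[i] * sh - lam[i] * sh + mu[i] * sa,
                     foi[i] * sh - gamma * xh - lam[i] * xh + mu[i] * xa,
                     gamma * xh - lam[i] * rh + mu[i] * ra]
        d[i, "a"] = [-foi[j] * sa + lam[i] * sh - mu[i] * sa,
                     foi[j] * sa - gamma * xa + lam[i] * xh - mu[i] * xa,
                     gamma * xa + lam[i] * rh - mu[i] * ra]
    return d


def euler(state, dt, steps, **params):
    """Explicit Euler integration; returns the state after `steps` steps."""
    for _ in range(steps):
        d = rhs(state, **params)
        state = {k: [v + dt * dv for v, dv in zip(state[k], d[k])] for k in state}
    return state
```

A call such as `euler(initial_state((600.0, 500.0), (400.0, 500.0), (1.0, 2.0), (3.0, 1.0)), 0.01, 1000, beta=(0.3, 0.6), gamma=0.2, lam=(1.0, 2.0), mu=(3.0, 1.0))` uses purely illustrative parameter values. Because the commuters start balanced, the sizes of the home and away groups stay constant along the run, as stated above.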

3 The Epidemic Threshold

We denote the vectors

$$\begin{aligned} I=(I_{1r},I_{11},I_{12},I_{2r},I_{22},I_{21})^\top , \quad S=(S_{1r},S_{11},S_{12},S_{2r},S_{22},S_{21})^\top \end{aligned}$$

and consider the state vector

$$\begin{aligned} X=\left[ \begin{array}{c} I\\ S\end{array} \right] \end{aligned}$$

which belongs to the invariant domain

$$\begin{aligned} {{\mathcal {D}}} : = \{ X \in {\mathbb {R}}_+^{12} ; \; {\mathbb {M}}X\le {\mathbb {N}}\} \end{aligned}$$

where \({\mathbb {N}}\) is the vector

$$\begin{aligned} {\mathbb {N}} =(N_{1r},N_{11},N_{12},N_{2r},N_{22},N_{21})^\top \end{aligned}$$

and \({\mathbb {M}}\) is the \(6\times 12\) matrix obtained by concatenating two copies of the identity matrix \({\mathbb {I}}_6\) of dimension \(6\times 6\):

$$\begin{aligned} {\mathbb {M}}=[{\mathbb {I}}_6 , {\mathbb {I}}_6] \end{aligned}$$

The disease free equilibrium is defined as

$$\begin{aligned} X^\star =\left[ \begin{array}{c}0\\ {\mathbb {N}}\end{array}\right] \end{aligned}$$

Let \({{\mathcal {R}}}_i\) be the intrinsic reproduction number in the patch i (i.e., when there is no connection between patches), that is

$$\begin{aligned} {{\mathcal {R}}}_i:=\frac{\beta _i}{\gamma } . \end{aligned}$$

We now give an explicit expression of the epidemic threshold when the two patches communicate via commuters.

Proposition 1

Let

$$\begin{aligned} {{\mathcal {R}}}_{1,2}:=\frac{ q_{11} + q_{22} + \sqrt{ (q_{22} - q_{11})^2 + 4q_{12}q_{21}}}{2} \end{aligned}$$
(2)

where

$$\begin{aligned} \left\{ \begin{array}{l} q_{11}= {{\mathcal {R}}}_1 \left( \frac{N_{1r}}{N_{1p}}+ \frac{N_{11}}{N_{1p}}\frac{\gamma +\mu _1}{\gamma +\lambda _1+\mu _1}+ \frac{N_{21}}{N_{1p}}\frac{\gamma +\lambda _2}{\gamma +\lambda _2+\mu _2}\right) \\ q_{22}= {{\mathcal {R}}}_2 \left( \frac{N_{2r}}{N_{2p}}+ \frac{N_{22}}{N_{2p}}\frac{\gamma +\mu _2}{\gamma +\lambda _2+\mu _2}+ \frac{N_{12}}{N_{2p}}\frac{\gamma +\lambda _1}{\gamma +\lambda _1+\mu _1}\right) \\ q_{21}= {{\mathcal {R}}}_1 \left( \frac{N_{11}}{N_{1p}}\frac{\lambda _1}{\gamma +\lambda _1+\mu _1}+ \frac{N_{21}}{N_{1p}}\frac{\mu _2}{\gamma +\lambda _2+\mu _2}\right) \\ q_{12}= {{\mathcal {R}}}_2 \left( \frac{N_{12}}{N_{2p}}\frac{\mu _1}{\gamma +\lambda _1+\mu _1}+ \frac{N_{22}}{N_{2p}}\frac{\lambda _2}{\gamma +\lambda _2+\mu _2}\right) \end{array}\right. \end{aligned}$$
(3)

Then, one has the following properties.

  1. If \({{\mathcal {R}}}_{1,2}>1\), then \(X^\star \) is unstable.

  2. If \({{\mathcal {R}}}_{1,2}<1\), then \(X^\star \) is exponentially stable with respect to the variable I (see Footnote 1).

  3. If \({{\mathcal {R}}}_1={{\mathcal {R}}}_2:={{\mathcal {R}}}\), then \({{\mathcal {R}}}_{1,2}={{\mathcal {R}}}\).

Proof

Write the dynamics of X as \(\dot{X}=f(X)\). The Jacobian matrix J of f at \(X^\star \) is of the form

$$\begin{aligned} J=\left[ \begin{array}{cc} A &{} 0\\ \star &{} B \end{array}\right] \text{ with } A=F-V \end{aligned}$$

where

$$\begin{aligned} F= & {} \left[ \begin{array}{cccccc} \beta _1\frac{N_{1r}}{N_{1p}}&{}\beta _1\frac{N_{1r}}{N_{1p}}&{}0&{}0&{}0&{}\beta _1\frac{N_{1r}}{N_{1p}}\\ \beta _1\frac{N_{11}}{N_{1p}}&{}\beta _1\frac{N_{11}}{N_{1p}}&{}0&{}0&{}0&{}\beta _1\frac{N_{11}}{N_{1p}}\\ 0&{}0&{}\beta _2\frac{N_{12}}{N_{2p}}&{}\beta _2\frac{N_{12}}{N_{2p}}&{}\beta _2\frac{N_{12}}{N_{2p}}&{}0\\ 0&{}0&{}\beta _2\frac{N_{2r}}{N_{2p}}&{}\beta _2\frac{N_{2r}}{N_{2p}}&{}\beta _2\frac{N_{2r}}{N_{2p}}&{}0\\ 0&{}0&{}\beta _2\frac{N_{22}}{N_{2p}}&{}\beta _2\frac{N_{22}}{N_{2p}}&{}\beta _2\frac{N_{22}}{N_{2p}}&{}0\\ \beta _1\frac{N_{21}}{N_{1p}}&{}\beta _1\frac{N_{21}}{N_{1p}}&{}0&{}0&{}0&{}\beta _1\frac{N_{21}}{N_{1p}}\end{array} \right] , \\ V= & {} \left[ \begin{array}{cccccc} \gamma &{}0&{}0&{}0&{}0&{}0\\ 0&{} \gamma +\lambda _1&{}-\mu _1&{}0&{}0&{}0\\ 0&{}-\lambda _1&{}\gamma +\mu _1&{}0&{}0 &{}0\\ 0&{}0&{}0&{}\gamma &{}0&{}0\\ 0&{}0&{}0&{}0&{}\gamma +\lambda _2&{}-\mu _2\\ 0&{}0&{}0&{}0&{}-\lambda _2&{}\gamma +\mu _2 \end{array} \right] \end{aligned}$$

and

$$\begin{aligned} B=\left[ \begin{array}{cccccc} 0&{}0&{}0&{}0&{}0&{}0\\ 0&{}-\lambda _1&{}\mu _1&{}0&{}0&{}0\\ 0&{}\lambda _1&{}-\mu _1&{}0&{}0&{}0\\ 0&{}0&{}0&{}0&{}0&{}0\\ 0&{}0&{}0&{}0&{}-\lambda _2&{}\mu _2\\ 0&{}0&{}0&{}0&{}\lambda _2&{}-\mu _2 \end{array} \right] \end{aligned}$$

Note that F is a nonnegative matrix and V is a non-singular M-matrix. We recall [see, for instance, van den Driessche and Watmough (2002)] that one has the property

$$\begin{aligned} \max Re(Spec(A))\underset{\displaystyle>}{<}0 \Longleftrightarrow \rho (FV^{-1})\underset{\displaystyle >}{<}1 \end{aligned}$$

The computation of the matrix \(M:=FV^{-1}\) gives the following expression

$$\begin{aligned} M=\left[ \begin{array}{cccccc}
{{\mathcal {R}}}_1\frac{N_{1r}}{N_{1p}}&{} {{\mathcal {R}}}_1\frac{N_{1r}(\gamma +\mu _1)}{N_{1p}(\gamma +\lambda _1+\mu _1)}&{} {{\mathcal {R}}}_1\frac{N_{1r}\mu _1}{N_{1p}(\gamma +\lambda _1+\mu _1)}&{} 0&{} {{\mathcal {R}}}_1\frac{N_{1r}\lambda _2}{N_{1p}(\gamma +\lambda _2+\mu _2)}&{} {{\mathcal {R}}}_1\frac{N_{1r}(\gamma +\lambda _2)}{N_{1p}(\gamma +\lambda _2+\mu _2)}\\
{{\mathcal {R}}}_1\frac{N_{11}}{N_{1p}}&{} {{\mathcal {R}}}_1\frac{N_{11}(\gamma +\mu _1)}{N_{1p}(\gamma +\lambda _1+\mu _1)}&{} {{\mathcal {R}}}_1\frac{N_{11}\mu _1}{N_{1p}(\gamma +\lambda _1+\mu _1)}&{} 0&{} {{\mathcal {R}}}_1\frac{N_{11}\lambda _2}{N_{1p}(\gamma +\lambda _2+\mu _2)}&{} {{\mathcal {R}}}_1\frac{N_{11}(\gamma +\lambda _2)}{N_{1p}(\gamma +\lambda _2+\mu _2)}\\
0&{} {{\mathcal {R}}}_2\frac{N_{12}\lambda _1}{N_{2p}(\gamma +\lambda _1+\mu _1)}&{} {{\mathcal {R}}}_2\frac{N_{12}(\gamma +\lambda _1)}{N_{2p}(\gamma +\lambda _1+\mu _1)}&{} {{\mathcal {R}}}_2\frac{N_{12}}{N_{2p}}&{} {{\mathcal {R}}}_2\frac{N_{12}(\gamma +\mu _2)}{N_{2p}(\gamma +\lambda _2+\mu _2)}&{} {{\mathcal {R}}}_2\frac{N_{12}\mu _2}{N_{2p}(\gamma +\lambda _2+\mu _2)}\\
0&{} {{\mathcal {R}}}_2\frac{N_{2r}\lambda _1}{N_{2p}(\gamma +\lambda _1+\mu _1)}&{} {{\mathcal {R}}}_2\frac{N_{2r}(\gamma +\lambda _1)}{N_{2p}(\gamma +\lambda _1+\mu _1)}&{} {{\mathcal {R}}}_2\frac{N_{2r}}{N_{2p}}&{} {{\mathcal {R}}}_2\frac{N_{2r}(\gamma +\mu _2)}{N_{2p}(\gamma +\lambda _2+\mu _2)}&{} {{\mathcal {R}}}_2\frac{N_{2r}\mu _2}{N_{2p}(\gamma +\lambda _2+\mu _2)}\\
0&{} {{\mathcal {R}}}_2\frac{N_{22}\lambda _1}{N_{2p}(\gamma +\lambda _1+\mu _1)}&{} {{\mathcal {R}}}_2\frac{N_{22}(\gamma +\lambda _1)}{N_{2p}(\gamma +\lambda _1+\mu _1)}&{} {{\mathcal {R}}}_2\frac{N_{22}}{N_{2p}}&{} {{\mathcal {R}}}_2\frac{N_{22}(\gamma +\mu _2)}{N_{2p}(\gamma +\lambda _2+\mu _2)}&{} {{\mathcal {R}}}_2\frac{N_{22}\mu _2}{N_{2p}(\gamma +\lambda _2+\mu _2)}\\
{{\mathcal {R}}}_1\frac{N_{21}}{N_{1p}}&{} {{\mathcal {R}}}_1\frac{N_{21}(\gamma +\mu _1)}{N_{1p}(\gamma +\lambda _1+\mu _1)}&{} {{\mathcal {R}}}_1\frac{N_{21}\mu _1}{N_{1p}(\gamma +\lambda _1+\mu _1)}&{} 0&{} {{\mathcal {R}}}_1\frac{N_{21}\lambda _2}{N_{1p}(\gamma +\lambda _2+\mu _2)}&{} {{\mathcal {R}}}_1\frac{N_{21}(\gamma +\lambda _2)}{N_{1p}(\gamma +\lambda _2+\mu _2)}
\end{array} \right] \end{aligned}$$

Let us consider the diagonal matrix

$$\begin{aligned} D:=\left[ \begin{array}{cccccc} {{\mathcal {R}}}_1\frac{N_{1r}}{N_{1p}} &{} \\ &{} {{\mathcal {R}}}_1\frac{N_{11}}{N_{1p}} \\ &{} &{} {{\mathcal {R}}}_2\frac{N_{12}}{N_{2p}}\\ &{} &{} &{} {{\mathcal {R}}}_2\frac{N_{2r}}{N_{2p}}\\ &{} &{} &{} &{} {{\mathcal {R}}}_2\frac{N_{22}}{N_{2p}}\\ &{} &{} &{} &{} &{} {{\mathcal {R}}}_1\frac{N_{21}}{N_{1p}} \end{array}\right] \end{aligned}$$

and the matrix \(Q=D^{-1} M D\), whose computation gives the expression

$$\begin{aligned} Q=\left[ \begin{array}{cccccc}
{{\mathcal {R}}}_1\frac{N_{1r}}{N_{1p}}&{} {{\mathcal {R}}}_1\frac{N_{11}(\gamma +\mu _1)}{N_{1p}(\gamma +\lambda _1+\mu _1)}&{} {{\mathcal {R}}}_2\frac{N_{12}\mu _1}{N_{2p}(\gamma +\lambda _1+\mu _1)}&{} 0&{} {{\mathcal {R}}}_2\frac{N_{22}\lambda _2}{N_{2p}(\gamma +\lambda _2+\mu _2)}&{} {{\mathcal {R}}}_1\frac{N_{21}(\gamma +\lambda _2)}{N_{1p}(\gamma +\lambda _2+\mu _2)}\\
{{\mathcal {R}}}_1\frac{N_{1r}}{N_{1p}}&{} {{\mathcal {R}}}_1\frac{N_{11}(\gamma +\mu _1)}{N_{1p}(\gamma +\lambda _1+\mu _1)}&{} {{\mathcal {R}}}_2\frac{N_{12}\mu _1}{N_{2p}(\gamma +\lambda _1+\mu _1)}&{} 0&{} {{\mathcal {R}}}_2\frac{N_{22}\lambda _2}{N_{2p}(\gamma +\lambda _2+\mu _2)}&{} {{\mathcal {R}}}_1\frac{N_{21}(\gamma +\lambda _2)}{N_{1p}(\gamma +\lambda _2+\mu _2)}\\
0&{} {{\mathcal {R}}}_1\frac{N_{11}\lambda _1}{N_{1p}(\gamma +\lambda _1+\mu _1)}&{} {{\mathcal {R}}}_2\frac{N_{12}(\gamma +\lambda _1)}{N_{2p}(\gamma +\lambda _1+\mu _1)}&{} {{\mathcal {R}}}_2\frac{N_{2r}}{N_{2p}}&{} {{\mathcal {R}}}_2\frac{N_{22}(\gamma +\mu _2)}{N_{2p}(\gamma +\lambda _2+\mu _2)}&{} {{\mathcal {R}}}_1\frac{N_{21}\mu _2}{N_{1p}(\gamma +\lambda _2+\mu _2)}\\
0&{} {{\mathcal {R}}}_1\frac{N_{11}\lambda _1}{N_{1p}(\gamma +\lambda _1+\mu _1)}&{} {{\mathcal {R}}}_2\frac{N_{12}(\gamma +\lambda _1)}{N_{2p}(\gamma +\lambda _1+\mu _1)}&{} {{\mathcal {R}}}_2\frac{N_{2r}}{N_{2p}}&{} {{\mathcal {R}}}_2\frac{N_{22}(\gamma +\mu _2)}{N_{2p}(\gamma +\lambda _2+\mu _2)}&{} {{\mathcal {R}}}_1\frac{N_{21}\mu _2}{N_{1p}(\gamma +\lambda _2+\mu _2)}\\
0&{} {{\mathcal {R}}}_1\frac{N_{11}\lambda _1}{N_{1p}(\gamma +\lambda _1+\mu _1)}&{} {{\mathcal {R}}}_2\frac{N_{12}(\gamma +\lambda _1)}{N_{2p}(\gamma +\lambda _1+\mu _1)}&{} {{\mathcal {R}}}_2\frac{N_{2r}}{N_{2p}}&{} {{\mathcal {R}}}_2\frac{N_{22}(\gamma +\mu _2)}{N_{2p}(\gamma +\lambda _2+\mu _2)}&{} {{\mathcal {R}}}_1\frac{N_{21}\mu _2}{N_{1p}(\gamma +\lambda _2+\mu _2)}\\
{{\mathcal {R}}}_1\frac{N_{1r}}{N_{1p}}&{} {{\mathcal {R}}}_1\frac{N_{11}(\gamma +\mu _1)}{N_{1p}(\gamma +\lambda _1+\mu _1)}&{} {{\mathcal {R}}}_2\frac{N_{12}\mu _1}{N_{2p}(\gamma +\lambda _1+\mu _1)}&{} 0&{} {{\mathcal {R}}}_2\frac{N_{22}\lambda _2}{N_{2p}(\gamma +\lambda _2+\mu _2)}&{} {{\mathcal {R}}}_1\frac{N_{21}(\gamma +\lambda _2)}{N_{1p}(\gamma +\lambda _2+\mu _2)}
\end{array} \right] \end{aligned}$$

The matrix Q is nonnegative and irreducible. By the Perron–Frobenius theorem [see, for instance, Berman and Plemmons (1994)], this matrix admits a unique positive eigenvector (up to a scalar multiplication), which corresponds to the simple (positive) eigenvalue \(\ell =\rho (Q)=\rho (M)\).

Note that the rank of Q is two. We set

$$\begin{aligned} Y=(1,1,0,0,0,1)^\top , \quad Z=(0,0,1,1,1,0)^\top \end{aligned}$$

and define \(Q_Y\), \(Q_Z\) as the first and third rows, respectively, of the matrix Q. Then, for any vector \(X \in {\mathbb {R}}^6\), one has \(QX=(Q_Y X)Y+(Q_Z X)Z\). We look for a positive eigenvector X of the form \(X=\alpha Y + (1-\alpha ) Z\) with \(\alpha \in (0,1)\). One then has

$$\begin{aligned} QX&= \alpha QY + (1-\alpha ) QZ\nonumber \\&= \alpha \big ( (Q_Y Y)Y+(Q_Z Y)Z \big ) + (1-\alpha ) \big ( (Q_Y Z)Y+(Q_Z Z)Z \big )\nonumber \\&=\big ( \alpha (Q_Y Y) + (1-\alpha ) (Q_Y Z) \big ) Y + \big ( \alpha (Q_Z Y) + (1-\alpha ) (Q_Z Z) \big ) Z \end{aligned}$$
(4)

On the other hand, as X is an eigenvector, one has

$$\begin{aligned} QX=\ell X = \alpha \ell Y + (1-\alpha )\ell Z \end{aligned}$$
(5)

The vectors Y and Z being orthogonal, one obtains from (4)–(5) the conditions

$$\begin{aligned} \left\{ \begin{array}{l} \alpha Q_Y Y + (1-\alpha ) Q_Y Z = \alpha \ell \\ \alpha Q_Z Y + (1-\alpha ) Q_Z Z = (1-\alpha )\ell \end{array}\right. \end{aligned}$$
(6)

Let \(r=\frac{1-\alpha }{\alpha }\). Eliminating \(\ell \) from the two previous equations, r is the positive root of the polynomial equation

$$\begin{aligned} r^2 Q_Y Z + r(Q_Y Y -Q_Z Z) -Q_Z Y=0 \end{aligned}$$

and \(\ell =Q_Y Y + r Q_Y Z\). One obtains the expression of the eigenvalue

$$\begin{aligned} \ell = \frac{ Q_Y Y + Q_Z Z + \sqrt{ (Q_Y Y - Q_Z Z)^2 + 4 (Q_Y Z)(Q_Z Y)}}{2} \end{aligned}$$

Finally, from the expression of Q, one gets

$$\begin{aligned} \begin{array}{l} q_{11}=Q_Y Y = {{\mathcal {R}}}_1 \left( \frac{N_{1r}}{N_{1p}}+ \frac{N_{11}}{N_{1p}}\frac{\gamma +\mu _1}{\gamma +\lambda _1+\mu _1}+ \frac{N_{21}}{N_{1p}}\frac{\gamma +\lambda _2}{\gamma +\lambda _2+\mu _2}\right) \\ q_{22}=Q_Z Z = {{\mathcal {R}}}_2 \left( \frac{N_{2r}}{N_{2p}}+ \frac{N_{22}}{N_{2p}}\frac{\gamma +\mu _2}{\gamma +\lambda _2+\mu _2}+ \frac{N_{12}}{N_{2p}}\frac{\gamma +\lambda _1}{\gamma +\lambda _1+\mu _1}\right) \\ q_{21}=Q_Z Y = {{\mathcal {R}}}_1 \left( \frac{N_{11}}{N_{1p}}\frac{\lambda _1}{\gamma +\lambda _1+\mu _1}+ \frac{N_{21}}{N_{1p}}\frac{\mu _2}{\gamma +\lambda _2+\mu _2}\right) \\ q_{12}=Q_Y Z = {{\mathcal {R}}}_2 \left( \frac{N_{12}}{N_{2p}}\frac{\mu _1}{\gamma +\lambda _1+\mu _1}+ \frac{N_{22}}{N_{2p}}\frac{\lambda _2}{\gamma +\lambda _2+\mu _2}\right) \\ \end{array} \end{aligned}$$

and thus \(\ell ={{\mathcal {R}}}_{1,2}\), which is exactly \(\rho (M)\).

i. When \({{\mathcal {R}}}_{1,2}>1\), the matrix A has at least one eigenvalue with positive real part and the matrix J as well. The equilibrium \(X^\star \) is thus unstable on \({{\mathcal {D}}}\).

ii. When \({{\mathcal {R}}}_{1,2}<1\), the matrix A is Hurwitz, but \(X^\star \) is not a hyperbolic equilibrium. However, one can write the dynamics of the vector I as a non-autonomous system

$$\begin{aligned} \dot{I} = g(t,I):=\left( \begin{array}{c} \beta _1 S_{1r}(t)\frac{I_{1r}+I_{11}+I_{21}}{N_{1p}}-\gamma I_{1r}\\ \beta _1 S_{11}(t)\frac{I_{1r}+I_{11}+I_{21}}{N_{1p}}-(\gamma +\lambda _1)I_{11}+\mu _1I_{12}\\ \beta _2 S_{12}(t)\frac{I_{2r}+I_{22}+I_{12}}{N_{2p}}+\lambda _1I_{11}-(\gamma +\mu _1)I_{12}\\ \beta _2 S_{2r}(t)\frac{I_{2r}+I_{22}+I_{12}}{N_{2p}}-\gamma I_{2r}\\ \beta _2 S_{22}(t)\frac{I_{2r}+I_{22}+I_{12}}{N_{2p}}-(\gamma +\lambda _2)I_{22}+\mu _2I_{21}\\ \beta _1 S_{21}(t)\frac{I_{1r}+I_{11}+I_{21}}{N_{1p}}+\lambda _2I_{22}-(\gamma +\mu _2)I_{21} \end{array}\right) \end{aligned}$$

Note that this dynamics is cooperative and as for any \(t\ge 0\) one has \(S_{ij}(t)\le N_{ij}\) for \(ij \in \{1r,11,12,2r,22,21\}\), one gets

$$\begin{aligned} g(t,I) \le {\bar{g}}(I):=AI, \; I \ge 0 \end{aligned}$$

Therefore, any solution \(I(\cdot )\) of \(\dot{I}=g(t,I)\) with \(I(0)=I_0\ge 0\) verifies \(0\le I(t) \le {\bar{I}}(t)\) for any \(t\ge 0\), where \({\bar{I}}(\cdot )\) is solution of the linear dynamics \(\dot{{\bar{I}}}={\bar{g}}({\bar{I}})\) with \({\bar{I}}(0)=I_0\). As A is Hurwitz, we conclude that \(X^\star \) is exponentially stable with respect to I, which proves point ii.

iii. For the particular case \({{\mathcal {R}}}_1={{\mathcal {R}}}_2:={{\mathcal {R}}}\), the transpose of the matrix M reads

$$\begin{aligned} M^\top ={{\mathcal {R}}} \left[ \begin{array}{cccccc}
\frac{N_{1r}}{N_{1p}}&{} \frac{N_{11}}{N_{1p}}&{} 0&{} 0&{} 0&{} \frac{N_{21}}{N_{1p}}\\
\frac{N_{1r}(\gamma +\mu _1)}{N_{1p}(\gamma +\lambda _1+\mu _1)}&{} \frac{N_{11}(\gamma +\mu _1)}{N_{1p}(\gamma +\lambda _1+\mu _1)}&{} \frac{N_{12}\lambda _1}{N_{2p}(\gamma +\lambda _1+\mu _1)}&{} \frac{N_{2r}\lambda _1}{N_{2p}(\gamma +\lambda _1+\mu _1)}&{} \frac{N_{22}\lambda _1}{N_{2p}(\gamma +\lambda _1+\mu _1)}&{} \frac{N_{21}(\gamma +\mu _1)}{N_{1p}(\gamma +\lambda _1+\mu _1)}\\
\frac{N_{1r}\mu _1}{N_{1p}(\gamma +\lambda _1+\mu _1)}&{} \frac{N_{11}\mu _1}{N_{1p}(\gamma +\lambda _1+\mu _1)}&{} \frac{N_{12}(\gamma +\lambda _1)}{N_{2p}(\gamma +\lambda _1+\mu _1)}&{} \frac{N_{2r}(\gamma +\lambda _1)}{N_{2p}(\gamma +\lambda _1+\mu _1)}&{} \frac{N_{22}(\gamma +\lambda _1)}{N_{2p}(\gamma +\lambda _1+\mu _1)}&{} \frac{N_{21}\mu _1}{N_{1p}(\gamma +\lambda _1+\mu _1)}\\
0&{} 0&{} \frac{N_{12}}{N_{2p}}&{} \frac{N_{2r}}{N_{2p}}&{} \frac{N_{22}}{N_{2p}}&{} 0\\
\frac{N_{1r}\lambda _2}{N_{1p}(\gamma +\lambda _2+\mu _2)}&{} \frac{N_{11}\lambda _2}{N_{1p}(\gamma +\lambda _2+\mu _2)}&{} \frac{N_{12}(\gamma +\mu _2)}{N_{2p}(\gamma +\lambda _2+\mu _2)}&{} \frac{N_{2r}(\gamma +\mu _2)}{N_{2p}(\gamma +\lambda _2+\mu _2)}&{} \frac{N_{22}(\gamma +\mu _2)}{N_{2p}(\gamma +\lambda _2+\mu _2)}&{} \frac{N_{21}\lambda _2}{N_{1p}(\gamma +\lambda _2+\mu _2)}\\
\frac{N_{1r}(\gamma +\lambda _2)}{N_{1p}(\gamma +\lambda _2+\mu _2)}&{} \frac{N_{11}(\gamma +\lambda _2)}{N_{1p}(\gamma +\lambda _2+\mu _2)}&{} \frac{N_{12}\mu _2}{N_{2p}(\gamma +\lambda _2+\mu _2)}&{} \frac{N_{2r}\mu _2}{N_{2p}(\gamma +\lambda _2+\mu _2)}&{} \frac{N_{22}\mu _2}{N_{2p}(\gamma +\lambda _2+\mu _2)}&{} \frac{N_{21}(\gamma +\lambda _2)}{N_{1p}(\gamma +\lambda _2+\mu _2)}
\end{array} \right] \end{aligned}$$

One can check that one has \(M^\top U={{\mathcal {R}}}U\) where \(U=(1,1,1,1,1,1)^\top \). As U is a positive vector, we deduce from the Perron–Frobenius theorem that one has \(\rho (M)=\rho (M^\top )={{\mathcal {R}}}\), which completes the proof. \(\square \)
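As a numerical sanity check of Proposition 1, the following sketch evaluates the closed form (2)-(3) and compares it with the spectral radius of \(M=FV^{-1}\) estimated by power iteration. It is an illustration only: the function names, the group ordering (1r, 11, 12, 2r, 22, 21) encoded in the list `patch`, and the parameter values used in the usage note are assumptions made here, and \(V^{-1}\) is written from the closed-form inverse of its \(2\times 2\) blocks.

```python
from math import sqrt


def balanced_sizes(n_r, n_c, lam, mu):
    """Group sizes in the order (1r, 11, 12, 2r, 22, 21), using equation (1)."""
    (l1, l2), (m1, m2) = lam, mu
    n11, n12 = n_c[0] * m1 / (l1 + m1), n_c[0] * l1 / (l1 + m1)
    n22, n21 = n_c[1] * m2 / (l2 + m2), n_c[1] * l2 / (l2 + m2)
    return [n_r[0], n11, n12, n_r[1], n22, n21]


def threshold_formula(R, gamma, lam, mu, sizes):
    """Closed-form R_{1,2} from (2)-(3); R = (R_1, R_2)."""
    (l1, l2), (m1, m2) = lam, mu
    n1r, n11, n12, n2r, n22, n21 = sizes
    n1p, n2p = n1r + n11 + n21, n2r + n22 + n12
    d1, d2 = gamma + l1 + m1, gamma + l2 + m2
    q11 = R[0] * (n1r + n11 * (gamma + m1) / d1 + n21 * (gamma + l2) / d2) / n1p
    q22 = R[1] * (n2r + n22 * (gamma + m2) / d2 + n12 * (gamma + l1) / d1) / n2p
    q21 = R[0] * (n11 * l1 / d1 + n21 * m2 / d2) / n1p
    q12 = R[1] * (n12 * m1 / d1 + n22 * l2 / d2) / n2p
    return (q11 + q22 + sqrt((q22 - q11) ** 2 + 4 * q12 * q21)) / 2


def threshold_power_iteration(R, gamma, lam, mu, sizes, iterations=500):
    """Spectral radius of M = F V^{-1}, estimated by power iteration."""
    (l1, l2), (m1, m2) = lam, mu
    patch = [0, 0, 1, 1, 1, 0]  # patch in which each group is located
    present = [sum(s for s, p in zip(sizes, patch) if p == k) for k in (0, 1)]
    # F[g][h] = beta_p * N_g / N_pp when groups g and h are both located in patch p.
    F = [[gamma * R[patch[g]] * sizes[g] / present[patch[g]]
          if patch[g] == patch[h] else 0.0
          for h in range(6)] for g in range(6)]
    d1, d2 = gamma * (gamma + l1 + m1), gamma * (gamma + l2 + m2)
    Vinv = [[0.0] * 6 for _ in range(6)]
    Vinv[0][0] = Vinv[3][3] = 1 / gamma
    Vinv[1][1], Vinv[1][2] = (gamma + m1) / d1, m1 / d1
    Vinv[2][1], Vinv[2][2] = l1 / d1, (gamma + l1) / d1
    Vinv[4][4], Vinv[4][5] = (gamma + m2) / d2, m2 / d2
    Vinv[5][4], Vinv[5][5] = l2 / d2, (gamma + l2) / d2
    M = [[sum(F[g][k] * Vinv[k][h] for k in range(6)) for h in range(6)]
         for g in range(6)]
    x, rho = [1.0] * 6, 0.0
    for _ in range(iterations):
        y = [sum(M[g][h] * x[h] for h in range(6)) for g in range(6)]
        rho = sum(y) / sum(x)  # eigenvalue estimate for the current iterate
        x = [v / sum(y) for v in y]  # renormalize so that the entries sum to one
    return rho
```

For instance, with `sizes = balanced_sizes((600.0, 500.0), (400.0, 500.0), (1.0, 2.0), (3.0, 1.0))`, the two functions can be called with `R=(0.8, 1.6)`, `gamma=0.2`, `lam=(1.0, 2.0)`, `mu=(3.0, 1.0)` and should agree up to rounding; taking `R=(1.3, 1.3)` illustrates point 3 of the proposition.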

Remark 1

More generally, the next-generation matrix \(M=FV^{-1}\) can be shown to have rank equal to the number n of patches, and its Perron vector can be expressed as a convex combination of a family of orthogonal vectors in the image of M. This implies that the positive eigenvalue of M (i.e., the reproduction number) is also the positive eigenvalue of the n-dimensional positive matrix given by the decomposition of the image of these vectors by the matrix M.

Alternatively, one may consider the epidemic spread in a virgin population as a Markov process, to determine the expected numbers of secondary cases in each patch, and obtain this \(n\times n\) matrix, as described in Diekmann et al. (2013). This method consists in a first-step analysis, determining the mean residence times of an infected individual of each group in each of the patches. Then, for a given patch, the expected numbers of newly infected individuals present in each patch are given by the products of the mean residence times and the transmission rate, averaged over the constant distribution given in (1).

This explains why the formula (2) takes the expression of a root of the characteristic polynomial of a 2 by 2 matrix.
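As a short worked check of this statement in the two-patch case (using only (2), (3) and (6)), system (6) says that \((\alpha ,1-\alpha )^\top \) is an eigenvector, for the eigenvalue \(\ell \), of the \(2\times 2\) matrix with entries \(q_{ij}\). Its characteristic equation and largest root are

$$\begin{aligned} \det \left[ \begin{array}{cc} q_{11}-\ell &{} q_{12}\\ q_{21} &{} q_{22}-\ell \end{array}\right]&= \ell ^2-(q_{11}+q_{22})\ell +q_{11}q_{22}-q_{12}q_{21}=0,\\ \ell&=\frac{q_{11}+q_{22}+\sqrt{(q_{11}+q_{22})^2-4(q_{11}q_{22}-q_{12}q_{21})}}{2} =\frac{q_{11}+q_{22}+\sqrt{(q_{22}-q_{11})^2+4q_{12}q_{21}}}{2}, \end{aligned}$$

which is exactly expression (2).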

Remark 2

The explicit expression (2) of the epidemic threshold given in Proposition 1 is also relevant in the absence of permanently resident populations, a case for which it has not yet been provided explicitly in the literature (to our knowledge).

Corollary 2

One has

$$\begin{aligned} \min \left( {{\mathcal {R}}}_1, {{\mathcal {R}}}_2\right) \le {{\mathcal {R}}}_{1,2} \le \max \left( {{\mathcal {R}}}_1, {{\mathcal {R}}}_2\right) . \end{aligned}$$

Proof

Denote by \(M({{\mathcal {R}}}_1,{{\mathcal {R}}}_2)\) the matrix \(FV^{-1}\) for the parameters \({{\mathcal {R}}}_1\), \({{\mathcal {R}}}_2\), and let \(\mathcal{R}_-:=\min \left( {{\mathcal {R}}}_1,{{\mathcal {R}}}_2\right) \), \(\mathcal{R}_+:=\max \left( {{\mathcal {R}}}_1,{{\mathcal {R}}}_2\right) \). From the expression of the nonnegative matrices M, one gets

$$\begin{aligned} M({{\mathcal {R}}}_-,{{\mathcal {R}}}_-) \le M({{\mathcal {R}}}_1,{{\mathcal {R}}}_2) \le M(\mathcal{R}_+,{{\mathcal {R}}}_+) \end{aligned}$$

which implies [see, for instance, Berman and Plemmons (1994)] the inequalities

$$\begin{aligned} \rho (M({{\mathcal {R}}}_-,{{\mathcal {R}}}_-)) \le \rho (M({{\mathcal {R}}}_1,{{\mathcal {R}}}_2)) \le \rho (M({{\mathcal {R}}}_+,{{\mathcal {R}}}_+)) \end{aligned}$$

and thus

$$\begin{aligned} {{\mathcal {R}}}_- \le {{\mathcal {R}}}_{1,2} \le {{\mathcal {R}}}_+ . \end{aligned}$$

\(\square \)

Alternatively, the number \({{\mathcal {R}}}_{1,2}\) can be determined as follows.

Corollary 3

Assume \({{\mathcal {R}}}_2>{{\mathcal {R}}}_1\). Then, one has

$$\begin{aligned} {{\mathcal {R}}}_{1,2}=\alpha {{\mathcal {R}}}_1 + (1-\alpha )\mathcal{R}_2 \end{aligned}$$
(7)

where \(\alpha \in [0,1)\) is the smallest root of the polynomial

$$\begin{aligned} P(\alpha )=\alpha ^2({{\mathcal {R}}}_2-{{\mathcal {R}}}_1)-\alpha ({{\mathcal {R}}}_2-\mathcal{R}_1+q_{12}+q_{21})+q_{12} \end{aligned}$$

Proof

One can check, from expressions (3), that one has \(q_{11}+q_{21}={{\mathcal {R}}}_1\) and \(q_{22}+q_{12}={{\mathcal {R}}}_2\). Then, from (6), one gets

$$\begin{aligned} {{\mathcal {R}}}_{1,2}=\ell =\alpha {{\mathcal {R}}}_1 + (1-\alpha ){{\mathcal {R}}}_2 \end{aligned}$$
(8)

where \(\alpha \) is a root of the polynomial P obtained from (6) by eliminating \(\ell \), that is

$$\begin{aligned} P(\alpha )=\alpha ^2({{\mathcal {R}}}_2-{{\mathcal {R}}}_1)-\alpha ({{\mathcal {R}}}_2-\mathcal{R}_1+q_{12}+q_{21})+q_{12} \end{aligned}$$

From Corollary 2, we know that \(\alpha \) belongs to [0, 1].

Note that one has \(P(0)=q_{12}\ge 0\) and \(P(1)=-q_{21}\le 0\). Therefore, when \({{\mathcal {R}}}_2-{{\mathcal {R}}}_1>0\), P admits exactly one root in [0, 1) and another one in \([1,+\infty )\). However, if \(\alpha =1\), one should have \(q_{21}=0\) and thus \(\lambda _1=0\), \(\mu _2=0\), which implies \(N_{11}=N_{1c}\), \(N_{12}=0\), \(N_{22}=0\), \(N_{21}=N_{2c}\). Then, one obtains \(q_{11}={{\mathcal {R}}}_1\), \(q_{22}={{\mathcal {R}}}_2\), and from expression (2) one gets \({{\mathcal {R}}}_{1,2}=\max ({{\mathcal {R}}}_1,{{\mathcal {R}}}_2)={{\mathcal {R}}}_2\), which contradicts \(\alpha =1\). We conclude that \(\alpha \) belongs to [0, 1) and is thus the smallest root of P. \(\square \)
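The alternative computation of Corollary 3 can be sketched as follows. The function name `alpha_and_threshold` is hypothetical, and the smallest root of P is evaluated through the equivalent form \(2q_{12}/(b+\sqrt{b^2-4aq_{12}})\), with \(a={{\mathcal {R}}}_2-{{\mathcal {R}}}_1\) and \(b=a+q_{12}+q_{21}\), which is a choice made here to avoid cancellation when \(q_{12}\) is small.

```python
from math import sqrt


def alpha_and_threshold(R1, R2, q12, q21):
    """Smallest root alpha of P and R_{1,2} from (7); assumes R2 > R1."""
    a = R2 - R1                # leading coefficient of P
    b = R2 - R1 + q12 + q21    # P(alpha) = a*alpha^2 - b*alpha + q12
    disc = b * b - 4 * a * q12  # equals (a + q21 - q12)^2 + 4*q12*q21 >= 0
    # Smallest root (b - sqrt(disc)) / (2a), rewritten to avoid cancellation.
    alpha = 2 * q12 / (b + sqrt(disc))
    return alpha, alpha * R1 + (1 - alpha) * R2
```

With \(q_{11}={{\mathcal {R}}}_1-q_{21}\) and \(q_{22}={{\mathcal {R}}}_2-q_{12}\), the value returned should coincide with expression (2); with \(q_{12}=q_{21}=0\) it returns \(\alpha =0\) and \({{\mathcal {R}}}_{1,2}={{\mathcal {R}}}_2\).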

Remark 3

When there is no communication between patches (that is \(N_{1r}=N_{1p}=N_1\), \(N_{2r}=N_{2p}=N_2\)), one has \(q_{21}=0\) and \(q_{12}=0\). If \({{\mathcal {R}}}_2>{{\mathcal {R}}}_1\), resp. \({{\mathcal {R}}}_1>\mathcal{R}_2\), one has \(\alpha =0\), resp. \(\alpha =1\), which gives

$$\begin{aligned} {{\mathcal {R}}}_{1,2} =\max ({{\mathcal {R}}}_1,{{\mathcal {R}}}_2) . \end{aligned}$$

We now look for a characterization of the minimum value of the threshold \({{\mathcal {R}}}_{1,2}\).

4 Minimization of the Epidemic Threshold

In this section, we assume that the mixing is fast compared to the recovery rate (as is often considered in the literature), which amounts to having numbers \(\lambda _i\), \(\mu _i\) large compared to \(\gamma \). Our objective is to study how the proportions of commuters in the populations impact the value of \({{\mathcal {R}}}_{1,2}\).

Given \({{\mathcal {R}}}_1\), \({{\mathcal {R}}}_2\), we consider the approximation \(\tilde{{\mathcal {R}}}_{1,2}\) of the threshold \({{\mathcal {R}}}_{1,2}\) which consists in setting \(\gamma =0\) in the expressions (3). For convenience, we introduce the numbers

$$\begin{aligned} \eta _i:= \frac{\lambda _i}{\lambda _i+\mu _i} \in (0,1) \qquad (i=1,2) \end{aligned}$$

One has a first result about the variations of \(\tilde{\mathcal{R}}_{1,2}\) with respect to \(N_{1c}\), \(N_{2c}\).

Proposition 4

Fix parameters \(N_i\), \(\beta _i\), \(\gamma \), \(\lambda _i\), \(\mu _i\) (\(i=1,2\)) such that \({{\mathcal {R}}}_2>{{\mathcal {R}}}_1\).

  1. For any \(N_{1c} \in (0,N_1)\), the map \(N_{2c} \mapsto \tilde{{\mathcal {R}}}_{1,2}(N_{1c},N_{2c})\) is decreasing.

  2. The map \(N_{1c} \mapsto \tilde{{\mathcal {R}}}_{1,2}(N_{1c},N_{2c})\) is increasing at \((N_{1c},N_{2c})\) when

    $$\begin{aligned} \eta _2(1-\eta _2)N_{2c} > (1-\eta _1)(N_2-\eta _2N_{2c}) \end{aligned}$$
    (9)

  3. The map \(N_{1c} \mapsto \tilde{{\mathcal {R}}}_{1,2}(N_{1c},N_{2c})\) is increasing, resp. decreasing, at \((N_{1c},N_{2c})\) if the numbers A and B are negative, resp. positive, where

    $$\begin{aligned}&A:={{\mathcal {R}}}_2 \frac{\frac{N_2}{2}-\eta _1(\frac{1}{2}-\eta _1)N_{1c}-(\frac{3}{2}-\eta _2)\eta _2N_{2c}}{N_2-\eta _2N_{2c}+\eta _1N_{1c}}\\&\qquad -{{\mathcal {R}}}_1 \frac{\frac{N_1}{2}-(\frac{3}{2}-\eta _1)\eta _1N_{1c}-\eta _2(\frac{1}{2}-\eta _2)N_{2c}}{N_1-\eta _1N_{1c}+\eta _2N_{2c}} , \\&B := {{\mathcal {R}}}_2 \frac{(1-\eta _1)(N_2-\eta _2N_{2c})-\eta _2(1-\eta _2)N_{2c}}{(N_2-\eta _2N_{2c}+\eta _1N_{1c})^2}\\&\qquad -\mathcal{R}_1\frac{(1-\eta _1)(N_1+\eta _2N_{2c})+\eta _2(1-\eta _2)N_{2c}}{(N_1-\eta _1N_{1c}+\eta _2N_{2c})^2} \end{aligned}$$

Proof

Following Corollary 3, one has

$$\begin{aligned} \tilde{{\mathcal {R}}}_{1,2}={\tilde{\alpha }} {{\mathcal {R}}}_1 + (1-{\tilde{\alpha }}){{\mathcal {R}}}_2 \end{aligned}$$
(10)

where \({\tilde{\alpha }}\) is the smallest root of the polynomial

$$\begin{aligned} {\tilde{P}}(\alpha )=\alpha ^2({{\mathcal {R}}}_2-{{\mathcal {R}}}_1)-\alpha (\mathcal{R}_2-{{\mathcal {R}}}_1+{\tilde{q}}_{12}+{\tilde{q}}_{21})+{\tilde{q}}_{12} \end{aligned}$$

where \({\tilde{q}}_{12}\), \({\tilde{q}}_{21}\) are the approximations of \(q_{12}\), \(q_{21}\) defined in (3). Let us note that one can write \(N_{ii}=(1-\eta _i)N_{ic}\), \(N_{ij}=\eta _i N_{ic}\) (for \(j \ne i\)) and also \(N_{ip}=N_i-\eta _i N_{ic}+\eta _j N_{jc}\), which leads to the following expressions of \({\tilde{q}}_{12}\), \({\tilde{q}}_{21}\)

$$\begin{aligned}{} & {} {\tilde{q}}_{21}={{\mathcal {R}}}_1 \frac{(1-\eta _1)\eta _1N_{1c}+\eta _2(1-\eta _2)N_{2c}}{N_1-\eta _1N_{1c}+\eta _2N_{2c}}, \nonumber \\{} & {} {\tilde{q}}_{12}={{\mathcal {R}}}_2 \frac{\eta _1(1-\eta _1)N_{1c}+(1-\eta _2)\eta _2N_{2c}}{N_2-\eta _2N_{2c}+\eta _1N_{1c}} \end{aligned}$$
(11)

For simplicity, we shall drop the notation \(\tilde{ }\;\) in the rest of the proof. Note that \(\alpha \), being the smallest root of P, satisfies

$$\begin{aligned} \alpha < \frac{{{\mathcal {R}}}_2-\mathcal{R}_1+q_{12}+q_{21}}{2({{\mathcal {R}}}_2-{{\mathcal {R}}}_1)} \end{aligned}$$
(12)

Let us differentiate the equality \(P(\alpha )=0\) with respect to \(q_{12}\) and \(q_{21}\):

$$\begin{aligned}&2\alpha \frac{\partial \alpha }{\partial q_{12}}({{\mathcal {R}}}_2-{{\mathcal {R}}}_1) -\frac{\partial \alpha }{\partial q_{12}}({{\mathcal {R}}}_2-{{\mathcal {R}}}_1+q_{12}+q_{21}) -\alpha +1=0\\&2\alpha \frac{\partial \alpha }{\partial q_{21}}({{\mathcal {R}}}_2-\mathcal{R}_1) -\frac{\partial \alpha }{\partial q_{21}}({{\mathcal {R}}}_2-\mathcal{R}_1+q_{12}+q_{21}) -\alpha =0 \end{aligned}$$

which gives

$$\begin{aligned}&\frac{\partial \alpha }{\partial q_{12}} =\frac{1-\alpha }{{{\mathcal {R}}}_2-{{\mathcal {R}}}_1+q_{12}+q_{21}-2\alpha ({{\mathcal {R}}}_2-{{\mathcal {R}}}_1)}\\&\frac{\partial \alpha }{\partial q_{21}} =\frac{-\alpha }{\mathcal{R}_2-{{\mathcal {R}}}_1+q_{12}+q_{21}-2\alpha ({{\mathcal {R}}}_2-{{\mathcal {R}}}_1)} \end{aligned}$$

Then, one can write

$$\begin{aligned} \frac{\partial \alpha }{\partial N_{ic}}= & {} \frac{\partial \alpha }{\partial q_{12}}\frac{\partial q_{12}}{\partial N_{ic}}+ \frac{\partial \alpha }{\partial q_{21}}\frac{\partial q_{21}}{\partial N_{ic}}\\= & {} \frac{(1-\alpha )\frac{\partial q_{12}}{\partial N_{ic}}-\alpha \frac{\partial q_{21}}{\partial N_{ic}}}{\mathcal{R}_2-{{\mathcal {R}}}_1+q_{12}+q_{21}-2\alpha ({{\mathcal {R}}}_2-{{\mathcal {R}}}_1)} \qquad (i=1, 2) \end{aligned}$$

and from inequality (12), we obtain that the signs of the derivatives \(\frac{\partial \alpha }{\partial N_{ic}}\) are given by the sign of the numbers

$$\begin{aligned} \sigma _i := (1-\alpha )\frac{\partial q_{12}}{\partial N_{ic}}-\alpha \frac{\partial q_{21}}{\partial N_{ic}} \qquad (i=1,2) \end{aligned}$$
(13)

We begin with the dependence on \(N_{2c}\). One has first

$$\begin{aligned} \frac{\partial q_{12}}{\partial N_{2c}}={{\mathcal {R}}}_2\eta _2 \frac{(1-\eta _2)( N_2+\eta _1N_{1c})+\eta _1(1-\eta _1)N_{1c}}{(N_2+\eta _1N_{1c}-\eta _2N_{2c})^2} >0 \end{aligned}$$

Note that one has

$$\begin{aligned} q_{21}=\frac{\mathcal{R}_1(N_2+\eta _1N_{1c}-\eta _2N_{2c})}{\mathcal{R}_2(N_1-\eta _1N_{1c}+\eta _2N_{2c})}q_{12} \end{aligned}$$
(14)

and thus

$$\begin{aligned} \frac{\partial q_{21}}{\partial N_{2c}}=\frac{\mathcal{R}_1(N_2+\eta _1N_{1c}-\eta _2N_{2c})}{\mathcal{R}_2(N_1-\eta _1N_{1c}+\eta _2N_{2c})}\frac{\partial q_{12}}{\partial N_{2c}} - \frac{{{\mathcal {R}}}_1 \eta _2 (N_1+N_2)}{\mathcal{R}_2(N_1-\eta _1N_{1c}+\eta _2N_{2c})^2}q_{12} \end{aligned}$$

Then, one gets the inequality

$$\begin{aligned} \sigma _2 > \left( 1 - \alpha - \alpha \frac{\mathcal{R}_1(N_2+\eta _1N_{1c}-\eta _2N_{2c})}{\mathcal{R}_2(N_1-\eta _1N_{1c}+\eta _2N_{2c})} \right) \frac{\partial q_{12}}{\partial N_{2c}} \end{aligned}$$

On the other hand, one gets from \(P(\alpha )=0\) the relation

$$\begin{aligned} (1-\alpha )q_{12}-\alpha q_{21}= \alpha (1-\alpha )({{\mathcal {R}}}_2-\mathcal{R}_1)> 0 \end{aligned}$$

and with (14)

$$\begin{aligned} (1-\alpha )q_{12}-\alpha q_{21}= \left( 1 - \alpha - \alpha \frac{{{\mathcal {R}}}_1(N_2+\eta _1N_{1c}-\eta _2N_{2c})}{\mathcal{R}_2(N_1-\eta _1N_{1c}+\eta _2N_{2c})} \right) q_{12} >0 \end{aligned}$$

We then conclude that \(\sigma _2\) is positive, and from (10) we deduce that the map \(N_{2c} \mapsto \mathcal{R}_{1,2}\) is decreasing. This proves point 1.

We now study the dependence on \(N_{1c}\). A calculation of the partial derivative gives

$$\begin{aligned} \frac{\partial q_{12}}{\partial N_{1c}}=\mathcal{R}_2\eta _1 \frac{(1-\eta _1)(N_2-\eta _2N_{2c})-\eta _2(1-\eta _2)N_{2c}}{(N_2-\eta _2N_{2c}+\eta _1N_{1c})^2} \end{aligned}$$
(15)

and

$$\begin{aligned} \frac{\partial q_{21}}{\partial N_{1c}}=\mathcal{R}_1\eta _1 \frac{(1-\eta _1)(N_1+\eta _2N_{2c})+\eta _2(1-\eta _2)N_{2c}}{(N_1-\eta _1N_{1c}+\eta _2N_{2c})^2} >0 \end{aligned}$$
(16)

When \(\frac{\partial q_{12}}{\partial N_{1c}}<0\), we can conclude that \(\sigma _1\) is negative and \( {{\mathcal {R}}}_{1,2}\) is thus increasing with respect to \(N_{1c}\). This condition is equivalent to (9), which proves point 2. When this last condition is not satisfied, having \(\frac{\partial q_{12}}{\partial N_{1c}}<\frac{\partial q_{21}}{\partial N_{1c}}\) with \(\alpha >\frac{1}{2}\) is another sufficient condition to obtain \(\sigma _1<0\) from expression (13). Moreover, having \(\alpha >\frac{1}{2}\) amounts to having \(P(\frac{1}{2})>0\), that is

$$\begin{aligned} \frac{{{\mathcal {R}}}_2-{{\mathcal {R}}}_1}{4}-\frac{{{\mathcal {R}}}_2-\mathcal{R}_1+q_{12}+q_{21}}{2}+q_{12}>0 \end{aligned}$$

or equivalently

$$\begin{aligned} \frac{{{\mathcal {R}}}_2}{2}-q_{12} < \frac{{{\mathcal {R}}}_1}{2} - q_{21} \end{aligned}$$

One can check that this last condition is equivalent to \(A<0\) and that the condition \(\frac{\partial q_{12}}{\partial N_{1c}}<\frac{\partial q_{21}}{\partial N_{1c}}\) is equivalent to \(B<0\). In the same manner, having \(A>0\) and \(B>0\) implies \(\alpha <\frac{1}{2}\) and \(\frac{\partial q_{12}}{\partial N_{1c}}>\frac{\partial q_{21}}{\partial N_{1c}}\), which is a sufficient condition to have \(\sigma _1>0\). Since \({{\mathcal {R}}}_{1,2}=\alpha {{\mathcal {R}}}_1+(1-\alpha ){{\mathcal {R}}}_2\) with \({{\mathcal {R}}}_2>{{\mathcal {R}}}_1\), an increasing \(\alpha \) makes \( \mathcal{R}_{1,2}\) decreasing with respect to \(N_{1c}\). This proves point 3. \(\square \)

This result suggests that the map \(N_{1c} \mapsto \tilde{\mathcal{R}}_{1,2}(N_{1c},N_{2c})\) is not necessarily monotonic, unlike the map \(N_{2c} \mapsto \tilde{{\mathcal {R}}}_{1,2}(N_{1c},N_{2c})\). We now show that the possibilities of its variations are limited.
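The statements of Proposition 4 can be checked numerically. The sketch below evaluates \(\tilde{{\mathcal {R}}}_{1,2}\) from (10) and (11), then tests the monotonicity in \(N_{2c}\) and the sufficient condition (9). The function names and all parameter values are hypothetical choices for this illustration, not values from the paper.

```python
import math

def approx_threshold(n1c, n2c, eta1, eta2, r1, r2, n1, n2):
    """Approximate threshold (10), with q12 and q21 taken from (11); assumes r2 > r1."""
    num = eta1 * (1 - eta1) * n1c + eta2 * (1 - eta2) * n2c
    q21 = r1 * num / (n1 - eta1 * n1c + eta2 * n2c)
    q12 = r2 * num / (n2 - eta2 * n2c + eta1 * n1c)
    a = r2 - r1
    b = a + q12 + q21
    disc = max(b * b - 4 * a * q12, 0.0)
    alpha = 2 * q12 / (b + math.sqrt(disc))   # smallest root of P
    return alpha * r1 + (1 - alpha) * r2

def condition_9(n2c, eta1, eta2, n2):
    """Sufficient condition (9) for the threshold to increase with N1c."""
    return eta2 * (1 - eta2) * n2c > (1 - eta1) * (n2 - eta2 * n2c)

# Hypothetical parameters (not the values of Tables 1 and 2).
R1, R2, N1, N2 = 0.5, 2.0, 1.0, 1.0
ETA1, ETA2 = 0.5, 0.8

# Point 1: the threshold should decrease along N2c for a fixed N1c.
along_n2c = [approx_threshold(0.5, N2 * k / 10, ETA1, ETA2, R1, R2, N1, N2)
             for k in range(11)]
print([round(v, 4) for v in along_n2c])

# Point 2: where (9) holds, increasing N1c should raise the threshold.
print(condition_9(N2, ETA1, ETA2, N2),
      approx_threshold(0.6, N2, ETA1, ETA2, R1, R2, N1, N2)
      - approx_threshold(0.4, N2, ETA1, ETA2, R1, R2, N1, N2))
```

With \(N_{2c}=N_2\), condition (9) does not depend on \(N_{1c}\), which is why a single evaluation of `condition_9` is enough in this example.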

Proposition 5

Under the hypotheses of Proposition 4, for each \(N_{2c} \in (0,N_2)\) the map \(N_{1c} \mapsto \tilde{{\mathcal {R}}}_{1,2}(N_{1c},N_{2c})\) possesses one of the following three properties:

  1. it is decreasing on \((0,N_1)\),

  2. it is increasing on \((0,N_1)\),

  3. there exists \(N_{1c}^\star \in (0,N_1)\) such that it is decreasing on \((0,N_{1c}^\star )\) and increasing on \((N_{1c}^\star ,N_1)\).

Proof

Fix \(N_{2c} \in (0,N_2)\). If the map \(N_{1c} \mapsto \tilde{\mathcal{R}}_{1,2}(N_{1c},N_{2c})\) is not monotonic, there exists \({\hat{N}}_{1c} \in (0,N_1)\) such that \(\frac{\partial \tilde{\mathcal{R}}_{1,2}}{\partial N_{1c}}({\hat{N}}_{1c},N_{2c})=0\). For simplicity, we shall drop the notation \(\tilde{ }\;\) in the rest of the proof. Following the proof of Proposition 4, one has \(\mathcal{R}_{1,2}=\alpha {{\mathcal {R}}}_1 + (1-\alpha ){{\mathcal {R}}}_2\) with

$$\begin{aligned} \frac{\partial \alpha }{\partial N_{1c}}=\frac{(1-\alpha )\frac{\partial q_{12}}{\partial N_{1c}}-\alpha \frac{\partial q_{21}}{\partial N_{1c}}}{\mathcal{R}_2-{{\mathcal {R}}}_1+q_{12}+q_{21}-2\alpha ({{\mathcal {R}}}_2-\mathcal{R}_1)}:=\frac{\sigma _1}{\nu } \end{aligned}$$

where \(\nu >0\). Therefore, one has \(\frac{\partial \alpha }{\partial N_{1c}}=0\) and \(\sigma _1=0\) at \(N_{1c}={\hat{N}}_{1c}\), and thus

$$\begin{aligned} \left. \frac{\partial ^2 \alpha }{\partial N_{1c}^2}\right| _{N_{1c}={\hat{N}}_{1c}} = \left. \frac{\frac{\partial \sigma _1}{\partial N_{1c}}}{\nu }\right| _{N_{1c}={\hat{N}}_{1c}} = \left. \frac{(1-\alpha )\frac{\partial ^2 q_{12}}{\partial N_{1c}^2}-\alpha \frac{\partial ^2 q_{21}}{\partial N_{1c}^2}}{\nu }\right| _{N_{1c}={\hat{N}}_{1c}} \end{aligned}$$

From expressions (15) and (16), a calculation of the partial derivatives gives

$$\begin{aligned} \frac{\partial ^2q_{12}}{\partial N_{1c}^2}=\frac{-2\eta _1\frac{\partial q_{12}}{\partial N_{1c}}}{N_2-\eta _2N_{2c}+\eta _1N_{1c}} , \quad \frac{\partial ^2q_{21}}{\partial N_{1c}^2}=\frac{2\eta _1 \frac{\partial q_{21}}{\partial N_{1c}}}{N_1-\eta _1N_{1c}+\eta _2N_{2c}} \end{aligned}$$

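where \(\frac{\partial q_{21}}{\partial N_{1c}}>0\), and from \(\sigma _1=0\) one gets \(\frac{\partial q_{12}}{\partial N_{1c}}>0\) for \(N_{1c}={\hat{N}}_{1c}\). Hence \(\frac{\partial ^2 q_{12}}{\partial N_{1c}^2}<0\) and \(\frac{\partial ^2 q_{21}}{\partial N_{1c}^2}>0\) at this point, so that \(\frac{\partial ^2 \alpha }{\partial N_{1c}^2}({\hat{N}}_{1c},N_{2c})<0\). Finally, one obtains

$$\begin{aligned} \frac{\partial ^2 {{\mathcal {R}}}_{1,2}}{\partial N_{1c}^2}({\hat{N}}_{1c},N_{2c})=-({{\mathcal {R}}}_2-{{\mathcal {R}}}_1)\frac{\partial ^2 \alpha }{\partial N_{1c}^2}({\hat{N}}_{1c},N_{2c})>0 \end{aligned}$$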

Consequently, any extremum of the map \(N_{1c} \mapsto \mathcal{R}_{1,2}(N_{1c},N_{2c})\) is a local minimizer, which implies that this map has at most one local minimizer. \(\square \)
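The three shapes allowed by Proposition 5 can be observed by sampling the map \(N_{1c} \mapsto \tilde{{\mathcal {R}}}_{1,2}(N_{1c},N_{2c})\) on a grid. The following sketch is only a numerical check under assumed parameters: the helper `shape_in_n1c`, the grid size and the values of \(\eta _1\), \(\eta _2\), \({{\mathcal {R}}}_1\), \({{\mathcal {R}}}_2\) are hypothetical.

```python
import math

def approx_threshold(n1c, n2c, eta1, eta2, r1, r2, n1, n2):
    """Approximate threshold (10), with q12 and q21 taken from (11); assumes r2 > r1."""
    num = eta1 * (1 - eta1) * n1c + eta2 * (1 - eta2) * n2c
    q21 = r1 * num / (n1 - eta1 * n1c + eta2 * n2c)
    q12 = r2 * num / (n2 - eta2 * n2c + eta1 * n1c)
    a = r2 - r1
    b = a + q12 + q21
    disc = max(b * b - 4 * a * q12, 0.0)
    alpha = 2 * q12 / (b + math.sqrt(disc))   # smallest root of P
    return alpha * r1 + (1 - alpha) * r2

def shape_in_n1c(n2c, eta1, eta2, r1, r2, n1, n2, steps=200):
    """Classify the sampled map N1c -> threshold on [0, n1] for a fixed n2c."""
    values = [approx_threshold(n1 * k / steps, n2c, eta1, eta2, r1, r2, n1, n2)
              for k in range(steps + 1)]
    # Sign of each forward difference: +1 if the sampled value goes up, -1 otherwise.
    signs = [1 if y1 > y0 else -1 for y0, y1 in zip(values, values[1:])]
    changes = sum(1 for s0, s1 in zip(signs, signs[1:]) if s0 != s1)
    if changes == 0:
        return "increasing" if signs[0] > 0 else "decreasing"
    if changes == 1 and signs[0] < 0:
        return "decreasing then increasing"
    return "other"

# Hypothetical parameters with N2c = N2 = 1; only (eta1, eta2) varies.
for eta1, eta2 in [(0.5, 0.8), (0.1, 0.1), (0.5, 0.4)]:
    print((eta1, eta2), shape_in_n1c(1.0, eta1, eta2, 0.5, 2.0, 1.0, 1.0))
```

A sampled classification of this kind cannot replace the proof, since a finite grid may miss variations between sample points; it only illustrates the statement.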

Finally, we give conditions for which the minimization of the threshold \({{\mathcal {R}}}_{1,2}\) presents a trichotomy.

Proposition 6

Let parameters \(\beta _i\), \(\gamma \) be such that \({{\mathcal {R}}}_2>{{\mathcal {R}}}_1\) and assume that \(N_1\), \(N_2\) satisfy \(N_1{{\mathcal {R}}}_2>N_2{{\mathcal {R}}}_1\). Then, provided that \(\gamma \) is small enough compared to \(\lambda _i\) and \(\mu _i\), the function \((N_{1c},N_{2c}) \mapsto {{\mathcal {R}}}_{1,2}(N_{1c},N_{2c})\) admits a unique minimum at \((N_{1c}^\star ,N_{2c}^\star )\) with \(N_{2c}^\star =N_2\). Moreover, one has the following properties.

  1. \(N_{1c}^\star =0\) if \(\eta _2 > 1-\eta _1\),

  2. \(N_{1c}^\star =N_1\) if \(\eta _1\) and \(\eta _2\) are sufficiently small,

  3. there exist \(\eta _1\), \(\eta _2\) for which \(N_{1c}^\star \in (0,N_1)\).

Proof

We first show that the announced properties are satisfied for the approximate function \(\tilde{{\mathcal {R}}}_{1,2}\).

From Propositions 4 and 5, we know that \(\tilde{{\mathcal {R}}}_{1,2}\) admits a unique minimum at \(({\hat{N}}_{1c},{\hat{N}}_{2c})\) with \({\hat{N}}_{2c}=N_2\). For \(N_{2c}=N_2\), condition (9) simply reads \(\eta _2 > 1-\eta _1\), which implies from point 2 of Proposition 4 that one has \({\hat{N}}_{1c}=0\) when this condition is fulfilled. This shows that point 1 holds for the function \(\tilde{{\mathcal {R}}}_{1,2}\).

One obtains the limits

$$\begin{aligned} \lim _{\eta _1, \eta _2 \rightarrow 0} A = \frac{{{\mathcal {R}}}_2}{2}- \frac{\mathcal{R}_1}{2}> 0, \quad \lim _{\eta _1, \eta _2 \rightarrow 0} B = \frac{\mathcal{R}_2}{N_2}- \frac{{{\mathcal {R}}}_1}{N_1}> 0 \end{aligned}$$

which show that the numbers A and B are positive when \(\eta _1\), \(\eta _2\) are small, and thus one has \({\hat{N}}_{1c}=N_1\) from point 3 of Proposition 4. This shows that point 2 holds for the function \(\tilde{{\mathcal {R}}}_{1,2}\).

Take now any \(N_{1c} \in (0,N_1)\). When \(\eta _2 > 1-\eta _1\), one has \(\frac{\partial \tilde{{\mathcal {R}}}_{1,2}}{\partial N_{1c}}(N_{1c},N_2)>0\), and for \(\eta _1\), \(\eta _2\) small, one has \(\frac{\partial \tilde{{\mathcal {R}}}_{1,2}}{\partial N_{1c}}(N_{1c},N_2)<0\). Then, by continuity of this derivative with respect to the parameters \(\eta _1\), \(\eta _2\), one deduces the existence of values \(\hat{\eta }_1\), \({\hat{\eta }}_2\) for which \(\frac{\partial \tilde{\mathcal{R}}_{1,2}}{\partial N_{1c}}(N_{1c},N_2)=0\). As the function \(N_{1c} \mapsto \tilde{{\mathcal {R}}}_{1,2}(N_{1c},N_2)\) cannot have more than one local extremum (see Proposition 5), we deduce that \(N_{1c}\) realizes the minimum of the function \(N_{1c} \mapsto \tilde{\mathcal{R}}_{1,2}(N_{1c},N_2)\) when \(\eta _1={\hat{\eta }}_1\) and \(\eta _2=\hat{\eta }_2\). This shows that point 3 holds for the function \(\tilde{{\mathcal {R}}}_{1,2}\).

Finally, note that the exact threshold \({{\mathcal {R}}}_{1,2}\) is obtained by replacing, in the expressions of \({\tilde{q}}_{12}\), \({\tilde{q}}_{21}\), the numbers \(\eta _i\) by \(\frac{\lambda _i+\gamma }{\lambda _i+\mu _i+\gamma }\), which is continuous with respect to \(\gamma \) and equal to \(\eta _i\) for \(\gamma =0\). By continuity of \(\tilde{{\mathcal {R}}}_{1,2}\) with respect to \({\tilde{q}}_{12}\), \({\tilde{q}}_{21}\), we deduce that uniqueness of the minimizer of \({{\mathcal {R}}}_{1,2}\) and properties 1 to 3 are also fulfilled by the function \((N_{1c},N_{2c})\mapsto {{\mathcal {R}}}_{1,2}\), provided that \(\gamma \) is small enough. \(\square \)
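The trichotomy of Proposition 6 can be illustrated on the approximate threshold \(\tilde{{\mathcal {R}}}_{1,2}\) by a brute-force search over a grid of \((N_{1c},N_{2c})\). This is a sketch under assumptions: the function `grid_minimizer`, the grid resolution and the three pairs \((\eta _1,\eta _2)\) are hypothetical, and the search is done on the closed rectangle \([0,N_1]\times [0,N_2]\).

```python
import math

def approx_threshold(n1c, n2c, eta1, eta2, r1, r2, n1, n2):
    """Approximate threshold (10), with q12 and q21 taken from (11); assumes r2 > r1."""
    num = eta1 * (1 - eta1) * n1c + eta2 * (1 - eta2) * n2c
    q21 = r1 * num / (n1 - eta1 * n1c + eta2 * n2c)
    q12 = r2 * num / (n2 - eta2 * n2c + eta1 * n1c)
    a = r2 - r1
    b = a + q12 + q21
    disc = max(b * b - 4 * a * q12, 0.0)
    alpha = 2 * q12 / (b + math.sqrt(disc))   # smallest root of P
    return alpha * r1 + (1 - alpha) * r2

def grid_minimizer(eta1, eta2, r1, r2, n1, n2, steps=20):
    """Return (N1c, N2c, value) minimizing the approximate threshold on a grid."""
    best = None
    for i in range(steps + 1):
        for j in range(steps + 1):
            value = approx_threshold(n1 * i / steps, n2 * j / steps,
                                     eta1, eta2, r1, r2, n1, n2)
            if best is None or value < best[0]:
                best = (value, i, j)
    value, i, j = best
    return n1 * i / steps, n2 * j / steps, value

# Hypothetical data satisfying R2 > R1 and N1*R2 > N2*R1.
R1, R2, N1, N2 = 0.5, 2.0, 1.0, 1.0
for eta1, eta2 in [(0.5, 0.8),    # eta2 > 1 - eta1 (point 1)
                   (0.1, 0.1),    # both small (point 2)
                   (0.5, 0.4)]:   # intermediate (point 3)
    n1c, n2c, value = grid_minimizer(eta1, eta2, R1, R2, N1, N2)
    print((eta1, eta2), "->", (n1c, n2c), round(value, 4))
```

In all three runs the minimizer is expected on the edge \(N_{2c}=N_2\), in line with point 1 of Proposition 4; only the position of \(N_{1c}^\star \) changes with \((\eta _1,\eta _2)\).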

5 Numerical Illustration

We consider two territories of the same population size \(N=N_1=N_2\) with different transmission rates such that one has \({{\mathcal {R}}}_1< 1 < {{\mathcal {R}}}_2\) (values are given in Table 1). Typically, some precautionary measures (such as social distancing) are taken in the first territory so that the disease cannot spread in this territory if it is closed, while the epidemic can spread in the second territory in the absence of communication with territory 1. We aim at studying how the epidemic can die out when commuting occurs between territories, depending on the proportions of residents in each population, denoted

$$\begin{aligned} p_i:= \frac{N_{ir}}{N}= 1-\frac{N_{ic}}{N}, \quad (i=1,2) \end{aligned}$$

(in other words, how to obtain \({{\mathcal {R}}}_{1,2}<1\) by adjusting \(p_1\), \(p_2\)). Note that when \(N_1=N_2\), the threshold \(\mathcal{R}_{1,2}\) depends on the proportions \(p_1\), \(p_2\) independently of N.

Table 1 Characteristic numbers of the epidemic
Table 2 Three sets of commuting parameters
Table 3 Quality of the approximation \(\tilde{{\mathcal {R}}}_{1,2}\)

Conditions of Proposition 6 are satisfied provided that commuting parameters \(\lambda _i\), \(\mu _i\) are large enough. We have considered three sets of these parameters, given in Table 2, that correspond to the three possible situations depicted in Proposition 6.

The approximate expression \(\tilde{{\mathcal {R}}}_{1,2}\) turns out to be a very good approximation of the exact value \({{\mathcal {R}}}_{1,2}\), even in case A for which \(\gamma \) is not so small compared to \(\mu _2\) (see Table 3).

Fig. 1 \({{\mathcal {R}}}_{1,2}\) as a function of \(p_1\) in case A (each curve corresponds to a value of \(p_2 \in [0,1]\))

Fig. 2 \({{\mathcal {R}}}_{1,2}\) as a function of \(p_1\) in case B (each curve corresponds to a value of \(p_2 \in [0,1]\))

Fig. 3 \({{\mathcal {R}}}_{1,2}\) as a function of \(p_1\) in case C (each curve corresponds to a value of \(p_2 \in [0,1]\))

Figures 1, 2 and 3 show families of curves \(p_1 \mapsto {{\mathcal {R}}}_{1,2}\) for different values of \(p_2 \in [0,1]\). One can observe that these curves possess the properties given by Propositions 4 and 5:

  • they are either decreasing, increasing or decreasing down to a minimum and then increasing,

  • they are ordered and the lowest one is obtained for \(p_2=0\) (i.e., \(N_{2c}=N_2\)).

This last feature is intuitive: the more commuters there are from territory 2 (who spend time in territory 1, where disease transmission is lower), the less the epidemic spreads. A way to reduce the value of \({{\mathcal {R}}}_{1,2}\) is thus to encourage commuting toward territory 1 (whatever the commuting rates). However, the role of the resident population in territory 1 is far less intuitive because it depends on the commuting rates.

  1. In case A, commuters from territory 2 return home more rarely than commuters from territory 1 do. The condition of point 1 of Proposition 6 is fulfilled. Then, the threshold \({{\mathcal {R}}}_{1,2}\) can be made small (and below 1) when the proportion of residents in territory 1 is high, i.e., when the inhabitants of territory 1 are encouraged not to commute.

  2. In case B, commuters from both territories return home rapidly. This means that the number of commuters from one territory present in the other one at a given time is low. Then the condition of point 2 of Proposition 6 is fulfilled. Here, it is better to encourage inhabitants of territory 1 to commute to the other territory, where the disease nevertheless spreads more easily, which is counterintuitive at first sight. Indeed, commuters do not spend much time in the other territory and therefore, heuristically, have less time to encounter and transmit the disease.

  3. In case C, commuters from territory 2 return home more rapidly than commuters from territory 1 do, contrary to case A. The conditions of points 1 and 2 of Proposition 6 are not fulfilled here, and we are in an intermediate situation for which point 3 of Proposition 6 occurs. It is theoretically possible to have \({{\mathcal {R}}}_{1,2}<1\) on the condition that the proportion of commuters of territory 1 is well balanced.

Finally, this example shows that changing only the return rates \(\mu _1\), \(\mu _2\) suffices to obtain the three possible scenarios, but other changes could also exhibit them.
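The qualitative behavior of the curves \(p_1 \mapsto {{\mathcal {R}}}_{1,2}\) can be explored with a short script based on the approximation \(\tilde{{\mathcal {R}}}_{1,2}\). The sketch below is not a reproduction of Figures 1 to 3: the rates in `CASES`, the values of \({{\mathcal {R}}}_1\), \({{\mathcal {R}}}_2\) and the helper names are hypothetical, chosen only so that the return rates \(\mu _1\), \(\mu _2\) are the sole quantities that change between the three sets.

```python
import math

def eta(lam, mu):
    """eta_i = lambda_i / (lambda_i + mu_i), as defined in Section 4."""
    return lam / (lam + mu)

def approx_threshold(n1c, n2c, eta1, eta2, r1, r2, n1, n2):
    """Approximate threshold (10), with q12 and q21 taken from (11); assumes r2 > r1."""
    num = eta1 * (1 - eta1) * n1c + eta2 * (1 - eta2) * n2c
    q21 = r1 * num / (n1 - eta1 * n1c + eta2 * n2c)
    q12 = r2 * num / (n2 - eta2 * n2c + eta1 * n1c)
    a = r2 - r1
    b = a + q12 + q21
    disc = max(b * b - 4 * a * q12, 0.0)
    alpha = 2 * q12 / (b + math.sqrt(disc))   # smallest root of P
    return alpha * r1 + (1 - alpha) * r2

def threshold_from_proportions(p1, p2, eta1, eta2, r1, r2, n=1.0):
    """Same threshold in terms of resident proportions p_i = 1 - N_ic / N (N1 = N2 = N)."""
    return approx_threshold((1 - p1) * n, (1 - p2) * n, eta1, eta2, r1, r2, n, n)

# Hypothetical rates (lambda1, mu1, lambda2, mu2); only mu1, mu2 differ between sets.
CASES = {"A-like": (1.0, 1.0, 1.0, 0.25),
         "B-like": (1.0, 9.0, 1.0, 9.0),
         "C-like": (1.0, 1.0, 1.0, 1.5)}
R1, R2 = 0.5, 2.0   # hypothetical values with R1 < 1 < R2

for name, (lam1, mu1, lam2, mu2) in CASES.items():
    e1, e2 = eta(lam1, mu1), eta(lam2, mu2)
    # Curve p1 -> threshold for p2 = 0 (every inhabitant of territory 2 commutes).
    curve = [(k / 20, threshold_from_proportions(k / 20, 0.0, e1, e2, R1, R2))
             for k in range(21)]
    best_p1, best_value = min(curve, key=lambda point: point[1])
    print(name, "best p1 =", best_p1, "threshold =", round(best_value, 4))
```

Because the expressions (11) are unchanged when \(N_1\), \(N_2\), \(N_{1c}\), \(N_{2c}\) are multiplied by a common factor, the value returned by `threshold_from_proportions` does not depend on the argument `n`.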

6 Conclusion

In this work, we have been able to provide an explicit expression of the reproduction number, although the model is in dimension 18. This expression has allowed us to study its minimization with respect to the proportions of permanently resident populations in each patch. We discovered a trichotomy of cases, with some counterintuitive situations. In every case, it is beneficial to have commuters traveling to a safer city where the transmission rate is lower. However, for the safer city, three situations occur:

  • either it is better to avoid commuting to the other city,

  • or, on the contrary, encouraging commuting to the riskier city reduces the reproduction number,

  • and in a third case there exists an optimal intermediate proportion of commuters of the safer city which minimizes the epidemic threshold.

In some sense, the permanently resident populations, which have been ignored in former modeling, can play a hidden role in an epidemic outbreak. This is illustrated by an example for which only the right proportions of commuters (or permanent residents) avoid the outbreak. This suggests that counterintuitive situations may also occur when considering networks with more than two nodes. The present study focuses on the reproduction number and how it can be reduced. The impact of resident proportions on other epidemiological characteristics, such as the peak level or the final size, may be the subject of future work. The extension of the present results to more general networks is also a future perspective.