Control with uncertain data of socially structured compartmental epidemic models

The adoption of containment measures to reduce the amplitude of the epidemic peak is a key aspect in tackling the rapid spread of an epidemic. Classical compartmental models must be modified and studied to correctly describe the effects of forced external actions to reduce the impact of the disease. The importance of social structure, such as the age dependence that proved essential in the recent COVID-19 pandemic, must be considered, and in addition, the available data are often incomplete and heterogeneous, so a high degree of uncertainty must be incorporated into the model from the beginning. In this work we address these aspects, through an optimal control formulation of a socially structured epidemic model in presence of uncertain data. After the introduction of the optimal control problem, we formulate an instantaneous approximation of the control that allows us to derive new feedback controlled compartmental models capable of describing the epidemic peak reduction. The need for long-term interventions shows that alternative actions based on the social structure of the system can be as effective as the more expensive global strategy. The timing and intensity of interventions, however, is particularly relevant in the case of uncertain parameters on the actual number of infected people. Simulations related to data from the first wave of the recent COVID-19 outbreak in Italy are presented and discussed.


Introduction
From the digital tracking systems of the Koreans to UK's initial choice of not wanting to do anything to counter the spread of the virus, passing through the militarized quarantines of the Chinese and those less authoritarian and more involved of the Italians, the reaction of different countries to the COVID-19 outbreak has shown a series of very different approaches that can also be explained considering the different cultural and political attitudes of the countries concerned.
In all cases, after an initial phase, even those governments that were less restrictive in the face of the pandemic's inexorable progress had to take strong containment measures. There's a graph that has become the symbol of the COVID-19 pandemic most of all. It shows in a simple and intuitive way the importance of slowing down the spread of an epidemic as much as possible ("flattening the curve"), so that the healthcare system can take care of all the sick without collapsing. Its success has helped to save many lives, raising awareness of good practices to slow down an epidemic: stay as much as possible at home, reduce social interactions and wash your hands often and well.
These "non-pharmaceutical" intervention measures, however, entail significant social and economic costs and thus policy makers may not be able to maintain them for more than a short period of time. Therefore, a modeling approach based on a limited time horizon that takes into account the social structure of the population is necessary in order to optimize containment strategies. Most current research, however, has focused on control procedures aimed at optimizing the use of vaccinations and medical treatments [5,7,35,15] and only recently the problem has been tackled from the perspective of non-pharmaceutical interventions [37,41]. In addition, data collected by governments are often incomplete and heterogeneous, so a high degree of uncertainty needs to be incorporated into predictive models [10,12,17,31,40,48]. This is the case of the spreading of COVID-19 worldwide, which have been often mistakenly underestimated due to a combination of factors, including deficiencies in surveillance and diagnostic capacity, and the large number of infectious but asymptomatic individuals [31,40,54].
For almost a hundred years, mathematical models have been used to describe the spread of epidemics [33]. The models currently used largely originate from the model proposed by Kermack and McKendrick at the beginning of last century. Even if the model contains strong simplification assumptions, the concepts introduced through this model are essential to provide a first intuition on the dynamics of epidemics, an intuition that remains confirmed in more complex models, albeit with numerous modifications (see for example [28,11]). The model provides for the division of the population into compartments, the susceptible, healthy individuals who may be infected, the infectious, who have already contracted the disease and can transmit it, and the removed, compartment that includes those who are healed and immune.
The hypothesis made by Kermack and McKendrick is that of the homogeneous "mixing"; that is, it is assumed that each individual has the same probability of contacting any other individual in the population. One understands how this hypothesis is unrealistic: we are often in contact with people from our family, our workplace, school class, group of friends and very rarely with those who live in a different place, have different ages and professions. In recent years, therefore, computational models have been developed that try to take into account additional social characteristics of individuals in order to arrive at more accurate predictions by keeping, however, the simplicity of compartmental models [14,26,36,27,30,23].
In this paper starting from a general compartmental model with social structure, typically the age dependence, we consider the external action of a policy maker that aims at reducing the spread of the epidemics by applying non pharmaceutical intervention measures, such as social distancing and quarantine. The mathematical problem is formulated as an optimal control problem characterized by a functional cost whose objective is to minimize the number of infectious people in a given time horizon. Through an instantaneous control strategy we compute an explicit feedback control that allows us to derive new SIR-type compartmental models capable of describing the epidemic peak reduction. Previously, this type of approach has been used successfully in the case of social models of consensus [1,2,3,4,8,13,20].
The feedback controlled models are subsequently extended to take into account the presence of uncertain infection parameters and data. In fact, to have reliable forecasts it is of paramount importance to consider the presence of uncertain quantities as a structural feature of the epidemic dynamics. This aspect is of paramount importance in the case of pandemic COVID-19, in which undetected infectious individuals play a key role in the spread of the disease. In this regard, it is worth noting that our methodology can be easily extended to the case of more complex compartmental models. The decision to limit ourselves to a simple SIR-type compartmentalization was related on the one hand to the increased complexity given by the dependence on the social structure, which proved to be crucial in the case of the COVID-19 pandemic, and on the other hand to the introduction of a systematic uncertainty in the number of infected to avoid a complex sub-compartmentalization of the infectious population and the consequent difficulties due to parameter identification and the inability to follow a data-driven approach [25,22]. From a mathematical point of view, we can rely on the methods of uncertainty quantification (UQ) to obtain efficient and accurate solutions based on stochastic orthogonal polynomials for the differential model with random inputs [53].
Few results are actually available regarding methods of UQ in epidemic systems, we mention in this direction [12,17,10,48]. The main idea is to increase the dimensionality of the problem adding the possible sources of uncertainty from the very beginning of the modeling. Hence, we extrapolate statistics by looking at the so-called quantities of interest, i.e. statistical quantities that can be obtained from the solution and that give some global information with respect to the input parameter like expected solution of the problem or higher order moments. Several techniques can be adopted for the approximation of the quantities of interest, in this paper we adopt stochastic Galerkin methods that allow to reduce the problem to a set of deterministic equations for the numerical evaluation of the solution in presence of uncertainties. Compared to conventional Monte Carlo methods, based on stochastic sampling, these methods guarantee an exponential convergence in the case of smooth uncertainty distributions and allow a much more accurate and efficient estimation of random parameters. We refer the interested reader to recent surveys and monographs on the topic [18,32,44,53].
In particular, we consider the case in which the policy maker applies his control based on several possible estimators on the actual number of infected people. The need for long-term interventions shows that alternative actions based on the social structure of the system can be as effective as the more expensive optimal strategy. The importance of the timing and intensity of interventions is particularly relevant in the case of uncertain parameters on the actual number of infected people.
The rest of the manuscript is organized as follows. In Section 2 we introduce the structured social SIR model and formulate the mathematical approach for containment measures to reduce the spread of the disease. Next, a feedback controlled model used in the subsequent analysis is derived within a short time horizon approximation. In Section 3 we generalize the feedback controlled model to take into account the presence of uncertainties. Section 4 is dedicated to the presentation of some numerical examples including applications to the first wave of the COVID-19 epidemic in Italy. In separate Appendices we provide details on the generalizations of the present approach to more realistic epidemic models for COVID-19 including additional compartmentalizations, on the stochastic Galerkin method employed to efficiently address the uncertainties, and on the social interaction matrices characterizing the contact rates.

Control of epidemic dynamics
The starting model in our discussion is a SIR-type compartmental model with a social structure. The presence of a social structure is in fact essential in deriving appropriate sustainable control techniques from the population for a protracted period, as in the case of the recent COVID-19 epidemic. We will discuss in Section 3 how to modify the model through the introduction of a stochastic parameter that takes into account the dependence on uncertain data, and thus implicitly introduce the role of undetected infectious in the dynamics.

Compartmental models with social structure
The heterogeneity of the social structure, which impacts the diffusion of the infective disease, is characterized by the vector a ∈ Λ ⊆ R da characterizing its social state and whose components summarize, for example, the age of the individual, its number of social connections or its economic status [27,28]. We denote by s(a, t), i(a, t) and r(a, t), the distributions at time t > 0 of susceptible, infectious and recovered individuals, respectively in relation to specific social characteristics. We assume that the rapid spread of the disease and the low mortality rate allows to ignore changes in the social structure, such as the aging process, births and deaths.
Consequently, for a given population of total number N , we have that where f (a) is the total distribution of the social features defined by the vector a. Hence, we recover the total fraction of the population which belong to the susceptible, infected and recovered as follows In a situation where changes in the social features act on a slower scale with respect to the spread of the disease, the socially structured compartmental model follows the dynamics where the function β(a, a * ) ≥ 0 represents the uncertain interaction rate among individuals with different social features and γ(a) ≥ 0 the recovery rate which may depend on the social feature. Often, in socially structured models the interaction rate between people is assumed to be separable, and proportionate to the activity level of the social feature [27,28], as follows with b(a) the average number of people contacted by a person with social feature a per unit time. Alternative approaches are based on preferential mixing [26,14]. Specific examples of age-dependent social interaction matrices are reported in Appendix C.
We introduce the usual normalization scaling and observe that the quantities S(t), I(t) and R(t) satisfy the conventional SIR dynamics where the fraction of recovered is obtained from R(t) = 1 − S(t) − I(t). We refer to [27,28] for analytical results concerning model (2) and (4). In the following we will adopt the simple compartmental model (2) to derive our feedback controlled formulation in presence of uncertainty. The extension to more realistic compartmental models in epidemiology, designed specifically for the COVID-19 pandemic, can be carried out in a similar way [24]. In order to simplify the description, we will consider the one-dimensional case d a = 1 and set the social dependence as the age a of the individual because of its importance in epidemic dynamics. It is clear, however, that similar containment procedures can impact also on other social features, like the wealth of the individuals [19]. We will first formulate the feedback controlled SIR model in the deterministic case and subsequently extend our approach to the presence of uncertain parameters.

Optimal control of structured compartmental model
In order to define the action of a policy maker introducing a control over the system based on social distancing and other containment measures linked to the social structure we consider an optimal control framework. The choice of an appropriate functional is problem dependent [35].
In our setting, we account the minimization of the total number of the infected population I(t) through the an age dependent control action depending both on time and pairwise interactions among individuals with different ages.
Thus, we introduce the optimal control problem min u∈U J(u) := T 0 ψ(I(t))dt + 1 2 T 0 Λ×Λ ν(a, a * , t)|u(a, a * , t)| 2 dada * dt, (5) subject to with initial condition i(a, 0) = i 0 (a), s(a, 0) = s 0 (a) and r(a, 0) = r 0 (a). The number of infected individual is measured by a monotone increasing function ψ(·) such that ψ : [0, 1] → R + . This function models the policy maker's perception of the impact of the epidemic by the number of people currently infected and in the sequel will be referred to as perception function. For example ψ(I) = I q /q, for q > 1 implies an underestimation of the actual number of infected corresponding to q = 1. The control aims to minimize this measure of the total infected population by reducing the rate of interaction between individuals. We consider a quadratic cost for its actuation.
Such control is restricted to the space of admissible controls which ensure the admissibility of the solution for (6). The above restriction on admissible controls can be relaxed if we consider controls that violate the previous condition locally but preserve the inequality in integral form after integration against i(a * , t).
The solution to problem (5)- (6) is computed through the optimality conditions obtained from the Euler-Lagrangian as follows where p s (a, t), p i (a, t), p r (a, t) are the associated Lagrangian multipliers. By computing the variations with respect to (s, i, r) we retrieve the adjoint system (p s (a * , t) − p i (a * , t))(β(a * , a) − u(a * , a, t))s(a * , t)da * + γ(a)p i with terminal conditions p s (a, T ) = 0, p i (a, T ) = 0 and p r (a, T ) = 0. Note that the contribution of p r (a, t) vanishes since the control does not act directly on population R, and the removed population is not considered in the minimization of the functional. The optimality condition reads ν(a, a * , t)u(a, a * , t) = (p s − p i ) s(a, t)i(a * , t).
The optimality conditions (7)-(8) are first order necessary conditions for the optimal control u(a, a * , t). In order to be admissible then the control reads u(a, a * , t) = max 0, min where φ M,β (a, a * ) = min{β(a, a * ), M }. The approach just described, however, is generally quite complicated when there are uncertainties as it involves solving simultaneously the forward problem (5)-(6) and the backward problem (7)- (8). Moreover, the assumption that the policy maker follows an optimal strategy over a long time horizon seems rather unrealistic in the case of a rapidly spreading disease such as the COVID-19 epidemic. Let us emphasize that extending the above optimal control formulation to more complex compartmental models designed specifically for COVID-19, like SEPIAR or SIDHARTE [24,25], can be done by generalizing the control functional (5) to include, for example, the hospitalized compartment, or other specific indicators that can be measured from the data. For all of these models, the feedback control strategy described in the next section does not change substantially. We refer the reader to Appendix A for more details.

Feedback controlled compartmental models
In this section we consider short time horizon strategies which permits to derive suitable feedback controlled models. These strategies are suboptimal with respect the original problem (5)-(6) but they have proved to be very successful in several social modeling problems [1,2,3,4,20]. To this aim, we consider a short time horizon of length h > 0 and formulate a time discretize optimal control problem through the functional J h (u) restricted to the interval [t, t + h], as follows subject to s(a, t + h) = s(a, t) − hs(a, t) Λ (β(a, a * ) − u(a, a * , t)) i(a * , t)da * (10) i(a, t + h) = i(a, t) + hs(a, t) Λ (β(a, a * ) − u(a, a * , t)) i(a * , t)da * − hγ(a)i(a, t).
By recalling that the macroscopic information on the infected is we can derive the minimizer of J h computing D u J h (u) ≡ 0. We retrieve the following nonlinear equation ν(a, a * , t)u(a, t) = hs(a, t)i(a * , t)ψ (I(t + h)).
In order to pass to the limit h → 0 we must rescale the penalization term as ν(a, a * , t) = hκ(a, a * , t) so that we can introduce the above instantaneous strategy directly in the discrete system (10)- (11).
The resulting controlled dynamic, corresponding to the feedback controlled continuous system (6), reads as follows In what follows we provide a sufficient conditions for the admissibility of the instantaneous control in terms of the penalization term κ(a, a * , t). Indeed we want to assure that the dynamics preserve the monotonicity of the number of susceptible population s(a, t).
Proposition 1 Let β(a, a * ) ≥ δ > 0, and ψ (·) a monotonically non decreasing function, then for all (a, a * ) ∈ Λ × Λ and t > 0, solutions to (13) are admissible if the penalization κ satisfies the following inequality whereī(a) andĪ are respectively the peak reached by the infected of age a and by the total population of infected.
Proof By imposing the non-negativity of the controlled reproduction rate inside the integral we have This inequality has to be satisfied for every time t ≥ 0. Next we observe that the number of susceptible s(a, t) is decreasing in time, therefore s 0 (a) ≥ s(a, t) for all t. Moreover i(a, t) reaches a peak before decreasing to 0 (note that this peak can also be in t = 0), sayĪ for the macroscopic variable andī(a). Thus, thanks to the monotonicity of ψ (·), we can restrict the previous inequality as follows δκ(a, a * , t) ≥ s 0 (a)ī(a * )ψ (Ī).
In Figure 1 we report the phase diagram of susceptible-infected trajectories for the controlled model with homogeneous mixing with ψ(I) = I q /q. The dynamics are similar to the classical SIR model but with a nonlinear contact rate. In particular, the trajectories are flattened when the value of the control is such that κβ ≈ S(t)I(t)ψ (I(t)). Note, however that this status is not an equilibrium point of the system. To understand this, let us observe that an equilibrium state (S * , I * ) for (15) satisfies the equations An equilibrium point corresponds to the classical state in which we have the extinction of the disease I * = 0 and S * arbitrary and defined by the asymptotic state of the dynamics [28]. Now, let's suppose that I * = 0, S * = 0, we can look for solutions where control is able to perfectly balance the spread of the disease κβ = S * I * ψ (I * ) κβS * = (S * ) 2 I * ψ (I * ) + κγ, consequently κβS * = (S * ) 2 κβ S * + κγ = 0, which is satisfied only for γ = 0 when κ = 0. The stability analysis of this unique equilibrium point can be performed by standard arguments and we omit the details.

Control of epidemic dynamics with uncertainties
Since the beginning of the outbreak of new infectious diseases, the actual number of infected and recovered people is typically underestimated, causing fatal delays in the implementation of public health policies facing the propagation of epidemic fronts. This is the case of the spreading of COVID-19 worldwide, often mistakenly underestimated due to deficiencies in surveillance and diagnostic capacity [31,47]. Health systems are struggling to adopt systematic testing to monitor actual cases. Moreover, another important epidemiological factor affecting data reliability is the large proportion of asymptomatic [31,40,54].
Among the common sources of uncertainties for dynamical systems modeling epidemic outbreaks we may consider the following noisy and incomplete available data; structural uncertainty due to the possible inadequacy of the mathematical model used to describe the phenomena under consideration.
In the following we consider the effects on the dynamics of uncertain data, such as the initial conditions on the number of infected people or the interaction and recovery rates. On the numerical level we consider techniques based on stochastic Galerkin methods, for which spectral convergence on random variables is obtained under appropriate regularity assumptions [53]. For simplicity, the details of the numerical method that allows to reduce the uncertain dynamic system to a set of deterministic equations are reported in Appendix B.

Socially structured models with uncertain inputs
We introduce the random vector z = (z 1 , . . . , z dz ) whose components are assumed to be independent real valued random variables with B R the Borel set. We assume to know the probability density p(z) : R dz → R dz + characterizing the distribution of z. Here, z ∈ R dz is a random vector taking into account various possible sources of uncertainty in the model.
In presence of uncertainties we generalize the initial modeling by introducing the quantities s(z, a, t), i(z, a, t) and r(z, a, t) representing the distributions at time t ≥ 0 of susceptible, infectious and recovered individuals. The total size of the population is a deterministic conserved quantity in time, i.e.
Hence, the quantities denote the uncertain fractions of the population that are susceptible, infectious and recovered respectively. The social structured model with uncertainties reads To illustrate the impact of uncertainties let us consider the simple following example. In the case of homogeneous mixing with uncertain contact rate with deterministic initial values I(z, 0) = I 0 and S(z, 0) = S 0 . The solution for the proportion of infectious during the initial exponential phase is [48] I(z, t) = I 0 e (β+αz)S0t−γt , and its expectation where the function represents the statistical correction factor to the standard deterministic exponential phase of the disease I 0 e βS0t−γt . If z is uniformly distributed in [−1, 1] we can explicitly compute More in general, if z has zero mean then by Jensen's inequality we have W (t) > 1 for t > 0, so that the expected exponential phase is amplified by the uncertainty (see [48]). In a similar way, keeping β constant, but introducing a source of uncertainty in the initial data I(z, 0) = I 0 + µz, µ > 0 and z ∈ R distributed as p(z) the solution in the exponential phase reads and then its expectation wherez is the mean of the variable z. Therefore, the expected initial exponential growth behaves as the one with deterministic initial data I 0 + µz. Of course, if both sources of uncertainty are present the two effects just described sum up in the dynamics.

Remark 1
The presence of a large number of undetected infected is at the basis of the construction of numerous epidemiological models with an increasingly complex compartmental structure in which the original compartment of the infected is subdivided into further compartments with different roles in the propagation of the disease [25,24,22]. To clarify the relationships to other deterministic compartmental models, let us consider, for simplicity, model (17) in absence of a social structure and with a one-dimensional random input z ∈ R distributed as p(z). Furthermore, for a function F (z, t) we will denote its expected value asF (t) = E[F (·, t)]. Now, starting from a discrete probability density function Taking the expectation in (21), we can write withβ k = S k β k /S, k = 1, . . . , n. For example, in the case n=2, by identifying I d = p 1 I 1 and I u = p 2 I 2 with the compartments of detected and undetected infectious individuals, respectively, we can formulate the equivalent partition- which has the same structure of a SIAR compartmental model including the undetected (or the asymptomatic) class. In the following, we will not rely on discrete probability distributions, but on continuous representations that can be associated with the overall probability of having a certain number of infectious individuals (detected or undetected). The additional dependence of the epidemiological parameters from the random variable allows us to take into account changes in the corresponding dynamics of disease transmission and recovery.

The feedback controlled model with random inputs
In presence of uncertainties the optimal control problem (5)- (6) is modified as follows being R[ψ(I(·, t))] a suitable operator taking into account the presence of the uncertainties z. Examples of such operator that are of interest in epidemic modeling are the expectation with respect to uncertainties (25) or relying on data which underestimate the number of infected where z 0 is a given value such that ψ(I(z 0 , t)) ≤ ψ(I(z, t)), ∀ z ∈ R dz and t > 0. In (24) U the space of admissible controls is defined as or in a more relaxed form if we consider the above inequality after integration against i(a * , t).
The minimization problem (24) is subject to the following dynamics with initial condition i(z, a, 0) = i 0 (z, a), s(z, a, 0) = s 0 (z, a) and r(z, a, 0) = r 0 (z, a). The implementation of instantaneous control for dynamics in presence of uncertainties follows from the derivation presented in Section 2.3. We can derive the minimizer of J h computing D u J h (u) ≡ 0 from the restriction of the minimization problem (24) where we assumed ∂R [ψ(I(·, t + h))]/∂u = R [∂ψ(I(·, t + h))/∂u], to obtain the following nonlinear identity The above assumption on R[·] is clearly satisfied by (25) and (26), where in the case of (26) we used the notation By introducing the scaling ν(a, t) = hκ(a, a * , t) we can pass to the limit for h → 0 to get which defines the feedback controlled model in presence of uncertainties.

Examples from the COVID-19 outbreak in Italy
In this section we present several numerical tests on the constrained compartmental model with uncertain data. Details of the numerical method used are given in Appendix B. In an attempt to analyze sufficiently realistic scenarios, in the following examples we will refer to values taken from Italian data on the COVID-19 epidemic [46]. More precisely, in the first test case we illustrate the behavior of the model in a simplified setting in absence of uncertainty and social structure and without trying to reproduce scenarios closely related to current data. In the second test case, following a progressively more realistic approach, we consider the impact of the presence of uncertain data in the controlled model with homogeneous social mixing and calibrated on Italian data. The same setting is then considered in Test 3 taking into account the additional effects given by the social structure of the system, characterized by suitable social interaction functions and an age-dependent recovery rate. The final scenario, explored in Test 4, examines the influence on the spread of infectious disease induced by relaxed confinement measures related to the social structure of the system.

Test 1. Containment in homogeneous social mixing dynamics
To illustrate the effects of controls introduced that mimic containment procedures, let us first consider the case where the social structure is not present. Furthermore, to simplify further the modeling, in this first example we neglect any dependence on uncertain data. We consider as initial small number of infected and recovered i(0) = 3.68 × 10 −6 , r(0) = 8.33 × 10 −8 . These normalized fractions refer specifically to the first reported values in the case of the Italian outbreak of COVID-19, even if in this simple test case we will not try to match the data in a quantitative setting but simply to illustrate the behavior of the feedback controlled model. Based on recent studies [31,54,38], the initial infection rate of COVID-19 R 0 = β/γ has been estimated between 2 and 6.5. Here, to exemplify the possible evolution of the pandemic we consider a value close to the lower bound, taking β = 0.25 and γ = 0.10, namely a recovery rate of 10 days, so that R 0 = 2.5.
In Figures 2 and 3 we report the infected and recovered dynamics based on the activation of the control in two different time frames. In Figure 2 the activation for t ∈ [50, 100], which means that after 100 days we suppose that all containment restrictions are cancelled. In Figure 3 we consider a larger activation time frame t ∈ [50,200].
With the choice of the perception function ψ(I) = I q /q, q ≥ 1, we can observe how the control term is able to flatten the curve even if, as expected, the case q = 2 gives rise to a weaker control action. To make the two controls, q = 1 and q = 2, comparable for the same penalization factor κ we also consider the normalized case, where ψ(I) = C q I q /q, and the constant C q > 0 is a normalization constant such that with I max an estimate of the maximum number of infected in absence of control. In particular, in our test case, from I max ≈ 1 4 , assuming C 1 = 1 we obtain C 2 = 12.
Note that, if the activation time is too short the control is not able to significantly change the total number of infected (and therefore recovered). On the other hand, by enlarging the activation time in combination with a sufficiently small penalty constant, the peak infection is not only reduced, but the total number of infected people is decreased. To achieve this, the control should be kept activated for a sufficiently long time and with the right intensity in a kind of plateau regime where there is a perfect balance between the containment effect and the spread of the disease. On the contrary, if the control is too strong, the majority of the population remains susceptible and consequently the disease will start spreading again forming a second wave after the containment policy is removed. Similar conclusions (that may appear counterintuitive) have been shown also by other authors (see for example [9,39]).
The cost functional depends on the value of q and can be evaluated summing up contributions in (9) with explicit form of the control given by (12).  In Figure 4 the cost of the two interventions is compared. We can see how a higher cost is associated with q = 1 that can be obtained with the control q = 2 only for smaller penalizations. As expected, the normalized case q = 2 essentially realigns the cost of interventions. Then, in Figure 5 we compare the performance of the two controls in [50, t d ] with a deactivation time t d ∈ [100, 300]. We consider κ = 10 −3 for both q = 1 and q = 2 normalized. It can be observed that there is a minimum control horizon for both strategies, in order to avoid the onset of a second infection peak. A sufficiently long control horizon is therefore necessary to reduce the impact of the infection.

Test 2: Impact of uncertain data on the epidemic outbreak
Next we focus on the influence of uncertain quantities on the controlled system with homogeneous mixing focusing on available data for COVID-19 outbreak in Italy, see [46]. The estimation of epidemiological parameters is a very difficult problem that can be addressed with different approaches [10,17,48]. It is worth to mention that, in the case of COVID-19, the number of infected and recovered has been largely underestimated, especially in the early phases of the epidemic, see [31,40]. Here, we restrict ourselves to identifying the deterministic parameters of the model through a suitable data fitting procedure, considering possible deviations due to such underestimates as part of the subsequent uncertainty quantification process.

The data fitting process
In details, we have adopted a two-level approach in estimating the parameters in absence of uncertainties. In the phase preceding the lockdown we estimated the epidemic parameters in an unconstrained regime, where we assumed no social containment procedure was activated. This estimate was then kept int the subsequent lockdown phase where we estimated as a function of time the value of the control penalty parameter. Both these two calibration steps were analyzed under the assumption of homogeneous mixing, therefore model (15) has been used in lockdown phase. First, we estimated the parameters β, γ > 0 in the time interval t ∈ [t 0 , t u ] solving a lest square problem based on the minimization of the relative L 2 norm of the difference between the reported number of infectedÎ(t) and recovered R(t), and the theoretical evolution of the unconstrained model whose solution at time t ≥ 0 is indicated with I(t) and R(t). More precisely, we considered the following minimization problem min β,γ∈R+ where θ ∈ [0, 1] and · L 2 ([t,s]) denotes the relative L 2 norm over a time horizon [t, s]. Problem (30) has been solved with the constraints β ∈ [0, 1] and γ ∈ 1 24 , 1 10 . Indeed, according to several studies the time to viral clearance during the early phase of the epidemic, corresponding to the time from the first positive test to the first negative test, can approximately span in average from 10 to 24 days, see [16,24,34]. At the end of the above optimization process we obtained the values β e ≈ 0.31, γ e ≈ 0.049 computed by averaging the optimization results with θ = 10 −2 and θ = 10 −6 . The choice of a small value for θ is due to the low reliability on the recovered data at this early stage.
Next, we estimate the penalization κ = κ(t) in time by solving (15), in the lockdown time interval t ∈ (t u , t c ] and for a sequence of time steps t i , the corresponding least square problems in [t i − k l , t i + k r ] where k , k r ≥ 1 are integers, and where we fix the values β e , γ e estimated in the first optimization step. In details, we solve the following minimization problem min κ(ti)∈R+ over a window of seven days corresponding to k = 3 and k r = 4 for regularization along one week of available data. Both minimization problems (30)- (31) have been solved testing various numerical methods in combination with adaptive solves for the systems of ODEs. The results have been obtained using Matlab functions fmincon in combination with ode45. The available data start on February 24 2020, when moderate social restrictions were enforced by the Italian government, and since the lockdown started on March 9 2020, thus we considered t u − t 0 = 14 (days).
The corresponding time dependent values for the expected penalization for a perception function ψ(I) = I q /q are reported in Figure 6. After an initial adjustment phase the penalty terms converge towards a constant value that we can assume as fixed in predictive terms for future times in a lockdown scenario. This is consistent with a situation in which society needs a certain period of time to adapt to the lockdown policy.
Remark 2 Finally, we remark that the data fitting procedure is easily generalizable to epidemic models with additional compartments [24,25]. Let us denote with C i (t) the compartments that are related to the reported datâ C j (t), j = 1, . . . , h, as the number of actual cases, hospitalized, deaths, etc. In the first step one minimizes a weighted L 2 norm in the form where w j ∈ [0, 1] are suitable weights such that h j=1 w j = 1 and p is the vector of the model parameters that need to be estimated. Since the problem may admit multiple minima leading to unrealistic solutions one usually perform the above optimization process under constraints on the range of values P of some parameters such as recovery rate, incubation period, etc.
An important difference, compared to a simple SIR compartmentalization, is the presence of compartments that are not data-driven as exposed, pre-symptomatic, asymptomatic, etc. that prevent the realization of the data fitting since their values are unknown. A way to overcome this difficulty is to solve the differential model starting from an unknown time t * < t 0 using as initial data the presence of a single individual in the first compartment promoting the infection (for example the exposed) and zero individuals in all other compartments. The idea is to simulate the early phase of the epidemic, by optimizing also the initial time t * in problem (32).
After this, in the second step one considers the corresponding feedback controlled model (see Appendix A) and solves the minimization problem where κ(t i ) is the vector of penalization terms that need to be estimated. Even in this case, assumptions on the range of values of the penalization terms may be necessary to avoid unrealistic solutions. Finally, we underline that most epidemic models are not data-driven, but one can always assume that the total number of reported cases underestimates the actual number of cases and perform the above data fitting to obtain a lower bound on the evolution of the epidemic. Then including a suitable data uncertainty, as in the present work, allows the recovery of realistic scenarios for the pandemic progression.

Introducing data uncertainty
To account for the impact of uncertainties in the data and parameters we then consider a two-dimensional random variable z = (z 1 , z 2 ) with independent components such that i(z, 0) = i 0 (1 + µz 1 ), r(z, 0) = r 0 (1 + µz 1 ), µ > 0, (34) and where z 1 , z 2 are chosen distributed as symmetric Beta functions in [0, 1], i 0 and r 0 are the initial number of reported cases and recovered taken from [46] on February 24, 2020. The choice of a Beta distribution for p(z) = p 1 (z 1 )p 2 (z 2 ) is coherent with other authors [48,53]. However, different probability distribution functions may be considered if additional information on the nature of the uncertainties are available.
It should be noted that the estimated reproduction number, computed in the first optimization step, corresponds to R e 0 = β e /γ e ≈ 6.3, which is at the upper limit of the values reported in the literature for COVID-19 [31,54,38]. Being aware of the limitations of the data fitting on reported data, to consider a more realistic range of values we assumed a stochastic dependence in β and γ taking into account the faster recovery of asymptomatic individuals [24] and Fig. 7 Test 2. Estimated reproduction number R 0 from the feedback controlled model with uncertain data (35) for a perception function ψ(I) = I q , q = 1 (left) and q = 2 (right) together with the confidence bands. We mark with dash-dotted green lines the days in which the lower 95% band and the expected R 0 fell below one, and with x-markers the estimated reproduction number relative to data fitting. the fact that asymptomatic individuals might be slightly less infectious than symptomatic cases [29]. In the simulations we take α β = 0.03, α γ = 0.04 and z 2 ∼ B(2, 2) in (35). Under this assumptions the reproduction number covers a range of values approximatively in [3.13, 6.27] with an expected value around 4.25.
The reproduction number in the feedback controlled model is estimated from In (36) the timet is the lockdown time, in the case under study March 9th, and χ(·) the indicator function. The feedback control u(t) is defined from (29) in the case of homogeneous mixing and assuming R[S(·, t)I(·, t)ψ (I(·, t))] = S(z 0 , t)I(z 0 , t) q as in (26), where I(z 0 , t) is the total number of infected reported at time t. This leads to in agreement with the fact that the confinement restrictions have been implemented accordingly to the reported data. Note that, otherwise the action of the uncertainty translates into the control and leads to the unrealistic effect that the largest is the number of unreported infected the largest is the action of the control in the system. In Figure 7 we report the expected value of R 0 together with the 95% and 50% confidence levels with respect to the variable z 2 . The estimated reproduction number relative to data fitting is reported with x-marked symbols and represents an upper bound for R 0 (z 2 , t). The results show that the R 0 reproduction number, thanks to the containment actions, has been drastically reduced and its expected value fell below one between March 23rd and March 29th for both q = 1 and q = 2. After March 29th the observed R 0 is stably Fig. 8 Test 2. Evolution of expected current cases (left) and of the expected total cases (right) and their 95% confidence bands with respect to z 1 (shaded color) and z 2 (shaded gray) for the feedback controlled model with perception function ψ(I) = I, and uncertain initial data (34)- (35).
below unity. Note that, in both controls the results are very similar, without any need of renormalization for q = 2 due to the data fitting process.
Next we considered the evolution of the uncertain number of infected. In the following we assumed a strongly underestimated initial number of infected (including asymptomatic), taking µ = 50 so that the reported infected along the time horizon of the simulation represent approximately a 20% portion of the total infected persons computed by the feedback controlled model. This is in accordance with the WHO findings that around 80% of infected are asymptomatic 1 and with the results of preliminary serological campaigns promoted in Italy 2 .
In Figure 8 we represent the evolution of the expected value of the number of infected obtained by the controlled model with perception function ψ(I) = I in presence of uncertain contact and recovery rates (35) and initial uncertain data (34) assuming z 1 ∼ B (40,40) and z 2 ∼ B(2, 2). We represent the expected values of the current cases (left) and of the total cases (right) along with the 95% confidence level with respect to the variables z 1 and z 2 . The shaded color band is relative to the variability in z 1 whereas the shaded gray band is relative to the variability in z 2 . The bars below the graph are the reported values of the number of infected on which the model has been calibrated. The results with q = 2 do not highlight significant differences with respect to the case q = 1 and therefore have been omitted.

Test 3: The effect of social contacts in the population
Subsequently, we analyze the effects of the inclusion of age dependence and social interactions in the above scenario. More precisely we consider the social  interaction rate β = β(a, a * ), recovery rate γ = γ(a) and uncertain initial number of infected. These functions were normalized using the estimated parameters β e and γ e in accordance with where f (a) is the age distribution with Λ = [0, a max ], a max = 100. The age dependent social interaction rate β(a, a * ) is defined as follows, β(a, a * ) = (1 − ξ)β e + ξβ social (a, a * ), where 0 ≤ ξ ≤ 1, thus for ξ = 0 we recover the homogeneuos mixing, whereas for ξ = 1 we have a full social mixing behavior. The social interaction function, β social (a, a * ), accounts for the interactions due to specific activities A = {Family, Education, Profession} and is defined by (55) in Appendix C. However, since after the discovery of the first case (February 21), schools, and many places of aggregation were closed in most regions of Northern Italy, we Fig. 11 Test 3. Expected number of infected and total cases of infected and recovered in time for a perception function ψ(I) = I q , q = 1 (left) and q = 2 (right) together with the confidence bands for the social mixing scenario with age-independent or age-dependent recovery rate.
assume that β E is 0 from February 24 onwards, while β P is reduced by a factor one-half from March 9 onwards. The choice of age dependent recovery rate γ(a) involves a certain degree of arbitrariness, nevertheless it is reasonable to account such heterogeneity as observed in different studies [21,51,50]. In order to account fast recovery rate of young people, and slow recovery of the eldest we chose γ(a) to be constant up to a specific age a 0 and then a decreasing function of the age. We express mathematically the recovery rate as with r = 4.5, a o = 20 and C γ ∈ R + such that (38) holds. We divided the computation time frame into two zones and used different models in each zone, in accordance with the policy adopted by the Italian Government. The first time interval defines the period without any form of containment from 24 February to 9 March, the second the lockdown period from 9 March. In the first zone we adopt the uncontrolled model with homogeneous mixing. Hence, in the second zone we compute the evolution of the feedback controlled age dependent model (27)- (29) with matching (on average) interaction and recovery rates accordingly to (38) and with the estimated value of the control penalization κ(t) as reported in Figure 6 until April 30.
After April 30 the computation advances in time using as penalization term the constant asymptotic valueκ reached by κ(t). The initial values for the age distributions of susceptible and infectious individuals are shown in Figure 9 in agreement with reported data 3 .
In Figure 10 we report the results of the expected number of infected with the related confidence bands in case of homogeneous mixing and different levels of social mixing (ξ = 0.75, ξ = 1) for the constant recovery rate γ e . The homogeneous mixing hypothesis leads to a lower estimate of the maximum number of infected and shows a slower decay over time in the case q = 1, whereas for q = 2 the decay is comparable. In Figure 11 we compare the case of constant and age-dependent recovery rates for the social mixing scenario. The expected number of infected are shown in the top row, bottom row depicts the total number of recovered and infected people. The heterogeneity of the recovery rate makes the epidemic more persistent and causes an increase in the total number of cases with respect to the homogeneous recovery rate. Finally, in Figure 12 we report the expected age distribution of infectious individuals in time for q = 1. It is evident the effect of the age dependent recovery rate in the increase of cases among the oldest part of the population. Note that, this effect is partially compensated by the strength of the social mixing which reduces interaction among the elderly.

Test 4: Reducing the epidemic through relaxed social containment
One of the major problems in the application of very strong containment strategies, such as the lockdown applied in Italy, is the difficulty in maintaining them over a long period, both for the economic impact and for the impact on the population from a social point of view. In order to analyze sustainable control strategies, therefore, it is necessary to resort to models with a social structure and control methods based on specific forms of social distancing that allow the economy to restart and the population to dedicate itself, albeit in a limited way, to its pre-pandemic activities.
In accordance with the interaction function introduced in the Appendix C, we considered the following age dependent relaxation function Ψ (a, a * , t) = p s a, a * ∈ Λ P p w elsewhere (41) where the interval Λ P defines the age group related to a stronger relaxation of the restrictions (in the sequel we assume Λ P = [25,65]), and the parameters 0 ≤ p w < p s ≤ 1 characterize the intensity of the heterogeneity of the relaxation process over the different age classes. Hence we relax the control parameter defined in (29) according to u relax (a, a * , t) = (1 − Ψ (a, a * , t))u(a, a * , t). until March 9 March 9 -May 3 May 4 -June 2 from June 3 ps × 100% -0% 15% 20% pw × 100% -0% 5% 10% Table 1 Reduction of the feedback control (42) over different time periods due to the relaxation of the lockdown processes by the choice of the parameter ps and pw of the age dependent function Ψ defined in (41).
The evolution of the infection is considered within two different relaxation times, at May 4 and at June 3, as actuated by the Italian Government during the first wave of the pandemic. We report in Table 1 the specific choice of the parameter p s and p w associated to different periods. Note that, these values are directly related with an increase of the disease transmission rather than to a relaxation of restrictions. In fact, it is clear that under suitable safety precautions a relaxation of restrictions may not contribute to the progress of the epidemic. In Figure 13 (top row) we report the evolution of the age-controlled model with perception function ψ(I) = I, with mild social mixing and with full social mixing and homogeneous recovery rate. Each plot compare the evolution with relaxation of the containment measure and without, respectively indicated with E[I relax ] and E[I]. It is easily observed how the relaxation process increases the number of infected persons. Although the expected value remains  Table 1, perception function ψ(I) = I, and homogeneous recovery rate, using mild social mixing (left) and with full social mixing (right). In the bottom row the corresponding expected age distribution of infectious individuals is reported.
under control, the wider confidence bands highlight the risk of a resumption of the epidemic. Of course, a further relaxation of the values in Table 1 will lead to a higher risk of restarting the epidemic wave. In Figure 13 (bottom row) we report also the evolution of the age-dependent expected number of infected individuals, both for mild social mixing and full social mixing. We remark that, as expected, the relaxation process focuses the increase of infections in the interval characterized by Λ P .
As can be seen, a gradual strategy can keep the average number of infections under control and have an outcome comparable to the fully controlled model at a potentially lower social cost. The timing and intensity of interventions, however, are crucial to prevent the restart of the epidemic wave.

Conclusions
Quantifying the impact of uncertain data in the context of an epidemic emergency is essential in order to design appropriate containment measures. Such containment measures, implemented by several countries in the course of the COVID-19 epidemic, have proved effective in reducing the R 0 reproduction number to below or very close to one. These large-scale non-pharmaceutical interventions vary from country to country but include social distancing (banning large mass events, closing public places and advising people not to socialize outside their families), closing borders, closing schools, measures to isolate symptomatic individuals and their contacts, and the large-scale lock-down of populations with all but essential prohibited travel.
One of the main problems is the sustainability of these interventions, which until the introduction of a vaccine will have to be maintained in the field for long periods. However, estimating the reproductive numbers of SARS-CoV-2 is a major challenge due to the high proportion of infections not detected by health care systems and differences in test application, resulting in diverse proportions of infections detected over time and between countries. Most countries initially had only the capacity to test a small proportion of suspected cases and tests were reserved for severely ill patients or high risk groups. The available data therefore offer a systematic partial overview of trends.
In this article, starting from a SIR-type compartmental model with social structure, we developed new stochastic mathematical models describing the actions of a government agency to contain infections among the population in presence of an uncertain number of infectious individuals. More precisely, in the present model, the state of the infected is defined by a multi-dimensional time-dependent function characterized by the age, which proved essential in the description of the COVID-19 pandemic, and by a systematic uncertainty which permits to avoid additional sub-compartmentalization of the infectious population.
These assumptions allows to derive a socially structured model that contains the control action in feedback form based on the perception of the policy maker of the spread of the disease. Subsequently, the uncertainty in the model has been approximated by expanding the state variables into orthogonal polynomials in the random space, reducing the problem to a set of deterministic equations for the distribution of the solution through the course of the epidemic. The resulting controlled stochastic dynamical system is then solved using this deterministic formulation, which in the case of sufficiently regular uncertainty distributions allows efficient and accurate estimations of the random parameters.
The numerical simulations, performed using data from the recent COVID-19 outbreak in Italy, show, on the one hand, the model's ability to characterize the presence of the asymptomatic population trough the introduction of an uncertainty in the number of infected and in their epidemic role, and, on the other hand, in presence of control to well describe the effects of non pharmaceutical interventions aimed at flattening the infection curve. In particular, identifying some scenarios in agreement with government actions, containment measures by the population based on the resumption of certain occupational activities characterized by specific age groups and social interaction matrices were studied.
Further studies in this direction will aim to consider more comprehensive epidemic models, including the effects of other clinical compartments of interest, along with generalization of the control term to different objective functions or to the case of multiple control terms for each social activity, in order to design optimal strategies to mitigate the overall epidemic impact.

A Feedback controlled models with additional compartmentalizations
In this Appendix, we will detail how to extend the instantaneous control approach introduced in Section 2.3 in the case of a a socially-structured SEPIAR-type compartmentalization [24] where the control functional is given by with u = (u P , u I , u A ) and subject to In the above compartmentalization e, i P , i I , and i A represent the number of exposed, preasymptomatic, infected with heavy symptoms and asymptomatic/mildly symptomatic individuals, respectively [24]. The parameters β j , j ∈ {P, I, A} are the specific transmission rates of the three infectious classes. Exposed individuals, still not contagious, enter the presymptom stage at rate δ E and only then become infectious. Pre-symptomatic individuals progress at rate δ P to become symptomatic who develop severe symptoms with probability σ or asymptomatic individuals with probability 1−σ. Symptomatic individuals recover from infection at rate γ I whereas asymptomatic individuals leave their compartment after having recovered from infection at rate γ A . In (43) we assume that the policy maker aims at minimizing the total number of infected We highlight that other control strategies may be considered as well leading to different feedback control formulations. For the socially-structured SEPIAR model the feedback controlled formulation is then obtained from the discrete approximation min u∈U J h (u) := ψ(I T (t + h)) + 1 2 j∈{P,I,A} Λ×Λ ν j (a, a * , t)|u j (a, a * , t)| 2 dada * , subject to s(a, t + h) = s(a, t) − hs(a, t) j∈{P,I,A} Λ (β j (a, a * ) − u j (a, a * , t))i j (a * , t) da * e(a, t + h) = e(a, t) + hs(a, t) j∈{P,I,A} Λ (β j (a, a * ) − u j (a, a * , t))i j (a * , t) da * − hδ E (a)e(a, t) where the discrete equation for r(a, t) can be omitted since the control does not play any direct role in its evolution. The total number of infected evolved accordingly to We can derive the minimizer of J h computing ∇uJ h (u) ≡ 0 or equivalently Using (45) we can compute ψ (I T (t + h)) ∂I T (a, t + h) ∂u j = −ν j (a, a * , t)u j (a, a * , t), j ∈ {P, I, A}, which by virtue of (47) leads to the discrete feedback controls In order to pass to the limit h → 0 in (46) and obtain the feedback controlled model, we must rescale the penalization parameters with respect to the short time-horizon h. With the aim of preserving the compartment of exposed, the penalization terms should rescale as ν j (a, a * , t) = h 2 κ j (a, a * , t) to get the feedback controlled model (44) where the control terms are given by u j (a, a * , t) = δ E (a) κ j (a, a * , t) s(a, t)i j (a * , t)ψ (I T (t)), j ∈ {P, I, A}.
It is interesting to observe that the control action in (49) is inversely proportional to the incubation time, namely shorter incubation periods requires a stronger control. On the contrary, if one assumes the same rescaling adopted for the SIR model in (12), namely ν j (a, a * , t) = hκ j (a, a * , t), and select a short time horizon equal to the incubation time h = 1/δ E , the exposed compartment in (46) reduces to e(a, t + h) = hs(a, t) j∈{P,I,A} Λ (β j (a, a * ) − u j (a, a * , t))i j (a * , t) da * which substituted into the equation for i P (a, t + h) in the limit h → 0 leads to a feedback controlled model in the form (44) without the compartment of exposed and where the presymptomatic satisfy j∈{P,I,A} Λ (β j (a, a * ) − u j (a, a * , t))i j (a * , t) da * − δ P (a)i P (a, t).
Finally, we emphasize that the generalization to the case with uncertainty contains no difficulty and follows the same methodology introduced in Section 3. In terms of data-fitting, similar to the SIR model, one considers the feedback controlled model in absence of agedependence and proceeds along the lines indicated in Remark 2. We will omit the details for brevity.

Probability law Expansion polynomials Support
Gaussian Charlier N Table 2 The different polynomial expansions connected to the probability distribution of the random component z k , k = 1, . . . , dz.

B Stochastic Galerkin approximation
In this Appendix we give the details of the stochastic Galerkin (sG) method used to solve the feedback controlled system (27)- (29) with uncertainties. To this aim, we consider a random vector z = (z 1 , . . . , z d ) with independent components and whose distribution is p(z) : R dz → R dz + . The stochastic Galerkin approximation of the differential model (27)- (29) is based on stochastic orthogonal polynomials and provides a spectrally accurate solution under suitable regularity assumptions, see [53]. We consider the linear space P M of polynomials of degree up to M generated by a family of polynomials {Φ h (z)} M |h|=0 that are orthonormal in the space L 2 (Ω) such that being k = (k 1 , . . . , k d ) a multi-index, |k| = k 1 + · · · + k dz with δ kh the Kronecker delta function, and E[·] the expectation with respect to p(z). The construction of the polynomial basis {Φ h (z)} M |h|=0 depends on the distribution of the uncertainties and must be chosen in agreement with the Askey scheme [53]. We summarize in Table 2 several polynomials bases in connection with the law of a random component of z.
By assuming s(z, a, t), i(z, a, t) and r(z, a, t) in L 2 (Ω) we may approximate these terms through a generalized polynomial chaos expansion in the random space as follows where the quantitiesŝ k ,î k ,r k are projections in the polynomial spacê The sG formulation of system (27) with u M (a, a * ) = 1 κ(a, a * ) R s M (z, a, t)i M (z, a * , t)ψ (I M (z, t)) , and where s M , i M , r M are defined by (50). Then, thanks to the orthonormality of the polynomial basis of P M , multiplying by Φm, for all |m| ≤ M , and taking the expectation with respect to p(z) we obtain the following coupled system of M +1 deterministic equations for the evolution of each projection where B km = M |l|=0 R dz Λ β(z, a, a * ) − u M (a, a * ) î l (a * , t)da * Φ k (z)Φm(z)Φ l (z)p(z)dz The above system is then integrated in time directly in the space of projections. We remark that statistical quantities of interest, such as expectation and variance of infected, can be recovered as In all the simulations reported we used M = 10 and a fourth order Runge-Kutta method for the time integration.

C Social mixing functions
This last appendix is devoted to report the details of the social interaction functions that characterize the dynamics of social mixing. These characteristics are in fact crucial for a correct prediction of outcomes, especially in diseases transmitted by close contacts. Several large-scale studies have been designed in the last decade to determine relevant age-based models in social mixing. Without attempting to review the vast literature on this topic, we mention [6,42,45] and the references therein.  Fig. 14 From left to right, contour plot of the social interaction functions β F , β E and β P taking into account the different contact rates of people with ages a and a * in relation to specific social activities. The function β F characterizes the family contacts, β E the education and school contacts, and β P the professional contacts.
The number of contacts per person generally shows considerable variability according to age, occupation, country and even day of the week, in relation to the social habits of the population. Nevertheless, some universal behaviors can be extracted which emerge as a function of specific social activities. Social mixing is highly age-related, which means that people usually tend to interact with other people of a similar age. Young people have a high rate of contact with adults aged around 30-39 and older people over 65, i.e. their parents and grandparents respectively. Contact rates are indeed very high at home and at school. On the other hand, professional mixing is weakly assortative by age and tends to be determined by uniform interactions, approximately between people from 25 and 65 years old.
For these reasons we consider an interaction function determined by three main subfunctions that characterize the family, the school and the professional mixing. Therefore, a stylized function approximating a realistic contact matrix can be written as follows β social (a, a * ) = C β j∈A ω j β j (a, a * ), where the functions β j (a, a * ) take into account the different contact rates of people with ages a and a * in relation to specific social activities of the type A = {F, E, P }, where we identify family contacts with F , education and school contacts with E and professional contacts with P , and the associated weights ω F , ω E and ω P . The particular structure of these social interaction matrices was determined empirically in [6,45]. Here, according to these observations, we propose suitable mathematical functions that can be calibrated to reproduce empirical observations, in what follow we normalize βe = 1 and amax = 1. In details, familiar contacts tend to concentrate on a three-band matrix with a peak around younger ages. This can be reproduced considering the functioñ β F (a, a * ) = λ F,1 (1 + (a − a * ) 2 ) λ F,2 + (a 2 + a 2 * ) exp − 1 σ 2 F a 2 + a 2 * (1 + (a − a * ) 2 ) λ F,2 .

Contact function
being λ E > 0 the average contact age at school, and λ P > 0 the average professional contact age. The coefficient C F , C E , C P > 0 measure the impact of the different contact function in the dynamics. In Figures 14 and 15 we represent the three social interaction functions and the resulting global social interaction function β social (a, a * ) assuming ω F = ω E = ω P = 1.
In Table 3 we report the choice of the parameters used in the simulations.