1 Introduction

In the absence of pharmaceutical interventions, the prevention of infection through a reduction of social contacts presents an effective method to slow or even halt the spread of an epidemic [1, 2]. However, limiting social contacts comes at a significant psychological [3] and economical [4, 5] cost. To reduce those socio-economic adverse effects, it is, therefore, desirable to use targeted social distancing measures, to remove fewer links for achieving the same effect. If the infection patterns for a particular disease are known, for example, measures can be taken to remove the most common routes of infection [6,7,8]. Similarly, methods have been introduced for targeted link removal if the infection status of individual nodes is known [8,9,10] as well as systems where the entire network structure is known in detail [10, 11]. This would be the case for example in more aggregated settings, like the transport network, which has been shown to directly affect disease spreading [12]. Moreover, in [13] the authors have introduced a test-kit-based control strategy to avoid a strict lock down or social distancing and, consequently, maintain economic stability by partial opening of business centers.

Often, however, diseases without pharmaceutical interventions are new and we lack detailed knowledge about their infection patterns, and in the absence of functioning contact tracing many infections may go unnoticed, especially if the symptoms are mild or unspecific. Therefore, social distancing as a behavioral change is the primary intervention for disease prevention [14].

In the extreme case of reducing the number of contacts in the social graph to zero, by necessity the disease dies out and is fully controlled. If there are no links left in the network, the disease can not spread. Intuitively, we expect that the more contacts are removed, the better a disease can be controlled and prevented from becoming an epidemic. This can be made precise given a model of the disease spread. For an approximation of SIS dynamics with recovery rate \(\delta \) and infectiousness \(\beta \) it was shown in [15] that the epidemic threshold, \(\tau \), is determined by the inverse of the largest eigenvalue of the adjacency matrix of the underlying contact network, i.e., \(\tau =\lambda _{\text {max}}^{-1}\). If \(\tau >\beta /\delta \), the disease quickly dies out and remains confined to a small section of the network, if \(\tau <\beta /\delta \); however, the disease spreads as an epidemic over large parts of the network and becomes endemic. At \(\tau ^c=(\lambda ^c_{\text {max}})^{-1}=\beta /\delta \), the system tips from epidemic to non-epidemic regime. Decreasing \(\lambda _{\text {max}}\) thus represents a topological way of controlling an epidemic that is independent of the unknown or unchangeable disease parameters \(\beta \) and \(\delta \).

By removing links from a network, one can generally reduce the value of \(\lambda _{\text {max}}\). However, the size of this reduction depends on the specific links removed (see the example network in [15]). Here, we introduce a way to remove links based on \(\lambda _{\text {max}}\) to sample from ensembles with a given disease-controlling property using Markov chain Monte Carlo (MCMC) methods [16]. We thereby study what characterizes contact reductions that perform well at controlling the disease spread. The largest eigenvalue is a lower bound for SIS [17, 18], and often, but not always indicative of SIR [19, 20]. Thus it is a natural candidate for building well-controlling networks. In the last section we will see in simulation that the resulting networks and methods do indeed work for SIS and SIR models. We will see that well-controlling networks on average have significantly lower \(\lambda _{\text {max}}\) even at relatively low \(\nu \). We find that well-controlling networks have a more strongly peaked degree distribution, as well as more homogeneously sized connected components once the network is no longer connected. We therefore suggest and test two heuristics for achieving a similar effect by homogenizing the node degrees in the network through link removal and find them to lead to a similarly strong control, suggesting that the strongly peaked degree distribution characterizes well-controlling networks. Finally, by running both SIS and SIR models on networks obtained from different removal strategies, we show that the tipping point in the critical ratio of \(\beta /\delta \) occurs for larger ratios in the WCNE and degree-homogenized networks than in generic removals.

The results are robust across all initial network ensembles tested here, namely Barabasi–Albert (BA) networks [21], random geometric (RG) graphs [22] and a real-world network of friendships between high-school students [23].

2 Basic notions and model description

We start from an initial graph ensemble representing the social network relevant for the spreading. The BA- and RG- network ensembles were chosen for their depiction of different types and applications of networks, on which epidemic spreading processes can happen. BA-Networks were introduced as a simplified model for online social networks, which are relevant for the spreading opinions, ideas or computer viruses. RG-networks, on the other hand, capture the spatially embedded structure of most in-person social networks most relevant for the spread of diseases.

We consider a generic reduction in which a certain percentage of links are removed at random. Given a social network \(N^{soc}\) with E edges this defines an ensemble of reduced networks \(q^\rho (N^{\text {soc}})\) that are equidistributed on the subgraphs of \(N^{\text {soc}}\) with \((1 - \rho ) E\) edges, where \(0< \rho < 1\) is the contact reduction rate. We then consider the canonical network ensemble [24] relative to this ensemble of generic reductions that achieves a certain expectation value of the largest Eigenvalue \(\lambda _{\text {max}}\). That is, the ensemble of networks that is least distinguishable from random contact removals at a given expectation value of \(\lambda _{\text {max}}\). The canonical network ensemble is given by the probabilities

$$\begin{aligned} p(N) = \frac{1}{Z} e^{- \nu \lambda _{\text {max}}(N)} q^\rho (N). \end{aligned}$$
(1)

Here Z is a normalization factor, \(\nu \) is the inverse genericity as defined in [24], which interpolates between the generic ensemble (at \(\nu = 0\)) and one peaked at the best controlling networks at \(\nu \rightarrow \infty \) (optimized). We call the family of ensembles with reasonably high values of \(\nu \) the well-controlling network ensembles (WCNE) and samples drawn from such an ensemble well-controlling networks (WCN). As in [24] we obtain well-controlling networks by employing an MCMC Metropolis–Hastings method [25, 26].

In both synthetic cases, the initial networks \(N^{\text {soc}}\) have 100 nodes and an average degree of \(\langle k\rangle \approx 18\), amounting to around 900 links. The precise number of links for the random geometric networks fluctuates slightly around that value due to the network construction algorithm used.

Fig. 1
figure 1

Example networks from network ensembles after removing \(80\%\) of contacts from an initial BA and RG graph with \(N=100\) and \(E\approx 900\). a and c The generic contact reductions at \(\nu =0\) or, in the other word, they show generic networks with random removed links. b and d WCNE at \(\nu =1000\). In all figures nodes with the same color and size have the same degree. To compare WCNE networks, i.e. b and d, with the corresponding generic networks with random removed links in a and c), it is visually clear that WCNE networks have more homogeneous degree distribution as well as more homogeneous component sizes

Starting from a reduced network drawn randomly from \(q^\rho (N^{soc})\) at a given contact reduction rate \(\rho \), we keep the number of links constant through all changes of the network. We then vary the set of removed links using MCMC. The probability of increasing/decreasing \(\lambda _{\text {max}}\) in this step depends on the selected value of \(\nu \).

Figure 1 shows example networks from a generic removal and WCNE ensembles of BA and RG starting ensembles with a removal rate of \(\rho =0.8\). At first glance, we see that the WCNE examples have fewer disconnected nodes and a generally more homogeneous distribution of degrees and component sizes. In the following, we describe the applied MCMC approach and its energy function in detail.

The state of the system at a time step t is given by the network structure and thus the adjacency matrix \(A_t\). The energy function for the process is given by the largest eigenvalue of the adjacency matrix \(A_t\), describing the remaining network (i.e., the one obtained after removing the current candidate set of edges),

$$\begin{aligned} \varepsilon (A)=\lambda _{\text {max}}(A). \end{aligned}$$
(2)

As mentioned in the introduction, it was shown in [15] that the epidemic threshold \(\tau \) is proportional to the inverse of \(\lambda _{\text {max}}\) for an approximation of the SIS model. In the same paper, it was also conjectured that this proportionality also holds in the SIR model. Thus, following that conjecture, we use \(\lambda _{\text {max}}\) as our energy function.

At each Monte Carlo step \(t\rightarrow t+1\), a proposal \(A'\) is generated to swap one edge between the remaining set of edges in the network and the set of currently removed edges, thus keeping the number of removed edges constant. The proposal is then accepted with a probability

$$\begin{aligned} P_{A\rightarrow A'} = \min \left( 1, e^{\nu (\varepsilon (A)-\varepsilon (A'))}\right) \end{aligned}$$
(3)

sampling from the canonical network ensemble at inverse genericity \(\nu \). Thus, a small \(\nu \) results in almost every proposal being accepted and many random changes being made. The final ensemble is then not different from the one obtained by randomly removing the edges. In the other extreme of very large \(\nu \), almost only those proposals are accepted that lower the energy, resulting in an ensemble with a lower average \(\lambda _{\text {max}}\).

Fig. 2
figure 2

MCMC substantially decreases \(\lambda _{\text {max}}\) at higher inverse genericity \(\nu \). a and c and show, respectively, the decrease of \(\lambda _{\text {max}}\) with MCMC for different initial networks for a range of values of the genericity \(\nu \). In all networks the removal rate is \(\rho =0.4\). For high genericity, such as \(\nu \rightarrow 0\), \(\lambda _{\text {max}}\) fluctuates just around its initial value after link removal (yellow line), while it reaches a low steady state for the smaller genericity (black line). b and d show the average of \(\lambda _{\text {max}}\) for a range of removal fractions \(\rho =\{0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9\}\) and genericities. As it is clear in these figures, in the larger \(\nu \), the curve is shifted more downwards, indicating a significant increase in the epidemic threshold\(\lambda _{\text {max}}^{-1}\)

This procedure is repeated until a steady state in \(\lambda _{\text {max}}\) is reached, fluctuating around \(\lambda _{\text {max}}(t\rightarrow \infty )\). We found this to be the case after 11000 steps. It should be noted that for both the BA and RG initial network ensembles we repeat the explained strategy for \(n=100\) different network configurations and, then, average over all final values of the largest eigenvalue to obtain .

3 Results

Examples for the evolution of \(\lambda _{\text {max}}\) are shown in Fig. 2 a and c for a range of \(\nu \) and \(\rho =0.4\).

While the ensemble after random link removal (\(\nu =0\)) has largest eigenvalues fluctuating around \(\lambda _{\text {max}}\approx 13.5\) for both BA and RG networks at \(\rho =0.4\), for the least general ensemble with \(\nu =1000\) this goes down to \(\lambda _{\text {max}}\approx 11\) in case of BA networks and \(\lambda _{\text {max}}\approx 10.5\) in case of RG networks. Values of \(\nu >1000\) were not considered here for several reasons. First, they require much longer (or more) runs to ensure a proper mixing, also the gains between \(\nu =500\) and \(\nu =1000\) were already minimal. Secondly our practical interest was in sampling the WCNE, rather than finding the optimal link removal, because in practical scenarios, the underlying network is neither known, nor always perfectly controllable.

Fig. 3
figure 3

Network measures in the ensembles. For both initial network ensembles (BA and RG), the ensembles after link removal are analyzed. a and b show the degree distributions after removal of 80% of links. d and e show the average number of components after the removal of \(\rho \) links. g and h demonstrate the size of the giant component after removal of \(90\%\) links

In Fig. 2b and d, we see that the improvement from MCMC persists in both network ensembles and over all removal fractions. We see that the impact of removing 90% of links randomly can alternatively be achieved by removing only 80% in a targeted manner, leaving each individual with twice as many contacts on average.

Fig. 4
figure 4

Top: Successively removing the link with the highest degree product results in networks with a very narrow degree distribution. Bottom: Go through nodes in order of increasing node number and remove any links exceeding the cap \(k_{\text {max}}=3\)

3.1 Network analysis

We now analyze the resulting ensembles and thus compare several network measures across random and WCNE removals in Fig. 3. We find several indicators pointing to a homogenization of the networks through the reduction of \(\lambda _{\text {max}}\).

Across all initial network ensembles we observe a more strongly peaked degree distribution for WCNE (blue diamonds), compared to the generic ensemble (red circles), as shown in Fig. 3a and b.

Furthermore the number of connected components for the WCNE remains at 1 until a removal ratio of \(\rho =0.8\) compared to \(\rho =0.6\) for the most generic removals in BA networks (see Fig. 3c, d), blue lines. This indicates homogeneity again, as no single nodes are split off from the giant component. Investigating further the regime of large removal rates by looking at the size of the giant component at \(\rho =0.9\), we find in Fig. 3e, f that the largest component is smaller in the WCNE ensemble than it is in the generic one. This means that it is not single nodes or small sub-graphs that are disconnected but the network splits into several networks of similar size, in line with our ad hoc observation from Fig. 1.

While the MCMC procedure is effective at increasing the epidemic threshold of networks compared to similarly dense networks with randomly removed edges, it is often not practically feasible to use. The procedure requires knowledge of the original social network, which is not typically accessible when suggesting social distancing measures. Moreover, there is not one fixed real network that can be optimized, but rather an ever-changing, unknown interaction structure. We need general characteristics of well-performing removal sets, that can be applied to a time dependent, unknown network. By understanding general characteristics of removals that reduce the spreading we can formulate policies for targeted reductions. Our above observations suggest that we need to homogenize the degree distribution.

We, therefore, now introduce two ad hoc methods for link removal, with this aim, one operating with only local and one with global information.

(i) The first one we call the degree product method. Starting from the original network, we iteratively remove one of the links with the highest degree product at random, which we define as the product of the degrees of the two end nodes, until the fraction \(\rho \) of removed links is reached. An illustrative example is shown in Fig. 5 (top).

Fig. 5
figure 5

The degree-based strategies reduce \(\lambda _{\text {max}}\) almost as well as MCMC. The reduction is greater for smaller removal ratios but persists even if 90 % of edges are removed

(ii) The second one we call the degree cap. In order of ascending node number, each node with degree \(>k_{\text {max}}\) has all but \(k_{\text {max}}\) of its links removed at random. This is a random order in case of RG networks. In the BA ensemble, low node numbers tend to be correlated with high degrees, thus high-degree nodes are reduced first. However this discrepancy results in similar effects on the network structure as shown in Fig. 3. An illustrative example of the algorithm is shown in Fig. 5 (bottom).

Fig. 6
figure 6

Decreased \(\lambda _{\text {max}}\) shifts the epidemic threshold to more infectious values of the disease spreading models’ parameters in both SIR (a) and SIS (b) simulations. This shift persists across removal rates \(\rho \) and tends to be larger for larger \(\rho \). It also persists for the degree-homogenizing methods. c Shows the smallest value of \(\beta /\delta \) for each removal rate \(\rho \), for which the cumulative number of infected individuals exceeds 1% using the SIS model. It clearly indicates the shift and verifies that it exists for both ad hoc strategies as well as for the WCNE

We have included networks generated with both degree-homogenizing methods in the analyses of Fig. 3 and show their reduction of the largest eigenvalues in Fig. 4. Subfigure c) also shows a similar level of reduction effect for a real world high school contact network. Both methods result in networks with a more peaked degree distribution than for random removals. The degree-product method (black stars) reaches the most peaked one and correlates very well with the network ensemble resulting from MCMC simulations in most network measures. The degree cap (yellow crosses) results in a less pronounced peak. Both methods mimic the delay of the onset of network fracturing, again with the degree product method having a more pronounced effect. Interestingly for the size of the giant component shown in Fig. 3e, f, the results differ for the different initial ensembles. While the degree product method consistently results in smaller giant components at a similar range of \(\lambda _{\text {max}}\) than the WCNE, the degree cap method results in component sizes comparable to the generic ensemble for BA and RG networks. In the high-school example, it produces the same component size as the degree product method, which is lower than even those of WCNE.

3.2 Simulation and real-world networks

To validate our removal strategies’ disease-slowing property, we run both SIS and SIR simulations on a real social network of high-school students. As a measure for the disease becoming endemic, we compute the cumulative number of infected nodes. This is plotted versus different disease parameter ratios \(\beta /\delta \) in Fig. 6a and b for SIR and SIS, respectively. The size of the network is \(N=1062\) and the figure averages over 640 realizations of SIR and SIS simulations with 10 nodes initially infected at random. They were simulated for 1000 steps for SIS and until equilibrium was reached for SIR. We consistently find that tipping into the epidemic state occurs at lower values of \(\beta /\delta \) in the generic networks than in the WCNE across all removal ratios for both SIR and SIS. However, the figure also indicates that there may be a kind of trade-off here. While the tipping point is shifted, the transition also becomes steeper, leading to larger numbers of total infected people at very large \(\beta /\delta \).

Finally, Fig. 6c shows the smallest value of \(\beta /\delta \) for each removal rate \(\rho \), for which the cumulative number of infected individuals exceeds \(1\%\) using the SIS model. It clearly indicates the shift and verifies that it exists for both ad hoc strategies as well as for the WCNE, supporting our conjecture that the sharper peak in the degree distribution may be the cause of the reduction in spreading.

4 Discussion and conclusion

We have studied the effects of targeted link removal on the epidemic threshold in a network by comparing random removal, removal based on the largest eigenvalue as an energy function and two types of degree-homogenizing removal. We have found that at sufficiently low genericities (i.e., high \(\nu \)), all three types of targeted removal have a significantly higher epidemic threshold (i.e., lower \(\lambda _{\text {max}}\)) than the random removal. This also results in a shift of the tipping point to the epidemic state in SIS and SIR simulations, such that more infectious diseases can be controlled with fewer link removals necessary.

We have found this to coincide with a more sharply peaked degree distribution as well as fewer and more homogeneously sized connected components, which we have interpreted as an overall homogenization. Consequently, we have proposed the two degree-homogenizing methods, which are also effective at decreasing \(\lambda _{\text {max}}\), as well as shift the onset of epidemic spreading. It is worth mentioning that we have also tested a removal strategy based on edge-betweenness. However, it was found to be less effective at increasing the epidemic threshold than random removal and thus omitted in the results.

While running a full MCMC may be infeasible in practice, where the social network is fluctuating and unknown, the two ad hoc methods provide a simple topological way of targeting links. This shows that a cap in the number of permitted contacts per person, as already practised in many countries as part of social distancing measures against COVID-19, is actually close to optimal in terms of topological link targeting in static networks, at least in rather simple model simulations such as ours.

Realistically, however, measures may also include non-binary changes (such as wearing masks, shortening contacts or keeping a distance) as well as temporal changes in infectiousness and fluctuating interaction patterns. Although we have chosen the epidemic threshold in this work as our measure to reduce outbreak sizes, it would be a very interesting follow-up of this work to compare our results to ensembles resulting from other network measures, such as giant connected component, commonly associated with reduced outbreak sizes. While we have here exemplified the method using simple static SIR and SIS models, a more complex energy function would in principle make such an analysis possible also in the case of inhomogeneous (weighted) or temporally fluctuating transmission probabilities \(\beta \), as well as node, rather than edge removal (vaccinations). Changes of connection strengths or edge weights could be used instead of complete removals of edges as a model of transmission reductions through measures, such as mask wearing, meeting outdoors, improved ventilation and keeping a distance.