Approximate Identification of the Optimal Epidemic Source in Complex Networks

Part of the Springer Proceedings in Complexity book series (SPCOM)


We consider the problem of identifying the source of a network epidemic from a complete snapshot of the infected nodes. We take a fully statistical approach and derive novel recursions to compute the Bayes optimal solution, under a heterogeneous susceptible-infected (SI) epidemic model. Our analysis is time and rate independent, and holds for general network topologies. We then provide two highly scalable algorithms for solving these recursions, a mean-field approximation and a greedy approach, and evaluate their performance on real and synthetic networks. Previous work on the problem has mostly focused on tree-like network topologies. Real networks are far from tree-like and an emphasis will be given to networks with high transitivity, such as social networks and those with communities. We show that on such networks, our approaches significantly outperform popular geometric and spectral centrality measures, most of which perform no better than random guessing.


  • Network epidemics
  • Source recovery
  • Approximate inference
  • Dynamic programming



References

  1. Cliff, A., Haggett, P.: Time, travel and infection. Br. Med. Bull. 69(1), 87–99 (2004)
  2. Cohen, M.L.: Changing patterns of infectious disease. Nature 406(6797), 762 (2000)
  3. Colizza, V., Barrat, A., Barthélemy, M., Vespignani, A.: The role of the airline transportation network in the prediction and predictability of global epidemics. Proc. Natl. Acad. Sci. 103(7), 2015–2020 (2006)
  4. Slutsker, L., Altekruse, S.F., Swerdlow, D.L.: Foodborne diseases: emerging pathogens and trends. Infect. Dis. Clin. 12(1), 199–216 (1998)
  5. Elliott, M., Golub, B., Jackson, M.O.: Financial networks and contagion. Am. Econ. Rev. 104(10), 3115–3153 (2014)
  6. Acemoglu, D., Ozdaglar, A., Tahbaz-Salehi, A.: Systemic risk and stability in financial networks. Am. Econ. Rev. 105(2), 564–608 (2015)
  7. Kondakci, S.: Epidemic state analysis of computers under malware attacks. Simul. Model. Pract. Theory 16(5), 571–584 (2008)
  8. Fleizach, C., Liljenstam, M., Johansson, P., Voelker, G.M., Mehes, A.: Can you infect me now? Malware propagation in mobile phone networks. In: Proceedings of the 2007 ACM Workshop on Recurring Malcode, pp. 61–68. ACM, New York (2007)
  9. Shao, C., Ciampaglia, G.L., Varol, O., Flammini, A., Menczer, F.: The spread of fake news by social bots (2017), pp. 96–104. Preprint. arXiv:1707.07592
  10. Shao, C., Ciampaglia, G.L., Varol, O., Yang, K.C., Flammini, A., Menczer, F.: The spread of low-credibility content by social bots. Nat. Commun. 9(1), 4787 (2018)
  11. Friggeri, A., Adamic, L., Eckles, D., Cheng, J.: Rumor cascades. In: Eighth International AAAI Conference on Weblogs and Social Media (2014)
  12. Shin, J., Jian, L., Driscoll, K., Bar, F.: Political rumoring on Twitter during the 2012 US presidential election: rumor diffusion and correction. New Media Soc. 19(8), 1214–1235 (2017)
  13. Jin, Z., Cao, J., Guo, H., Zhang, Y., Wang, Y., Luo, J.: Detection and analysis of 2016 US presidential election related rumors on Twitter. In: International Conference on Social Computing, Behavioral-Cultural Modeling and Prediction and Behavior Representation in Modeling and Simulation, pp. 14–24. Springer, Berlin (2017)
  14. Allcott, H., Gentzkow, M.: Social media and fake news in the 2016 election. J. Econ. Perspect. 31(2), 211–236 (2017)
  15. World Health Organization: Foodborne Disease Outbreaks: Guidelines for Investigation and Control, pp. 41–43. World Health Organization, Geneva (2008)
  16. Manitz, J., Kneib, T., Schlather, M., Helbing, D., Brockmann, D.: Origin detection during food-borne disease outbreaks – a case study of the 2011 EHEC/HUS outbreak in Germany. PLoS Curr. 6 (2014)
  17. Horn, A.L., Friedrich, H.: Locating the source of large-scale outbreaks of foodborne disease. J. R. Soc. Interface 16(151), 20180624 (2019)
  18. Shen, Z., Cao, S., Wang, W.X., Di, Z., Stanley, H.E.: Locating the source of diffusion in complex networks by time-reversal backward spreading. Phys. Rev. E 93(3), 032301 (2016)
  19. Pei, X., Jin, Z., Zhang, W., Wang, Y.: Detection of infection sources for avian influenza A (H7N9) in live poultry transport network during the fifth wave in China. IEEE Access 7, 155759–155778 (2019)
  20. Pei, S., Muchnik, L., Andrade Jr., J.S., Zheng, Z., Makse, H.A.: Searching for superspreaders of information in real-world social media. Sci. Rep. 4, 5547 (2014)
  21. Kitsak, M., Gallos, L.K., Havlin, S., Liljeros, F., Muchnik, L., Stanley, H.E., Makse, H.A.: Identification of influential spreaders in complex networks. Nat. Phys. 6(11), 888 (2010)
  22. Bojja Venkatakrishnan, S., Fanti, G., Viswanath, P.: Dandelion: redesigning the Bitcoin network for anonymity. Proc. ACM Meas. Anal. Comput. Syst. 1(1), 22 (2017)
  23. Shah, D., Zaman, T.: Rumors in a network: who’s the culprit? IEEE Trans. Inf. Theory 57(8), 5163–5181 (2011)
  24. Fioriti, V., Chinnici, M.: Predicting the sources of an outbreak with a spectral technique (2012). Preprint. arXiv:1211.2333
  25. Lokhov, A.Y., Mézard, M., Ohta, H., Zdeborová, L.: Inferring the origin of an epidemic with a dynamic message-passing algorithm. Phys. Rev. E 90(1), 012801 (2014)
  26. Zhu, K., Ying, L.: Information source detection in the SIR model: a sample-path-based approach. IEEE/ACM Trans. Networking 24(1), 408–421 (2016)
  27. Luo, W., Tay, W.P.: Identifying multiple infection sources in a network. In: 2012 Conference Record of the Forty Sixth Asilomar Conference on Signals, Systems and Computers (ASILOMAR), pp. 1483–1489. IEEE, Piscataway (2012)
  28. Nguyen, H.T., Ghosh, P., Mayo, M.L., Dinh, T.N.: Multiple infection sources identification with provable guarantees. In: Proceedings of the 25th ACM International on Conference on Information and Knowledge Management, pp. 1663–1672. ACM, New York (2016)
  29. Prakash, B.A., Vreeken, J., Faloutsos, C.: Spotting culprits in epidemics: how many and which ones? In: 2012 IEEE 12th International Conference on Data Mining (ICDM), pp. 11–20. IEEE, Piscataway (2012)
  30. Jiang, J., Wen, S., Yu, S., Xiang, Y., Zhou, W.: Identifying propagation sources in networks: state-of-the-art and comparative studies. IEEE Commun. Surv. Tutorials 19(1), 465–481 (2017)
  31. Lappas, T., Terzi, E., Gunopulos, D., Mannila, H.: Finding effectors in social networks. In: Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1059–1068. ACM, New York (2010)
  32. Antulov-Fantulin, N., Lančić, A., Šmuc, T., Štefančić, H., Šikić, M.: Identification of patient zero in static and temporal networks: robustness and limitations. Phys. Rev. Lett. 114(24), 248701 (2015)
  33. Paluch, R., Lu, X., Suchecki, K., Szymański, B.K., Hołyst, J.A.: Fast and accurate detection of spread source in large complex networks. Sci. Rep. 8(1), 2508 (2018)
  34. Kiss, I.Z., Miller, J.C., Simon, P.L., et al.: Mathematics of Epidemics on Networks. Springer, Cham (2017)
  35. Khim, J., Loh, P.L.: Confidence sets for the source of a diffusion in regular trees. IEEE Trans. Netw. Sci. Eng. 4(1), 27–40 (2017)
  36. Chang, B., Zhu, F., Chen, E., Liu, Q.: Information source detection via maximum a posteriori estimation. In: 2015 IEEE International Conference on Data Mining (ICDM), pp. 21–30. IEEE, Piscataway (2015)
  37. University of Oregon Route Views Project: Online data and reports.
  38. Leskovec, J., Kleinberg, J., Faloutsos, C.: Graphs over time: densification laws, shrinking diameters and possible explanations. In: Proceedings of the Eleventh ACM SIGKDD International Conference on Knowledge Discovery in Data Mining, pp. 177–187. ACM, New York (2005)
  39. Watts, D.J., Strogatz, S.H.: Collective dynamics of ‘small-world’ networks. Nature 393(6684), 440 (1998)
  40. Traud, A.L., Kelsic, E.D., Mucha, P.J., Porter, M.A.: Comparing community structure to characteristics in online collegiate social networks. SIAM Rev. 53(3), 526–543 (2011)
  41. Traud, A.L., Mucha, P.J., Porter, M.A.: Social structure of Facebook networks. Physica A 391(16), 4165–4180 (2012)
  42. Leskovec, J., Huttenlocher, D., Kleinberg, J.: Signed networks in social media. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 1361–1370. ACM, New York (2010)
  43. Karrer, B., Newman, M.E.: Stochastic blockmodels and community structure in networks. Phys. Rev. E 83(1), 016107 (2011)
  44. Luo, W., Tay, W.P., Leng, M.: How to identify an infection source with limited observations. IEEE J. Sel. Top. Sign. Proces. 8(4), 586–597 (2014)
  45. Ross, S.M.: Introduction to Probability Models. Academic, Cambridge (2014)


We would like to thank Mason A. Porter for providing the Facebook-100 dataset.

Corresponding author

Correspondence to S. Jalil Kazemitabar.

Appendix 1: Multi-Source Extension

The inference problem discussed in Sect. 2.2 extends immediately to multi-source settings. Consider the case where more than one independent source, forming a set denoted by I, initiates the infection dynamics. Due to the Markovian nature of the dynamics, the infection path that leads to some set I does not influence the value of \(\rho_{I \to O}\). Hence, Proposition 1 also describes the likelihood of the transition from the source set I to a snapshot O.

If we know that there are s original sources, i.e. |I| = s, with a uniform prior on the patient zeros, the Bayesian solution is characterized by the optimization

$$\displaystyle \begin{aligned} I^*_{\text{MAP}} = \operatorname*{\mathrm{argmax}}_{I\subset O,\, |I|=s}\, \rho_{I \to O} \end{aligned} $$

To compute this MAP estimate, we can still use the DP solution in Proposition 1, but we do not need to compute \(\rho_{I \to O}\) for |I| < s. Thus, the multi-source problem is in a sense “easier”, especially when s ≈ |O|, since one can terminate the recursion earlier (i.e., the case s = 1 is the hardest).
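As a rough illustration of the early termination, the sketch below fills a table of \(\rho_{I \to O}\) for sets of decreasing size and stops at |I| = s. It assumes the forward recursion and the transition ratio vol(I, j)/vol(I, I^c) stated in Appendix 2; the function name `map_source_set`, the adjacency-list representation and the example graph are hypothetical, and the enumeration is exponential in |O|, so it is meant only for tiny snapshots.

```python
from fractions import Fraction
from itertools import combinations


def map_source_set(adj, snapshot, s):
    """Brute-force MAP source set of size s (illustrative; exponential in |O|)."""
    n = len(adj)
    O = frozenset(snapshot)
    rho = {O: Fraction(1)}  # rho[I] stores rho_{I -> O}; base case I = O
    # Fill the table for decreasing |I| and stop at |I| = s:
    # smaller sets are never needed for the size-s MAP estimate.
    for size in range(len(O) - 1, s - 1, -1):
        for combo in combinations(sorted(O), size):
            I = frozenset(combo)
            # vol(I, I^c): total edge weight leaving I
            cut = sum(adj[i][k] for i in I for k in range(n) if k not in I)
            total = Fraction(0)
            if cut:
                for j in O - I:
                    w = sum(adj[r][j] for r in I)  # vol(I, j)
                    total += Fraction(w, cut) * rho[I | {j}]
            rho[I] = total
    candidates = [frozenset(c) for c in combinations(sorted(O), s)]
    best = max(candidates, key=rho.__getitem__)
    return best, rho[best]


# Hypothetical example: undirected path 0 - 1 - 2 - 3, snapshot O = {0, 1, 2}.
path = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
best_pair, prob_pair = map_source_set(path, {0, 1, 2}, s=2)
```

With s = |O| the loop body never runs and the only candidate is O itself, which matches the remark that larger s lets the recursion stop sooner.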

Appendix 2: Proofs

Proof of Proposition 1

Let us first recall a known fact about the exponential distribution:

Lemma 1

Let \(T_i \sim \operatorname {\mathrm {Exp}}(\beta _i)\) be a collection of independent exponential variables. Then,

$$\displaystyle \begin{aligned} \mathbb{P}\Big(T_i < \min_{j \neq i} T_j\Big) = \frac{\beta_i}{ \sum_{j} \beta_j}. \end{aligned} $$
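A quick Monte Carlo check can make the lemma concrete. The sketch below is not part of the proof; the rates, the trial count and the function name `race_win_frequency` are arbitrary choices made for illustration.

```python
import random


def race_win_frequency(rates, winner, trials=100_000, seed=0):
    """Estimate P(T_winner < min_{j != winner} T_j) for independent T_j ~ Exp(rates[j])."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        # random.expovariate(lambd) samples an exponential variable with rate lambd.
        times = [rng.expovariate(beta) for beta in rates]
        first = min(range(len(rates)), key=times.__getitem__)
        if first == winner:
            wins += 1
    return wins / trials


rates = [1.0, 2.0, 3.0]            # hypothetical rates beta_1, beta_2, beta_3
estimate = race_win_frequency(rates, winner=1)
exact = rates[1] / sum(rates)      # value predicted by Lemma 1
```

The empirical frequency should agree with the predicted ratio up to sampling noise.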

For a proof of Lemma 1, see [45]. The forward programming (2) is an application of the law of total probability in the following sense: the event that the nodes in O ∖ I are infected before any other node in \(I^c\) splits into disjoint sub-events according to which node j ∈ O ∖ I is infected first, and we have

$$\displaystyle \begin{aligned} \rho_{I \to O} = \sum_{j \in O \setminus I} \rho_{I \to I \cup j} \cdot \rho_{I \cup j \to O} \end{aligned} $$

where we have also used the Markov property of SI dynamics to split the probabilities on the RHS into products. The ratio in (2) corresponds to the transition probability from I to I ∪ j, that is, \(\rho_{I \to I \cup j}\). Indeed, given that I is infected, we run exponential clocks \(T_j \sim \operatorname {\mathrm {Exp}}( \beta \operatorname {\mathrm {vol}}(I,j))\) for the nodes \(j \in I^c\), and the first to expire determines the next infected node. By Lemma 1, this happens for any node \(j \in I^c\) with probability \(\propto _j \beta \operatorname {\mathrm {vol}}(I,j)\). Thus,

$$\displaystyle \begin{aligned} \rho_{I \to I \cup j} = \frac{\beta \operatorname{\mathrm{vol}}(I,j) }{\sum_{j^{\prime}}\beta \operatorname{\mathrm{vol}}(I,j^{\prime})} = \frac{\operatorname{\mathrm{vol}}(I, j)}{\operatorname{\mathrm{vol}}(I, I^c)}. \end{aligned} $$

This proves the forward programming. The backward programming, on the other hand, connects \(\rho_{I \to O}\) to \(\rho_{I \to O \setminus j}\) and is proved similarly. Basically, the event of visiting O can be divided into sub-events based on the last node in O that is infected.
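The forward recursion can be written as a short memoized function. This is a minimal sketch under the assumptions used in the proof (base case \(\rho_{O \to O} = 1\), transition ratio vol(I, j)/vol(I, I^c)); the helper name `make_rho` and the example graph are hypothetical, and exact fractions are used only to keep the small example free of rounding.

```python
from fractions import Fraction
from functools import lru_cache


def make_rho(adj, snapshot):
    """Return rho(I): probability that SI dynamics started from I pass through the set O."""
    n = len(adj)
    O = frozenset(snapshot)

    def vol_to_node(I, j):
        # vol(I, j): total edge weight from I into node j
        return sum(adj[r][j] for r in I)

    def vol_cut(I):
        # vol(I, I^c): total edge weight from I to its complement
        return sum(adj[i][k] for i in I for k in range(n) if k not in I)

    @lru_cache(maxsize=None)
    def rho(I):
        if I == O:
            return Fraction(1)
        cut = vol_cut(I)
        if cut == 0:
            return Fraction(0)  # the infection cannot leave I
        total = Fraction(0)
        for j in O - I:
            w = vol_to_node(I, j)
            if w:
                # rho_{I -> I u j} * rho_{I u j -> O}
                total += Fraction(w, cut) * rho(I | {j})
        return total

    return rho


# Hypothetical example: undirected path 0 - 1 - 2 - 3, snapshot O = {0, 1, 2}.
path = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
rho = make_rho(path, {0, 1, 2})
scores = {v: rho(frozenset({v})) for v in (0, 1, 2)}
map_source = max(scores, key=scores.get)
```

On this path the end node 0 is the most likely single source, since any infection started there must cover {0, 1, 2} before reaching node 3.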

Proof of Proposition 2

We prove the following alternative expressions for \(S = (S_{jj^{\prime }})^{|O| \times |O|}\) and \(z = (z_j)^{|O|}\),

$$\displaystyle \begin{aligned} S_{jj^{\prime}} &:= \begin{cases} d^{in}_{O \setminus j^{\prime}}(j) d^{in}_{O \setminus j}(j^{\prime}) + \sum_{i\in O} A_{ij} A_{ij^{\prime}} & j \neq j^{\prime} \\ 2 \big[d^{in}_{O}(j)^2 + \sum_{i\in O} A^2_{ij}\big] & j = j^{\prime} \end{cases} \\ {} z_j &:= \Big[ \operatorname{\mathrm{vol}}(O_{\setminus j}) + 2 \operatorname{\mathrm{vol}}(O{ \setminus\, j}, (O{\setminus \,j})^c)\Big]\, d^{in}_{O}(j) \\ & \qquad + \sum_{i\in O} (d^{out}_{O_{\setminus j}}(i)- d^{in}_{O_{\setminus j}}(i)) A_{ij} + 2 \sum_{i\in O} d^{out}_{(O\setminus \,j)^c}(i)\, A_{ij}. \end{aligned} $$

Here, \(d^{out}_O(i) := \sum _{j \in O} A_{ij}\) is the out-degree of node i in O, \(d^{in}_O(i) := \sum _{j \in O} A_{ji}\) is the in-degree of node i in O, and \( \operatorname {\mathrm {vol}}^{(2)}(i,j) := \sum _{r \in O} A_{ir} A_{rj}\) is the number of paths of length 2 between nodes i and j that pass through O. It is not hard to verify that these expressions are equivalent to the matrix form presented in (2).

Recall that \( \operatorname {\mathrm {vol}}(I, I^c) = \sum _{i,k} A_{ik} 1\{i \in I, k \notin I \}\) and similarly \( \operatorname {\mathrm {vol}}(I, j) = \sum _{r} A_{rj} 1\{r \in I \}\). Here, the indices i, k and r run over all nodes in the network, i.e. i, k, r ∈ [n]. We have

$$\displaystyle \begin{aligned} (Q^T \boldsymbol r)_j &= \sum_{I\subset O} 1\{j\not\in I\}\, \operatorname{\mathrm{vol}}(I, j)\cdot \operatorname{\mathrm{vol}}(I, I^c) \\ &= \sum_{I \subset O \setminus \{j\}}\, \operatorname{\mathrm{vol}}(I, I^c) \cdot\operatorname{\mathrm{vol}}(I, j) \\ &= \sum_{I \subset O \setminus \{j\}}\, \sum_{i,k,r} A_{ik} A_{rj} \, 1\{i \in I, \,k \notin I, \,r \in I\} \\ &= \sum_{i,k,r} A_{ik} A_{rj} \gamma_{ikr} \end{aligned} $$

where the last equality follows by interchanging the order of summations and defining

$$\displaystyle \begin{aligned} \gamma_{ikr} := \sum_{I \subset O \setminus \{j\}}1\{i \in I, \,k \notin I, \,r \in I\} \end{aligned} $$

If i or r does not belong to O ∖{j}, or k ∈{i, r}, then \(\gamma_{ikr} = 0\). Thus, in what follows assume that \(i, r \in O_{\setminus j} := O \setminus \{j\}\) and k∉{i, r}. Then,

$$\displaystyle \begin{aligned} \gamma_{ikr} = \sum_{I \subset O \setminus \{j\}}1\{i \in I, \,k \notin I, \,r \in I\} = \begin{cases} 2^{|O|-4} & i \neq r,\; k \in O_{\setminus\, j}\\ 2^{|O|-3} & i=r, \; k \in O_{\setminus\, j}\\ 2^{|O|-3} & i \neq r,\; k \notin O_{\setminus\, j}\\ 2^{|O|-2} & i=r, \; k \notin O_{\setminus\, j}\\ \end{cases} \end{aligned} $$

To see the second equality, note that we are counting subsets of the set O ∖{j} (of cardinality |O|− 1) that contain or exclude certain elements. For example, when k, i, r are pairwise distinct, and k ∈ O ∖{j}, looking at the binary representation of I, we have two ones in the positions i and r and a zero in position k, and the rest of |O|− 1 − 3 positions are free to be zero or one.
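The four counts can be confirmed by direct enumeration on a small set. The snippet below is only a sanity check; the set O, the node labels and the function name `gamma` are illustrative choices.

```python
from itertools import combinations


def gamma(O, j, i, k, r):
    """Count subsets I of O minus {j} with i in I, r in I and k not in I."""
    base = sorted(set(O) - {j})
    count = 0
    for size in range(len(base) + 1):
        for combo in combinations(base, size):
            I = set(combo)
            if i in I and r in I and k not in I:
                count += 1
    return count


O = [0, 1, 2, 3, 4]   # hypothetical snapshot with |O| = 5
j = 4
m = len(O)

case_distinct_inside = gamma(O, j, i=0, k=2, r=1)   # i != r, k in O minus {j}
case_equal_inside = gamma(O, j, i=0, k=2, r=0)      # i == r, k in O minus {j}
case_distinct_outside = gamma(O, j, i=0, k=7, r=1)  # i != r, k outside O minus {j}
case_equal_outside = gamma(O, j, i=0, k=7, r=0)     # i == r, k outside O minus {j}
```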

In what follows, i and r range over O ∖{j} (otherwise \(\gamma_{ikr} = 0\)). Also, the condition k∉{i, r} can be replaced with k ≠ r, since k ≠ i is implicitly enforced by \(A_{ik} = 0\) if k = i (no self-loops). We have

$$\displaystyle \begin{aligned} (Q^T \boldsymbol r)_j &= \sum_{i,r} \sum_{k \neq r} A_{ik} A_{rj} \big[ 2^{|O|-4}(1+1\{i=r\})1\{k \in O_{\setminus\, j}\} \\ &\quad + 2^{|O|-3}(1+1\{i=r\})1\{k \notin O_{\setminus\, j}\}\big] \\ &= 2^{|O|-4} \sum_{i,r} d^{out}_{O\setminus\{j,r\}}(i) A_{rj} (1+1\{i=r\}) \\ &\quad + 2^{|O|-3}\sum_{i,r} d^{out}_{(O\setminus \,j)^c}(i) A_{rj} (1+1\{i=r\}) \end{aligned} $$

where in the second term, we used the fact that if \(k \notin O_{\setminus j}\) then we automatically have k ≠ r since r ranges over \(O_{\setminus j}\). We have

$$\displaystyle \begin{aligned} \sum_{r} d^{out}_{O\setminus\{j,r\}}(i) A_{rj} &= \sum_{r} (d^{out}_{O_{\setminus j}}(i) - A_{ir}) A_{rj} \\ &= d^{out}_{O_{\setminus j}}(i) d^{in}_{O_{\setminus j}}(j) - \operatorname{\mathrm{vol}}_{O_{\setminus j}}^{(2)}(i,j) \end{aligned} $$

where \( \operatorname {\mathrm {vol}}_{O_{\setminus j}}^{(2)}(i,j) := \sum _{r \in O_{\setminus j}} A_{ir} A_{rj}\) is the number of paths of length two between i and j in \(O_{\setminus j}\). Note that \( \operatorname {\mathrm {vol}}_{O_{\setminus j}}^{(2)}(i,j) = \operatorname {\mathrm {vol}}_{O}^{(2)}(i,j)\) and similarly \(d^{in}_{O_{\setminus j}}(j) = d^{in}_{O}(j)\) since \(A_{jj} = 0\). Thus,

$$\displaystyle \begin{aligned} \sum_{i,r} d^{out}_{O\setminus\{j,r\}}(i)\, A_{rj} \big( 1+1\{i=r\} \big) &= \sum_i \Big[ d^{out}_{O_{\setminus j}}(i) d^{in}_{O}(j) - \operatorname{\mathrm{vol}}_{O}^{(2)}(i,j) + d^{out}_{O_{\setminus j}}(i) A_{ij} \Big] \\ &= \sum_i d^{out}_{O_{\setminus j}}(i) d^{in}_{O}(j) + (d^{out}_{O_{\setminus j}}(i)- d^{in}_{O_{\setminus j}}(i)) A_{ij} \\ &= \operatorname{\mathrm{vol}}(O_{\setminus j}) d^{in}_{O}(j) + \sum_i (d^{out}_{O_{\setminus j}}(i)- d^{in}_{O_{\setminus j}}(i)) A_{ij} \end{aligned} $$

where \( \operatorname {\mathrm {vol}}(O_{\setminus j}) = \operatorname {\mathrm {vol}}(O_{\setminus j},O_{\setminus j})\) and the second equality follows since we have

$$\displaystyle \begin{aligned} \sum_{i\in A} \operatorname{\mathrm{vol}}_{A}^{(2)}(i,j) = \sum_{i \in A} \sum_{r \in A} A_{ir} A_{rj} = \sum_{r \in A} d^{in}_A(r) A_{rj} \end{aligned} $$

which was used with \(A = O_{\setminus j}\). Similarly, we have

$$\displaystyle \begin{aligned} \sum_{i,r} d^{out}_{(O\setminus \,j)^c}(i) A_{rj} (1+1\{i=r\}) &= \sum_{i} d^{out}_{(O\setminus \,j)^c}(i) \big (d^{in}_{{O \setminus j}}(j)+A_{ij} \big) \\ &=\operatorname{\mathrm{vol}}(O{ \setminus\, j}, (O{\setminus \,j})^c) \,d^{in}_O(j) \\ &\quad + \sum_{i} d^{out}_{(O\setminus \,j)^c}(i)\, A_{ij} \end{aligned} $$

It follows that

$$\displaystyle \begin{aligned} (Q^T \boldsymbol r)_j =2^{|O|-4} &\Big[ \operatorname{\mathrm{vol}}(O_{\setminus j}) d^{in}_{O}(j) + \sum_i (d^{out}_{O_{\setminus j}}(i)- d^{in}_{O_{\setminus j}}(i)) A_{ij} \\ &+ 2 \operatorname{\mathrm{vol}}(O{ \setminus\, j}, (O{\setminus \,j})^c) \,d^{in}_O(j) + 2 \sum_{i} d^{out}_{(O\setminus \,j)^c}(i)\, A_{ij}\Big]. \end{aligned} $$
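The closed form can be compared against the defining sum over subsets on a small directed graph. The sketch below is a numerical sanity check only; the random 0/1 adjacency matrix without self-loops, the snapshot O and the function names `qtr_brute` and `qtr_closed` are assumptions made for illustration.

```python
import random
from fractions import Fraction
from itertools import combinations


def subsets(items):
    """Yield every subset of `items`, including the empty set."""
    items = list(items)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield set(combo)


def qtr_brute(A, O, j):
    """Sum of vol(I, j) * vol(I, I^c) over all subsets I of O minus {j}."""
    n = len(A)
    total = 0
    for I in subsets(set(O) - {j}):
        vol_j = sum(A[r][j] for r in I)
        vol_cut = sum(A[i][k] for i in I for k in range(n) if k not in I)
        total += vol_j * vol_cut
    return total


def qtr_closed(A, O, j):
    """Closed form 2^(|O|-4) * z_j derived above (assumes no self-loops)."""
    n = len(A)
    O = set(O)
    Oj = O - {j}                    # O minus {j}
    Ojc = set(range(n)) - Oj        # complement of O minus {j}

    def d_out(S, i):
        return sum(A[i][k] for k in S)

    def d_in(S, i):
        return sum(A[k][i] for k in S)

    def vol(S, T):
        return sum(A[i][k] for i in S for k in T)

    z = (vol(Oj, Oj) + 2 * vol(Oj, Ojc)) * d_in(O, j)
    z += sum((d_out(Oj, i) - d_in(Oj, i)) * A[i][j] for i in O)
    z += 2 * sum(d_out(Ojc, i) * A[i][j] for i in O)
    return Fraction(2) ** (len(O) - 4) * z


rng = random.Random(1)
n = 7
A = [[0 if i == k else rng.randint(0, 1) for k in range(n)] for i in range(n)]
O = [0, 1, 2, 3, 4]
agree = all(qtr_brute(A, O, j) == qtr_closed(A, O, j) for j in O)
```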

Calculating \(Q^T Q\)

Let us first take \(j \neq j^{\prime}\). Then, similar to the previous argument,

$$\displaystyle \begin{aligned} (Q^T Q)_{jj^{\prime}} &= \sum_{I\subset O\setminus \{j,j^{\prime}\}} \operatorname{\mathrm{vol}}(I, j)\, \operatorname{\mathrm{vol}}(I, j^{\prime}) \\ &=\sum_{I\subset O\setminus \{j,j^{\prime}\}} \sum_{i,r} A_{ij}\, A_{rj^{\prime}} 1\{i \in I,\;r \in I\} \\ &= \sum_{i,r} A_{ij}\, A_{rj^{\prime}} \beta_{ir} \end{aligned} $$

where we have defined

$$\displaystyle \begin{aligned} \beta_{ir} &:= \sum_{I\subset O\setminus \{j,j^{\prime}\}} 1\{i \in I,\;r \in I\} \\ &= 2^{|O|-4} 1\{i\neq r\} + 2^{|O|-3} 1\{i=r\}\\ &= 2^{|O|-4} \big(1+ 1\{i=r\}\big) \end{aligned} $$

assuming \(i, r \in O \setminus \{j, j^{\prime}\}\), otherwise \(\beta_{ir} = 0\). Thus, restricting summations over indices \(i, r \in O \setminus \{j, j^{\prime}\}\),

$$\displaystyle \begin{aligned} (Q^T Q)_{jj^{\prime}} &= 2^{|O|-4}\Big[ \sum_{i,r} A_{ij}\, A_{rj^{\prime}} + \sum_{i} A_{ij} A_{ij^{\prime}}\Big] \\ &= 2^{|O|-4}\Big[ d^{in}_{O \setminus j^{\prime}}(j) d^{in}_{O \setminus j}(j^{\prime}) + \sum_{i} A_{ij} A_{ij^{\prime}}\Big]. \\ \end{aligned} $$

Now consider the case \(j = j^{\prime}\). Then,

$$\displaystyle \begin{aligned} (Q^T Q)_{jj} &= \sum_{I\subset O\setminus \{j\}} \operatorname{\mathrm{vol}}(I, j)^2 \\ &=\sum_{I\subset O\setminus \{j\}} \sum_{i,r} A_{ij}\, A_{rj} 1\{i \in I,\;r \in I\} \\ &= \sum_{i,r} A_{ij}\, A_{rj} \,2^{|O|-3} \big( 1 + 1\{i=r\} \big), \end{aligned} $$

assuming i, r ∈ O ∖ j. It follows that

$$\displaystyle \begin{aligned} (Q^T Q)_{jj} &= 2^{|O|-3} \Big[ \sum_{i,r} A_{ij}\, A_{rj} + \sum_{i} A^2_{ij} \Big] \\ &= 2^{|O|-3} \big[ d^{in}_{O}(j)^2 + \sum_{i} A^2_{ij} \big]. \end{aligned} $$
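Both the off-diagonal and the diagonal formulas can be checked in the same brute-force manner. As before, this is a sketch under stated assumptions: the random directed 0/1 graph without self-loops and the names `qtq_brute` and `qtq_closed` are hypothetical.

```python
import random
from fractions import Fraction
from itertools import combinations


def subsets(items):
    """Yield every subset of `items`, including the empty set."""
    items = list(items)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield set(combo)


def qtq_brute(A, O, j, jp):
    """Sum of vol(I, j) * vol(I, jp) over all subsets I of O minus {j, jp}."""
    total = 0
    for I in subsets(set(O) - {j, jp}):
        total += sum(A[i][j] for i in I) * sum(A[r][jp] for r in I)
    return total


def qtq_closed(A, O, j, jp):
    """Closed forms derived above for j != jp and j == jp (assumes no self-loops)."""
    O = set(O)
    m = len(O)

    def d_in(S, v):
        return sum(A[k][v] for k in S)

    if j != jp:
        inner = d_in(O - {jp}, j) * d_in(O - {j}, jp)
        inner += sum(A[i][j] * A[i][jp] for i in O)
        return Fraction(2) ** (m - 4) * inner
    inner = d_in(O, j) ** 2 + sum(A[i][j] ** 2 for i in O)
    return Fraction(2) ** (m - 3) * inner


rng = random.Random(2)
n = 7
A = [[0 if i == k else rng.randint(0, 1) for k in range(n)] for i in range(n)]
O = [0, 1, 2, 3, 4]
agree = all(qtq_brute(A, O, j, jp) == qtq_closed(A, O, j, jp) for j in O for jp in O)
```

Relative to the common factor \(2^{|O|-4}\), the diagonal entries carry an extra factor of 2, which is the factor appearing in the expression for \(S_{jj}\).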

Copyright information

© 2020 Springer Nature Switzerland AG

Cite this paper

Kazemitabar, S.J., Amini, A.A. (2020). Approximate Identification of the Optimal Epidemic Source in Complex Networks. In: Masuda, N., Goh, KI., Jia, T., Yamanoi, J., Sayama, H. (eds) Proceedings of NetSci-X 2020: Sixth International Winter School and Conference on Network Science. NetSci-X 2020. Springer Proceedings in Complexity. Springer, Cham.