Modeling infection methods of computer malware in the presence of vaccinations using epidemiological models: an analysis of real-world data


Computer malware and biological pathogens often use similar infection mechanisms. For this reason, it has been suggested that malware spread be modeled using epidemiological models developed to characterize the spread of biological pathogens. However, to date, most work examining the similarities between malware and pathogens using such methods has been based on theoretical analysis and simulation. Here we extend the classical susceptible–infected–recovered (SIR) epidemiological model to describe two of the most common infection methods used by malware. We fit the proposed model to malware data collected between April 2017 and April 2018 from a major anti-malware vendor. We show that by fitting the proposed model it is possible to identify the method of transmission used by the malware, its rate of infection, and the number of machines that will be infected unless the malware is blocked by anti-virus software. In a large sample of malware infections, the Spearman correlation between the actual and predicted numbers of infected machines is \(\rho = 0.84\). Examining cases where an anti-malware “signature” was transmitted to susceptible computers by the anti-virus provider, we show that the time to remove the malware is short and independent of the number of infected computers if fewer than approximately 60% of susceptible computers have been infected. If more computers have been infected, the time to removal is approximately 3.2 times longer and depends on the fraction of infected computers. Our results show that applying epidemiological models of infection to malware can provide anti-virus providers with information on malware spread and its potential damage. We further propose that the similarities between computer malware and biological pathogens, the availability of data on the former, and the dearth of data on the latter make malware an extremely useful model for testing interventions that could later be applied to improve medicine.
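As an illustration only, the classical susceptible–infected–recovered model with a simple vaccination term can be sketched as below. This is not the extended model fitted in the article, whose equations are not given in this abstract. The function name `simulate_sir_vaccination`, the parameters `beta` (infection rate), `gamma` (removal rate), `nu` (vaccination rate), the forward-Euler integration, and all numeric values are assumptions chosen for the example.

```python
def simulate_sir_vaccination(beta, gamma, nu, s0, i0, steps, dt=0.1):
    """Forward-Euler simulation of a textbook SIR model with vaccination.

    All compartments are fractions of the machine population.
    Hypothetical illustration; not the model fitted in the article.
      dS/dt = -beta*S*I - nu*S
      dI/dt =  beta*S*I - gamma*I
      dR/dt =  gamma*I + nu*S
    """
    s, i, r = s0, i0, 1.0 - s0 - i0
    history = [(s, i, r)]
    for _ in range(steps):
        new_infections = beta * s * i * dt   # S -> I
        new_removals = gamma * i * dt        # I -> R (malware cleaned)
        new_vaccinations = nu * s * dt       # S -> R (signature received)
        s = s - new_infections - new_vaccinations
        i = i + new_infections - new_removals
        r = r + new_removals + new_vaccinations
        history.append((s, i, r))
    return history


def peak_infected(history):
    """Largest infected fraction reached during the simulation."""
    return max(i for _, i, _ in history)


# Assumed example values: compare an outbreak with and without vaccination.
no_vaccine = simulate_sir_vaccination(0.5, 0.1, 0.0, 0.99, 0.01, steps=2000)
with_vaccine = simulate_sir_vaccination(0.5, 0.1, 0.05, 0.99, 0.01, steps=2000)
print("peak infected, no vaccination:  ", round(peak_infected(no_vaccine), 3))
print("peak infected, with vaccination:", round(peak_infected(with_vaccine), 3))
```

In this sketch, vaccination moves machines directly from the susceptible to the removed compartment, so a larger `nu` shrinks the pool available for infection. Fitting such a model to observed infection counts would additionally require an optimizer and real data, which are outside the scope of this example.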



Note: Here we use the terms “anti-virus software” and “anti-malware software” interchangeably for software used to block malware.




The authors would like to thank Prof. Lev Muchnik for enlightening discussions and comments.

Author information



Corresponding author

Correspondence to Elad Yom-Tov.

Ethics declarations

Conflict of interest

All authors are employees of Microsoft.


About this article


Cite this article

Levy, N., Rubin, A. & Yom-Tov, E. Modeling infection methods of computer malware in the presence of vaccinations using epidemiological models: an analysis of real-world data. Int J Data Sci Anal 10, 349–358 (2020).



  • Malware
  • Epidemiological model
  • Compartmental models
  • Vaccination