Abstract
How does malware propagate? How do software patches propagate? Given a set of malware samples, how can we identify all the variants of that malware that exist in a database? Which human behaviors may lead to increased malware attacks? These are challenging problems in their own right, especially because answering them depends on access to extensive, field-gathered data that reflect current trends. Such datasets are becoming easier to collect, and they are both large and complex. Hence data mining can play an important role in cyber-security by answering these questions in an empirical, data-driven manner. In this chapter, we discuss how related problems in cyber-security can be tackled via techniques from graph mining (specifically, mining network propagation) on large field datasets collected on millions of hosts.
Notes
- 1. Each month’s second Tuesday, on which Microsoft releases security patches.
Acknowledgements
The WINE platform data analyzed here is available for follow-on research as the reference data set WINE-2012-006. This chapter is based on work partly supported by the Army Research Laboratory under grant number W911NF-09-2-0053, the National Science Foundation under grant numbers IIS-1017415 and IIS-1353346, and the Maryland Procurement Office under contract H98230-14-C-0127.
Copyright information
© 2015 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Prakash, B. (2015). Graph Mining for Cyber Security. In: Jajodia, S., Shakarian, P., Subrahmanian, V., Swarup, V., Wang, C. (eds) Cyber Warfare. Advances in Information Security, vol 56. Springer, Cham. https://doi.org/10.1007/978-3-319-14039-1_14
DOI: https://doi.org/10.1007/978-3-319-14039-1_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-14038-4
Online ISBN: 978-3-319-14039-1
eBook Packages: Computer Science, Computer Science (R0)