Graph Mining for Cyber Security

Part of the book series: Advances in Information Security (ADIS, volume 56)

Abstract

How does malware propagate? How do software patches propagate? Given a set of malware samples, how can we identify all the malware variants that exist in a database? Which human behaviors may lead to increased malware attacks? These are challenging problems in their own right, especially as they depend on access to extensive, field-gathered data that highlight current trends. Such datasets are increasingly easy to collect, but they are also large and highly complex. Hence, data mining can play an important role in cyber-security by answering these questions in an empirical, data-driven manner. In this chapter, we discuss how related problems in cyber-security can be tackled via techniques from graph mining (specifically, mining network propagation) on large field datasets collected on millions of hosts.

Notes

  1. The second Tuesday of each month, on which Microsoft releases security patches.

Acknowledgements

The WINE platform data analyzed here is available for follow-on research as the reference data set WINE-2012-006. This material is based on work partly supported by the Army Research Laboratory under grant number W911NF-09-2-0053, the National Science Foundation under grant numbers IIS-1017415 and IIS-1353346, and the Maryland Procurement Office under contract H98230-14-C-0127.

Author information

Correspondence to B. Aditya Prakash.

Copyright information

© 2015 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Prakash, B. (2015). Graph Mining for Cyber Security. In: Jajodia, S., Shakarian, P., Subrahmanian, V., Swarup, V., Wang, C. (eds) Cyber Warfare. Advances in Information Security, vol 56. Springer, Cham. https://doi.org/10.1007/978-3-319-14039-1_14

  • DOI: https://doi.org/10.1007/978-3-319-14039-1_14

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-14038-4

  • Online ISBN: 978-3-319-14039-1

  • eBook Packages: Computer Science (R0)
