Detecting Malicious Domains via Graph Inference

  • Pratyusa K. Manadhata
  • Sandeep Yadav
  • Prasad Rao
  • William Horne
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8712)


Enterprises routinely collect terabytes of security-relevant data, e.g., network logs and application logs, for several reasons such as cheaper storage, forensic analysis, and regulatory compliance. Analyzing these big data sets to identify actionable security information, and hence to improve enterprise security, is however a relatively unexplored area. In this paper, we introduce a system to detect malicious domains accessed by an enterprise’s hosts from the enterprise’s HTTP proxy logs. Specifically, we model the detection problem as a graph inference problem: we construct a host-domain graph from proxy logs, seed the graph with minimal ground truth information, and then use belief propagation to estimate the marginal probability of a domain being malicious. Our experiments on data collected at a global enterprise show that our approach scales well, achieves high detection rates with low false positive rates, and identifies previously unknown malicious domains when compared with state-of-the-art systems. Since malware infections inside an enterprise spread primarily via malware domain accesses, our approach can be used to detect and prevent malware infections.
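
The pipeline described above (build a host-domain graph, seed it with a few labels, run belief propagation) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the edge-potential parameter `eps`, the prior values, the iteration count, and the sample hosts and domains are all assumptions chosen for illustration. It runs synchronous loopy sum-product belief propagation over two states (benign, malicious) and assumes that connected nodes tend to share a state.

```python
from collections import defaultdict

def build_graph(log_records):
    """Build an undirected bipartite host-domain graph from (host, domain) pairs."""
    neighbors = defaultdict(set)
    for host, domain in log_records:
        neighbors[("host", host)].add(("domain", domain))
        neighbors[("domain", domain)].add(("host", host))
    return neighbors

def belief_propagation(neighbors, priors, eps=0.1, iterations=10):
    """Return P(malicious) per node; state 0 = benign, state 1 = malicious."""
    # Assumed edge potential: neighbors are slightly more likely to agree.
    psi = [[0.5 + eps, 0.5 - eps], [0.5 - eps, 0.5 + eps]]
    unknown = (0.5, 0.5)  # prior for nodes without ground truth
    messages = {(u, v): [1.0, 1.0] for u in neighbors for v in neighbors[u]}
    for _ in range(iterations):
        updated = {}
        for (u, v) in messages:
            phi = priors.get(u, unknown)
            out = [0.0, 0.0]
            for xv in (0, 1):
                for xu in (0, 1):
                    term = phi[xu] * psi[xu][xv]
                    # Multiply in messages from all of u's neighbors except v.
                    for w in neighbors[u]:
                        if w != v:
                            term *= messages[(w, u)][xu]
                    out[xv] += term
            total = out[0] + out[1]
            updated[(u, v)] = [out[0] / total, out[1] / total]
        messages = updated
    beliefs = {}
    for u in neighbors:
        phi = priors.get(u, unknown)
        b = [phi[0], phi[1]]
        for w in neighbors[u]:
            b[0] *= messages[(w, u)][0]
            b[1] *= messages[(w, u)][1]
        beliefs[u] = b[1] / (b[0] + b[1])
    return beliefs

# Hypothetical proxy-log pairs and seed labels (benign prob, malicious prob).
logs = [
    ("h1", "bad.example"), ("h1", "unknown-a.example"),
    ("h2", "good.example"), ("h2", "unknown-b.example"),
]
priors = {
    ("domain", "bad.example"): (0.01, 0.99),
    ("domain", "good.example"): (0.99, 0.01),
}
beliefs = belief_propagation(build_graph(logs), priors)
for node, p in sorted(beliefs.items()):
    print(node, round(p, 4))
```

In this toy graph, the unlabeled domain that shares a host with the known-malicious seed ends up with a belief above 0.5, and the one that shares a host with the known-benign seed ends up below 0.5. A production-scale version would likely need log-space messages and a distributed graph engine, neither of which is shown here.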


Keywords: belief propagation, big data analysis for security, graph inference, malicious domain detection



Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Pratyusa K. Manadhata (1)
  • Sandeep Yadav (2)
  • Prasad Rao (1)
  • William Horne (1)

  1. Hewlett-Packard Laboratories, USA
  2. Damballa Inc., USA
