Abstract
Enterprises routinely collect terabytes of security relevant data, e.g., network logs and application logs, for several reasons such as cheaper storage, forensic analysis, and regulatory compliance. Analyzing these big data sets to identify actionable security information and hence to improve enterprise security, however, is a relatively unexplored area. In this paper, we introduce a system to detect malicious domains accessed by an enterprise’s hosts from the enterprise’s HTTP proxy logs. Specifically, we model the detection problem as a graph inference problemwe construct a host-domain graph from proxy logs, seed the graph with minimal ground truth information, and then use belief propagation to estimate the marginal probability of a domain being malicious. Our experiments on data collected at a global enterprise show that our approach scales well, achieves high detection rates with low false positive rates, and identifies previously unknown malicious domains when compared with state-of-the-art systems. Since malware infections inside an enterprise spread primarily via malware domain accesses, our approach can be used to detect and prevent malware infections.
Keywords
- belief propagation
- big data analysis for security
- graph inference
- malicious domain detection
Download conference paper PDF
References
CRA: Challenges and opportunities with big data (2012), http://cra.org/ccc/docs/init/bigdatawhitepaper.pdf
Cardenas, A.A., Manadhata, P.K., Rajan, S.P.: Big data analytics for security. IEEE Security & Privacy 11(6), 74–76 (2013)
Symantec internet security threat report (2011), http://www.symantec.com/content/en/us/enterprise/other_resources/b-istr_main_report_2011_21239364.en-us.pdf
Pearl, J.: Reverend bayes on inference engines: a distributed hierarchical approach. In: Proceedings of the National Conference on Artificial Intelligence (1982)
Yedida, J., Freeman, W., Weiss, Y.: Understanding Belief Propagation and its Generalizations. Exploring Aritificial Intelligence in the New Millennium (2003)
Freeman, W.T., Pasztor, E.C., Carmichael, O.T.: Learning low-level vision. International Journal of Computer Vision 40(1), 25–47 (2000)
Mceliece, R., Mackay, D., Cheng, J.: Turbo decoding as an instance of pearl’s belief propagation algorithm. IEEE Journal on Selected Areas in Communications (1998)
Pandit, S., Chau, D.H., Wang, S., Faloutsos, C.: Netprobe: a fast and scalable system for fraud detection in online auction networks. In: World Wide Web Conference (2007)
Chau, D., Nachenberg, C., Wilhelm, J., Wright, A., Faloutsos, C.: Polonium: Tera-scale graph mining and inference for malware detection. In: SIAM International Conference on Data Mining (2011)
Murphy, K., Weiss, Y., Jordan, M.: Loopy Belief Propagation for Approximate Inference: An Empirical Study. Uncertainity in Artificial Intelligence (1999)
Frey, B.J., MacKay, D.J.C.: A revolution: Belief propagation in graphs with cycles. In: Neural Information Processing Systems (NIPS) (1997)
Alexa: Top Sites, http://www.alexa.com/topsites
Pretti, M.: A message-passing algorithm with damping. Journal of Statistical Mechanics: Theory and Experiment 2005(11), P11008 (2005)
Yen, T.F., Oprea, A., Onarlioglu, K., Leetham, T., Robertson, W., Juels, A., Kirda, E.: Beehive: Large-scale log analysis for detecting suspicious activity in enterprise networks. In: Proceedings of the 29th Annual Computer Security Applications Conference, ACSAC 2013, pp. 199–208. ACM, New York (2013)
Giura, P., Wang, W.: A context-based detection framework for advanced persistent threats. In: International Conference on Cyber Security (2012)
Bilge, L., Balzarotti, D., Robertson, W., Kirda, E., Kruegel, C.: Disclosure: Detecting botnet command and control servers through large-scale netflow analysis. In: Proceedings of the 28th Annual Computer Security Applications Conference, ACSAC 2012, pp. 129–138. ACM, New York (2012)
Yadav, S., Reddy, A.K.K., Reddy, A.N., Ranjan, S.: Detecting algorithmically generated malicious domain names. In: Proceedings of the 10th ACM SIGCOMM Conference on Internet Measurement, IMC 2010. ACM, New York (2010)
Bilge, L., Kirda, E., Kruegel, C., Balduzzi, M.: EXPOSURE: Finding Malicious Domain Using Passive DNS Analysis. In: Proceedings of the Network and Distributed System Security Symposium (NDSS) (2011)
Antonakakis, M., Perdisci, R., Lee, W., Vasiloglou II, N., Dagon, D.: Detecting malware domains at the upper dns hierarchy. In: 20th USENIX Security Symposium (2011)
Antonakakis, M., Perdisci, R., Dagon, D., Lee, W., Feamster, N.: Building a Dynamic Reputation System for DNS. In: USENIX Security Symposium (2010)
Antonakakis, M., Perdisci, R., Nadji, Y., Vasiloglou, N., Abu-Nimeh, S., Lee, W., Dagon, D.: From throw-away traffic to bots: Detecting the rise of dga-based malware. In: 21st USENIX Security Symposium (2012)
Jiang, N., Cao, J., Jin, Y., Li, L.E., Zhang, Z.L.: Identifying Suspicious Activities Through DNS Failure Graph Analysis. In: IEEE Conference on Network Protocols (2010)
Yadav, S., Reddy, A.L.N.: Winning with DNS failures: Strategies for faster botnet detection. In: Rajarajan, M., Piper, F., Wang, H., Kesidis, G. (eds.) SecureComm 2011. LNICST, vol. 96, pp. 446–459. Springer, Heidelberg (2012)
Anderson, D.S., Fleizach, C., Savage, S., Voelker, G.M.: Spamscatter: Characterizing internet scam hosting infrastructure. In: 16th USENIX Security Symposium (2007)
Lin, M., Chiu, C., Lee, Y., Pao, H.: Malicious URL filtering- a big data application. In: IEEE BigData (2013)
Ma, J., Saul, L.K., Savage, S., Voelker, G.M.: Beyond Blacklists: Learning to Detect Malicious Web Sites from Suspicious URLs. In: Proceedings of the ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD) (June 2009)
Thomas, K., Grier, C., Ma, J., Paxson, V., Song, D.: Design and Evaluation of a Real-Time URL Spam Filtering Service. IEEE Security and Privacy (2011)
Zhang, Y., Hong, J., Cranor, L.: Cantina: A content-based approach to detecting phishing web sites. In: World Wide Web Conference (May 2007)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Manadhata, P.K., Yadav, S., Rao, P., Horne, W. (2014). Detecting Malicious Domains via Graph Inference. In: Kutyłowski, M., Vaidya, J. (eds) Computer Security - ESORICS 2014. ESORICS 2014. Lecture Notes in Computer Science, vol 8712. Springer, Cham. https://doi.org/10.1007/978-3-319-11203-9_1
Download citation
DOI: https://doi.org/10.1007/978-3-319-11203-9_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11202-2
Online ISBN: 978-3-319-11203-9
eBook Packages: Computer ScienceComputer Science (R0)