Paragraph: Thwarting Signature Learning by Training Maliciously

  • James Newsome
  • Brad Karp
  • Dawn Song
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4219)


Defending a server against Internet worms and defending a user’s email inbox against spam bear certain similarities. In both cases, a stream of samples arrives, and a classifier must automatically determine whether each sample falls into a malicious target class (e.g., worm network traffic, or spam email). A learner typically generates a classifier automatically by analyzing two labeled training pools: one of innocuous samples, and one of samples that fall in the malicious target class.

Learning techniques have previously found success in settings where the content of the labeled samples used in training is either random, or even constructed by a helpful teacher, who aims to speed learning of an accurate classifier. In the case of learning classifiers for worms and spam, however, an adversary controls the content of the labeled samples to a great extent. In this paper, we describe practical attacks against learning, in which an adversary constructs labeled samples that, when used to train a learner, prevent or severely delay generation of an accurate classifier. We show that even a delusive adversary, whose samples are all correctly labeled, can obstruct learning. We simulate and implement highly effective instances of these attacks against the Polygraph [15] automatic polymorphic worm signature generation algorithms.
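As a toy illustration of how correctly labeled samples alone can mislead a learner, the sketch below trains a simple conjunction-of-tokens signature on a malicious pool and checks it against an innocuous pool. It is a simplified assumption for exposition, not Polygraph's algorithms or the attacks evaluated in the paper; the whitespace tokenizer, the function names, and all sample payloads (including the `EXPLOIT` and `decoy` tokens) are hypothetical.

```python
def tokens(sample):
    """Split a sample into whitespace-delimited tokens (a simplification)."""
    return set(sample.split())


def learn_conjunction(malicious_pool):
    """Toy learner: the signature is every token shared by all malicious samples."""
    return set.intersection(*(tokens(s) for s in malicious_pool))


def matches(signature, sample):
    """Flag a sample only if it contains every token in the signature."""
    return bool(signature) and signature <= tokens(sample)


def false_positive_rate(signature, innocuous_pool):
    """Fraction of innocuous samples that the signature wrongly flags."""
    flagged = sum(matches(signature, s) for s in innocuous_pool)
    return flagged / len(innocuous_pool)


# Hypothetical innocuous pool.
innocuous = [b"GET /index.html HTTP/1.1", b"GET /about.html HTTP/1.1"]

# Hypothetical malicious pool crafted by the adversary. Every sample really is
# malicious (so every label is correct), but each also carries decoy tokens
# that the attack does not need.
training_worms = [
    b"GET /a EXPLOIT decoy1 decoy2",
    b"GET /b EXPLOIT decoy1 decoy2",
]
signature = learn_conjunction(training_worms)

# For comparison: the signature the same learner would produce without decoys.
clean_signature = learn_conjunction([b"GET /a EXPLOIT", b"GET /b EXPLOIT"])

# After training, the adversary simply stops sending the decoy tokens.
later_worm = b"GET /c EXPLOIT"

print("poisoned signature:", sorted(signature))
print("false positive rate:", false_positive_rate(signature, innocuous))
print("poisoned signature catches later worm:", matches(signature, later_worm))
print("clean signature catches later worm:", matches(clean_signature, later_worm))
```

In this sketch the poisoned signature looks healthy during training (it matches every training sample and flags no innocuous one), yet it misses the later sample because it has absorbed tokens the adversary was free to drop.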


Keywords: automatic signature generation, machine learning, worm, spam




  1. Barreno, M., Nelson, B., Sears, R., Joseph, A.D., Tygar, J.D.: Can machine learning be secure? In: ASIA CCS (March 2006)
  2. Brumley, D., Newsome, J., Song, D., Wang, H., Jha, S.: Towards automatic generation of vulnerability-based signatures. In: IEEE Symposium on Security and Privacy (2006)
  3. Costa, M., Crowcroft, J., Castro, M., Rowstron, A.: Vigilante: End-to-end containment of internet worms. In: SOSP (2005)
  4. Crandall, J.R., Chong, F.: Minos: Architectural support for software security through control data integrity. In: International Symposium on Microarchitecture (December 2004)
  5. Crandall, J.R., Su, Z., Wu, S.F., Chong, F.T.: On deriving unknown vulnerabilities from zero-day polymorphic and metamorphic worm exploits. In: 12th ACM Conference on Computer and Communications Security (CCS) (2005)
  6. Dalvi, N., Domingos, P., Mausam, Sanghai, S., Verma, D.: Adversarial classification. In: Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD) (2004)
  7. Kim, H.-A., Karp, B.: Autograph: toward automated, distributed worm signature detection. In: 13th USENIX Security Symposium (August 2004)
  8. Kreibich, C., Crowcroft, J.: Honeycomb - creating intrusion detection signatures using honeypots. In: HotNets (November 2003)
  9. Li, Z., Sanghi, M., Chen, Y., Kao, M.-Y., Chavez, B.: Hamsa: fast signature generation for zero-day polymorphic worms with provable attack resilience. In: IEEE Symposium on Security and Privacy (May 2006)
  10. Liang, Z., Sekar, R.: Fast and automated generation of attack signatures: A basis for building self-protecting servers. In: 12th ACM Conference on Computer and Communications Security (CCS) (2005)
  11. Littlestone, N.: Learning quickly when irrelevant attributes abound: A new linear threshold algorithm. Machine Learning 2, 285–318 (1988)
  12. Littlestone, N.: Redundant noisy attributes, attribute errors, and linear-threshold learning using winnow. In: Fourth Annual Workshop on Computational Learning Theory, pp. 147–156 (1991)
  13. Lowd, D., Meek, C.: Adversarial learning. In: Eleventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD) (2005)
  14. Newsome, J., Brumley, D., Song, D.: Vulnerability-specific execution filtering for exploit prevention on commodity software. In: 13th Symposium on Network and Distributed System Security (NDSS 2006) (2006)
  15. Newsome, J., Karp, B., Song, D.: Polygraph: Automatically generating signatures for polymorphic worms. In: IEEE Symposium on Security and Privacy (May 2005)
  16. Newsome, J., Song, D.: Dynamic taint analysis for automatic detection, analysis, and signature generation of exploits on commodity software. In: 12th Annual Network and Distributed System Security Symposium (NDSS) (February 2005)
  17. delusive (definition). In Oxford English Dictionary. Oxford University Press, Oxford (2006)
  18. Perdisci, R., Dagon, D., Lee, W., Fogla, P., Sharif, M.: Misleading worm signature generators using deliberate noise injection. In: IEEE Symposium on Security and Privacy (May 2006)
  19.
  20. Sidiroglou, S., Locasto, M.E., Boyd, S.W., Keromytis, A.D.: Building a reactive immune system for software services. In: USENIX Annual Technical Conference (2005)
  21. Singh, S., Estan, C., Varghese, G., Savage, S.: Automated worm fingerprinting. In: 6th ACM/USENIX Symposium on Operating System Design and Implementation (OSDI) (December 2004)
  22. Staniford, S., Moore, D., Paxson, V., Weaver, N.: The top speed of flash worms. In: ACM CCS WORM (2004)
  23. Suh, G.E., Lee, J., Devadas, S.: Secure program execution via dynamic information flow tracking. In: ASPLOS (2004)
  24. Tang, Y., Chen, S.: Defending against internet worms: A signature-based approach. In: IEEE INFOCOM (March 2005)
  25. Xu, J., Ning, P., Kil, C., Zhai, Y., Bookholt, C.: Automatic diagnosis and response to memory corruption vulnerabilities. In: 12th Annual ACM Conference on Computer and Communication Security (CCS) (2005)

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • James Newsome (1)
  • Brad Karp (2)
  • Dawn Song (1)

  1. Carnegie Mellon University
  2. University College London
