Machine Learning, Volume 58, Issue 2–3, pp 217–230

Principle Components and Importance Ranking of Distributed Anomalies

  • Kyrre Begnum
  • Mark Burgess


Correlations between locally averaged host observations, at different times and places, hint at information about the associations between the hosts in a network. These smoothed, pseudo-continuous time-series imply relationships with entities in the wider environment. For anomaly detection, mining this information might provide a valuable source of observational experience for determining comparative anomalies or rejecting false anomalies. The difficulties with distributed analysis lie in collating the distributed data and in comparing observables on different hosts, in different frames of reference. In the present work, we examine two methods (Principal Component Analysis and Eigenvector Centrality) that shed light on the usefulness of comparing data destined for different locations in a network.
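As a rough illustration of the two methods named above, the following minimal sketch applies them to a matrix of host time-series. It is not the authors' procedure: the synthetic data, the function names `principal_components` and `eigenvector_centrality`, and the choice of absolute correlation as a link weight between hosts are all assumptions made for this example.

```python
import numpy as np


def principal_components(series):
    """PCA on the correlation matrix of `series` (rows: samples, columns: hosts).

    Returns the correlation matrix, its eigenvalues in descending order,
    and the matching eigenvectors as columns.
    """
    corr = np.corrcoef(series, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)  # eigh: symmetric input, ascending output
    order = np.argsort(eigvals)[::-1]
    return corr, eigvals[order], eigvecs[:, order]


def eigenvector_centrality(adjacency, iterations=200, tol=1e-10):
    """Power iteration for the leading eigenvector of a non-negative symmetric matrix."""
    n = adjacency.shape[0]
    score = np.ones(n) / np.sqrt(n)
    for _ in range(iterations):
        updated = adjacency @ score
        norm = np.linalg.norm(updated)
        if norm == 0:
            return score
        updated = updated / norm
        if np.linalg.norm(updated - score) < tol:
            return updated
        score = updated
    return score


# Synthetic stand-in for smoothed host observations (an assumption, not real data):
# hosts 0-2 follow a shared signal, host 3 is independent.
rng = np.random.default_rng(0)
shared = rng.normal(size=500)
hosts = np.column_stack([
    shared + 0.3 * rng.normal(size=500),
    shared + 0.3 * rng.normal(size=500),
    shared + 0.3 * rng.normal(size=500),
    rng.normal(size=500),
])

corr, eigvals, eigvecs = principal_components(hosts)
explained = eigvals / eigvals.sum()  # fraction of variance per component

# Hypothetical host graph: link weight = absolute correlation, no self-links.
weights = np.abs(corr)
np.fill_diagonal(weights, 0.0)
centrality = eigenvector_centrality(weights)

print("explained variance:", np.round(explained, 3))
print("centrality per host:", np.round(centrality, 3))
```

In this toy setting the first principal component captures the signal shared by the correlated hosts, while the centrality score ranks the independent host lowest, which is one way a host could be flagged as behaving differently from its peers.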


Keywords: machine learning, anomaly detection



Copyright information

© Springer Science + Business Media, Inc. 2005

Authors and Affiliations

  1. Faculty of Engineering, Oslo University College, Norway
