Encyclopedia of Social Network Analysis and Mining

2014 Edition
Editors: Reda Alhajj, Jon Rokne

Network Anomaly Detection Using Co-clustering

  • Evangelos E. Papalexakis
  • Alex Beutel
  • Peter Steenkiste
Reference work entry
DOI: https://doi.org/10.1007/978-1-4614-6170-8_354

Synonyms

Biclustering; Data mining; Intrusion detection; Knowledge discovery

Glossary

Anomaly

Anything that is out of the ordinary; an outlier. Usually, it is something that a systematic model of the data fails to capture

Co-clustering

A class of algorithms that seek to cluster the rows and columns of a data matrix simultaneously; when applied to matrices, it also appears in the literature as “biclustering”

CDF

Cumulative distribution function; a function that gives the probability that a random variable X has a value ≤ x
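
As a rough illustration of the glossary terms, the sketch below computes an empirical CDF from a sample and scores a given co-clustering of a small matrix by its squared deviation from the block means. It is not taken from this entry: the function names `empirical_cdf` and `block_residue`, the toy matrix, and the choice of a sum-squared block-mean objective are all assumptions made for illustration. The sketch only evaluates a labeling that is already given; it does not search for one.

```python
from bisect import bisect_right


def empirical_cdf(samples):
    """Return a function F with F(x) = fraction of samples whose value is <= x."""
    ordered = sorted(samples)
    n = len(ordered)

    def F(x):
        # bisect_right counts how many sorted values are <= x
        return bisect_right(ordered, x) / n

    return F


def block_residue(X, row_labels, col_labels):
    """Sum of squared deviations of each entry from the mean of its co-cluster.

    A co-cluster (block) is the set of entries whose row shares a row label
    and whose column shares a column label. Lower values mean the row and
    column grouping explains the matrix better under this assumed objective.
    """
    sums, counts = {}, {}
    for i, row in enumerate(X):
        for j, value in enumerate(row):
            key = (row_labels[i], col_labels[j])
            sums[key] = sums.get(key, 0.0) + value
            counts[key] = counts.get(key, 0) + 1

    total = 0.0
    for i, row in enumerate(X):
        for j, value in enumerate(row):
            key = (row_labels[i], col_labels[j])
            total += (value - sums[key] / counts[key]) ** 2
    return total


# Hypothetical toy matrix with two dense blocks on the diagonal.
X = [
    [5, 5, 0, 0],
    [5, 5, 0, 0],
    [0, 0, 3, 3],
    [0, 0, 3, 3],
]

# Grouping rows {0,1},{2,3} and columns {0,1},{2,3} matches the block structure.
aligned = block_residue(X, [0, 0, 1, 1], [0, 0, 1, 1])
# Interleaved labels mix the two blocks and leave a larger residue.
interleaved = block_residue(X, [0, 1, 0, 1], [0, 1, 0, 1])

F = empirical_cdf([1, 2, 2, 4])
```

Here `aligned` is smaller than `interleaved`, which is the sense in which a co-clustering algorithm prefers one simultaneous grouping of rows and columns over another, and `F(2)` gives the fraction of the sample that is at most 2.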

Introduction

Security is a growing problem on today’s Internet. The original Internet architecture did not treat security as a high priority and left the management of security concerns to end hosts. This places an increasing burden on system administrators, who must continually fend off a variety of attacks and intrusion attempts by both individuals and large botnets. An early study from 2001 suggested that there are roughly...


References

  1. Akoglu L, McGlohon M, Faloutsos C (2010) OddBall: spotting anomalies in weighted graphs. In: Zaki MJ et al (eds) Advances in knowledge discovery and data mining. Springer, Berlin, pp 410-421
  2. Anagnostopoulos A, Dasgupta A, Kumar R (2008) Approximation algorithms for co-clustering. In: Proceedings of the twenty-seventh ACM SIGMOD-SIGACT-SIGART symposium on principles of database systems, Vancouver. ACM, pp 201-210
  3. Banerjee A, Dhillon I, Ghosh J, Merugu S, Modha D (2004) A generalized maximum entropy approach to Bregman co-clustering and matrix approximation. In: Proceedings of the tenth ACM SIGKDD international conference on knowledge discovery and data mining, Seattle. ACM, pp 509-514
  4. Beutel A, Xu W, Guruswami V, Palow C, Faloutsos C (2013) CopyCatch: stopping group attacks by spotting lockstep behavior in social networks. In: Proceedings of the 22nd WWW international world wide web conference, Rio de Janeiro
  5. Bro R, Papalexakis E, Acar E, Sidiropoulos N (2011) SMR co-clustering code on-line. http://www.models.life.ku.dk/cocluster. Last accessed 19 Sept 2012
  6. Bro R, Papalexakis E, Acar E, Sidiropoulos N (2012) Coclustering – a useful tool for chemometrics. J Chemom 26:256-263
  7. Cho H, Dhillon I, Guan Y, Sra S (2004) Minimum sum-squared residue co-clustering of gene expression data. In: Proceedings of the fourth SIAM international conference on data mining, Orlando, vol 114
  8. Dhillon I (2001) Co-clustering documents and words using bipartite spectral graph partitioning. In: Proceedings of the seventh ACM SIGKDD international conference on knowledge discovery and data mining, San Francisco. ACM, pp 269-274
  9. Dhillon I (2007) Bregman co-clustering code on-line. http://www.lans.ece.utexas.edu/facility.html. Last accessed 19 Sept 2012
  10. Dhillon I, Mallela S, Modha D (2003) Information-theoretic co-clustering. In: Proceedings of the ninth ACM SIGKDD international conference on knowledge discovery and data mining, Washington, DC. ACM, pp 89-98
  11. Elkan C (1999) Results of the KDD’99 classifier learning contest. http://cseweb.ucsd.edu/~elkan/clresults.html. Last accessed 19 Sept 2012
  12. Gu G, Perdisci R, Zhang J, Lee W (2008) BotMiner: clustering analysis of network traffic for protocol- and structure-independent botnet detection. In: Proceedings of the 17th conference on security symposium, San Jose. USENIX Association, pp 139-154
  13. Guan Y, Ghorbani A, Belacel N et al (2003) Y-means: a clustering method for intrusion detection. In: Canadian conference on electrical and computer engineering, Montréal
  14. Hartigan J (1972) Direct clustering of a data matrix. J Am Stat Assoc 67:123-129
  15. Henderson K, Eliassi-Rad T, Faloutsos C, Akoglu L, Li L, Maruhashi K, Prakash B, Tong H (2010) Metric forensics: a multi-level approach for mining volatile graphs. In: Proceedings of the 16th ACM SIGKDD international conference on knowledge discovery and data mining, Washington, DC. ACM, pp 163-172
  16. Kabiri P, Zargar G (2009) Category-based selection of effective parameters for intrusion detection. Int J Comput Sci Netw Secur (IJCSNS) 9(9):181-188
  17. Leung K, Leckie C (2005) Unsupervised anomaly detection in network intrusion detection using clusters. In: Proceedings of the twenty-eighth Australasian conference on computer science, vol 38, Newcastle. Australian Computer Society, pp 333-342
  18. Lloyd S (1982) Least squares quantization in PCM. IEEE Trans Inf Theory 28(2):129-137
  19. Madeira S, Oliveira A (2004) Biclustering algorithms for biological data analysis: a survey. IEEE/ACM Trans Comput Biol Bioinform 1(1):24-45
  20. Maruhashi K, Guo F, Faloutsos C (2011) MultiAspectForensics: pattern mining on large-scale heterogeneous networks with tensor analysis. In: Proceedings of the third international conference on advances in social network analysis and mining, Kaohsiung
  21. Moore D, Voelker G, Savage S (2001) Inferring internet denial-of-service activity. In: Proceedings of the 10th Usenix security symposium, Washington, DC, pp 9-22
  22. Mukherjee B, Heberlein L, Levitt K (1994) Network intrusion detection. IEEE Netw 8(3):26-41
  23. New York Times (2009) Twitter restores service after attack. http://bits.blogs.nytimes.com/2009/08/06/twitter-overwhelmed-by-web-attack/. Last accessed 19 Sept 2012
  24. Papalexakis E, Sidiropoulos N (2011) Co-clustering as multilinear decomposition with sparse latent factors. In: IEEE international conference on acoustics, speech and signal processing (ICASSP), Prague, 2011. IEEE, pp 2064-2067
  25. Papalexakis E, Sidiropoulos N, Garofalakis M (2010) Reviewer profiling using sparse matrix regression. In: IEEE international conference on data mining workshops (ICDMW), Sydney, 2010. IEEE, pp 1214-1219
  26. Papalexakis E, Sidiropoulos N, Bro R (2013) From k-means to higher-way co-clustering: multilinear decomposition with sparse latent factors. IEEE Trans Signal Process 61(2):493-506
  27. Patcha A, Park J (2007) An overview of anomaly detection techniques: existing solutions and latest technological trends. Comput Netw 51(12):3448-3470
  28. Peng W, Li T (2011) Temporal relation co-clustering on directional social network and author-topic evolution. Knowl Inf Syst 26(3):467-486
  29. Pfahringer B (2000) Winning the KDD99 classification cup: bagged boosting. http://www.sigkdd.org/explorations/issues/1-2-2000-01/pfahringer.pdf. Last accessed 19 Sept 2012
  30. Portnoy L, Eskin E, Stolfo S (2001) Intrusion detection with unlabeled data using clustering. In: Proceedings of ACM CSS workshop on data mining applied to security (DMSA-2001), Philadelphia. Citeseer
  31. Shah H, Undercoffer J, Joshi A (2003) Fuzzy clustering for intrusion detection. In: The 12th IEEE international conference on fuzzy systems, FUZZ’03, St. Louis, 2003, vol 2. IEEE, pp 1274-1278
  32. Tibshirani R (1996) Regression shrinkage and selection via the lasso. J R Stat Soc Ser B (Methodological) 58:267-288
  33. UCI (1999) KDD 99 cup dataset. http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html. Last accessed 19 Sept 2012
  34. Wired (2010) Google hack attack was ultra sophisticated, new details show. http://www.wired.com/threatlevel/2010/01/operation-aurora/. Last accessed 19 Sept 2012

Copyright information

© Springer Science+Business Media New York 2014

Authors and Affiliations

  • Evangelos E. Papalexakis (1)
  • Alex Beutel (1)
  • Peter Steenkiste (1, 2)

  1. School of Computer Science, Carnegie Mellon University, Pittsburgh, USA
  2. Department of Electrical & Computer Engineering, Carnegie Mellon University, Pittsburgh, USA