Soft Computing, Volume 15, Issue 6, pp 1195–1215

Detecting anomalies from high-dimensional wireless network data streams: a case study

  • Ji Zhang
  • Qigang Gao
  • Hai Wang
  • Hua Wang

Abstract

In this paper, we study the problem of anomaly detection in wireless network data streams. We have developed a new technique, called Stream Projected Outlier deTector (SPOT), for detecting anomalies in multi-dimensional or high-dimensional data streams. We present a detailed case study of SPOT, deploying it for anomaly detection on a real-life wireless network data stream. Because this data stream is unlabeled, we propose a validation method to generate ground-truth results for performance evaluation. Extensive experiments demonstrate that SPOT is effective in detecting anomalies from wireless network data streams and outperforms existing anomaly detection methods.

Keywords

Outlier detection; High-dimensional data; Subspaces; Data streams

Copyright information

© Springer-Verlag 2010

Authors and Affiliations

  1. University of Southern Queensland, Toowoomba, Australia
  2. Dalhousie University, Halifax, Canada
  3. Saint Mary’s University, Halifax, Canada
