Detecting Projected Outliers in High-Dimensional Data Streams

  • Conference paper
Database and Expert Systems Applications (DEXA 2009)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5690)

Abstract

In this paper, we study the problem of projected outlier detection in high-dimensional data streams and propose a new technique, called Stream Projected Outlier deTector (SPOT), to identify outliers embedded in subspaces. SPOT constructs a Sparse Subspace Template (SST), a set of subspaces obtained by unsupervised and/or supervised learning processes, to detect projected outliers effectively. A Multi-Objective Genetic Algorithm (MOGA) is employed as the search method for finding outlying subspaces in the training data, from which the SST is built. During the detection stage, the SST evolves online to cope with the dynamics of data streams. Experimental results demonstrate the efficiency and effectiveness of SPOT in detecting outliers in high-dimensional data streams.
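To make the idea of a projected outlier concrete, the following is a minimal illustrative sketch and not the SPOT algorithm itself. It assumes a fixed template of subspaces (given by hand here, whereas the paper obtains it by learning and MOGA search), feature values scaled to [0, 1), an equi-width grid per subspace, and a simple count threshold for sparsity. The class name, `bins`, and `min_count` are hypothetical names introduced only for this example.

```python
from collections import defaultdict


class SubspaceTemplateDetector:
    """Toy detector: a point is a projected outlier in a subspace if its
    projection falls into a sparsely populated grid cell of that subspace.
    Illustrative only; this is not the authors' SPOT implementation."""

    def __init__(self, subspaces, bins=4, min_count=2):
        # Each subspace is a tuple of dimension indices, e.g. (0, 2).
        self.subspaces = [tuple(s) for s in subspaces]
        self.bins = bins            # grid cells per dimension (assumed)
        self.min_count = min_count  # cells with fewer points count as sparse
        self.counts = {s: defaultdict(int) for s in self.subspaces}

    def _cell(self, point, subspace):
        # Map the projected point to a grid cell; values assumed in [0, 1).
        return tuple(min(int(point[d] * self.bins), self.bins - 1)
                     for d in subspace)

    def fit(self, training_points):
        # Warm up the cell counts from training data.
        for p in training_points:
            for s in self.subspaces:
                self.counts[s][self._cell(p, s)] += 1

    def process(self, point):
        # Update counts with the arriving point, then report every template
        # subspace in which the point's cell is still sparse.
        outlying = []
        for s in self.subspaces:
            cell = self._cell(point, s)
            self.counts[s][cell] += 1
            if self.counts[s][cell] < self.min_count:
                outlying.append(s)
        return outlying


detector = SubspaceTemplateDetector([(0,), (1,), (0, 1)], bins=4, min_count=2)
detector.fit([(0.1 + 0.001 * i, 0.1, 0.1) for i in range(50)])
print(detector.process((0.1, 0.9, 0.1)))  # [(1,), (0, 1)]
```

The arriving point looks ordinary in dimension 0 but is isolated once dimension 1 is involved, so it is reported only for the subspaces (1,) and (0, 1). A real stream detector would also need to decay old counts and revise the template over time, which this sketch omits.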

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Zhang, J., Gao, Q., Wang, H., Liu, Q., Xu, K. (2009). Detecting Projected Outliers in High-Dimensional Data Streams. In: Bhowmick, S.S., Küng, J., Wagner, R. (eds) Database and Expert Systems Applications. DEXA 2009. Lecture Notes in Computer Science, vol 5690. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03573-9_53

  • DOI: https://doi.org/10.1007/978-3-642-03573-9_53

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-03572-2

  • Online ISBN: 978-3-642-03573-9

  • eBook Packages: Computer Science, Computer Science (R0)
