Abstract
Outlier analysis is an important task in data mining and has attracted much attention in both research and applications. Previous work on outlier detection has addressed many types of databases, including spatial, time series, and biomedical databases. However, few existing studies have considered spatial networks, in which points reside on the edges. In this paper, we study the problem of mining distance-based outliers in spatial networks. We propose an efficient mining method that partitions each edge of a spatial network into a set of segments of length d, prunes the edges that cannot contain outliers, and then quickly identifies the outliers in the remaining edges. We also present algorithms that can be applied when points in the spatial network are updated or when the input parameters of the outlier measure change. Experimental results verify the scalability and efficiency of the proposed methods.
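The abstract does not state the exact outlier definition or pruning rule, so the following is only a minimal sketch under assumptions that are not taken from the paper. It assumes a count-based definition in which a point is an outlier if fewer than k other points lie within network distance d, and it represents each point by its offset along a single edge. The function names `segment_index` and `prunable_points` and the parameter `k` are hypothetical. The sketch illustrates why segments of length d are convenient: two points in the same segment are at most d apart, so a segment holding more than k points cannot contain an outlier.

```python
import math
from collections import Counter


def segment_index(offset, edge_length, d):
    """Return the index of the length-d segment that contains a point.

    `offset` is the point's distance from the start of the edge.
    The last segment may be shorter than d when edge_length is not
    a multiple of d.
    """
    if d <= 0:
        raise ValueError("d must be positive")
    if not 0 <= offset <= edge_length:
        raise ValueError("offset must lie on the edge")
    n_segments = max(1, math.ceil(edge_length / d))
    # Clamp so that a point exactly at the end of the edge falls
    # into the last segment instead of a non-existent one.
    return min(int(offset // d), n_segments - 1)


def prunable_points(offsets, edge_length, d, k):
    """Return the points that cannot be outliers (assumed definition).

    Points sharing a segment are at most d apart along the edge, and
    the network distance is never larger than that. A point in a
    segment with more than k points therefore has at least k
    neighbours within distance d and can be skipped.
    """
    counts = Counter(segment_index(o, edge_length, d) for o in offsets)
    return [o for o in offsets
            if counts[segment_index(o, edge_length, d)] > k]


if __name__ == "__main__":
    offsets = [0.5, 1.0, 1.5, 7.0]
    pruned = prunable_points(offsets, edge_length=10.0, d=2.0, k=2)
    candidates = [o for o in offsets if o not in pruned]
    print("pruned:", pruned)
    print("still to check:", candidates)
```

Points that survive this filter are only candidates; deciding whether they are outliers would still require counting neighbours across segment and edge boundaries, which this sketch does not attempt.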
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Jin, W., Jiang, Y., Qian, W., Tung, A.K.H. (2006). Mining Outliers in Spatial Networks. In: Li Lee, M., Tan, KL., Wuwongse, V. (eds) Database Systems for Advanced Applications. DASFAA 2006. Lecture Notes in Computer Science, vol 3882. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11733836_13
DOI: https://doi.org/10.1007/11733836_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-33337-1
Online ISBN: 978-3-540-33338-8
eBook Packages: Computer Science, Computer Science (R0)