Incremental Commute Time Using Random Walks and Online Anomaly Detection

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9851)

Abstract

Commute time is a random-walk-based metric on graphs that has been applied successfully in many domains. However, computing the commute time is expensive because it involves the eigendecomposition of the graph Laplacian matrix. There have been efforts to approximate the commute time in offline mode; our interest is inspired by its use in online mode. We propose an accurate and efficient approximation for computing the commute time incrementally in order to facilitate real-time applications. We design an online anomaly detection technique in which the commute time from each newly arriving data point to any data point in the current graph can be estimated in constant time, ensuring a real-time response. The proposed approach shows high accuracy and efficiency on many synthetic and real datasets and takes only 8 milliseconds on average to detect anomalies online on the DBLP graph, which has more than 600,000 nodes and 2 million edges.
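For orientation, the exact batch quantity that the abstract calls expensive can be sketched with the standard identity c(i, j) = V_G (L+_ii + L+_jj - 2 L+_ij), where L+ is the pseudoinverse of the graph Laplacian and V_G is the sum of all node degrees. This is a minimal illustration and not the incremental approximation proposed in the paper; the function name `commute_times` and the toy path graph are assumptions made for the example, and the graph is assumed to be connected and undirected.

```python
import numpy as np

def commute_times(adjacency):
    """Exact pairwise commute times for a connected, undirected weighted graph.

    Illustrative batch computation only (hypothetical helper, not the
    paper's incremental estimator).
    """
    A = np.asarray(adjacency, dtype=float)
    degrees = A.sum(axis=1)
    volume = degrees.sum()           # V_G: sum of all node degrees
    L = np.diag(degrees) - A         # combinatorial graph Laplacian
    L_pinv = np.linalg.pinv(L)       # Moore-Penrose pseudoinverse (dense, cubic cost)
    d = np.diag(L_pinv)
    # c(i, j) = V_G * (L+_ii + L+_jj - 2 * L+_ij)
    return volume * (d[:, None] + d[None, :] - 2.0 * L_pinv)

# Toy example: unweighted path graph 0 - 1 - 2
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]])
C = commute_times(A)
```

On this path graph the commute time between the two end nodes is twice that between adjacent nodes, as expected for a tree. The dense pseudoinverse scales cubically with the number of nodes, which is the cost that motivates approximate and incremental schemes for large graphs.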

Keywords

Commute time, Random walk, Incremental learning, Online anomaly detection

Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  1. Data61, CSIRO, Sydney, Australia
  2. Qatar Computing Research Institute, HBKU, Doha, Qatar
  3. University of Sydney, Sydney, Australia
