A Method for Analyzing the Asymptotic Behavior of the Walk Process in Restricted Random Walk Cluster Algorithm

  • Markus Franke
  • Andreas Geyer-Schulz
Conference paper
Part of the Studies in Classification, Data Analysis, and Knowledge Organization book series (STUDIES CLASS)

Abstract

The Restricted Random Walk clustering algorithm is based on the execution of a series of random walks on a similarity or distance graph such that in the course of the walk only edges with growing similarity are visited. Cluster construction follows the idea that the later a pair of nodes occurs in a walk, the higher their similarity and thus their tendency to belong to the same cluster.
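A single walk of this kind can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the nested-dictionary layout `sim`, the function name `restricted_random_walk`, the uniform choice among admissible edges, and the toy similarity values are all hypothetical, and "growing similarity" is read here as strictly increasing.

```python
import random

def restricted_random_walk(sim, start, rng):
    """Run one walk on a symmetric similarity graph.

    sim[u][v] is the similarity of nodes u and v (assumed layout).
    Each step may only use an edge whose similarity is strictly higher
    than that of the edge used in the previous step.
    """
    walk = []                      # list of (from_node, to_node, similarity)
    current = start
    last_sim = float("-inf")       # the first step is unrestricted
    while True:
        candidates = [(nbr, s) for nbr, s in sim[current].items() if s > last_sim]
        if not candidates:         # no edge with higher similarity: walk ends
            return walk
        nxt, s = rng.choice(candidates)   # assumption: uniform choice
        walk.append((current, nxt, s))
        current, last_sim = nxt, s

# Toy similarity graph (illustrative values only).
sim = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "C": 2, "D": 5},
    "C": {"A": 4, "B": 2, "D": 3},
    "D": {"B": 5, "C": 3},
}

for step, (u, v, s) in enumerate(restricted_random_walk(sim, "A", random.Random(0)), 1):
    print(step, u, v, s)
```

Because the similarity must rise at every step, each walk is finite, and node pairs visited late in a walk are joined by comparatively high similarities.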

The resulting clusters show a degree of stochastic variation that depends on the number of walks. In this paper, we examine the asymptotic behavior of this stochastic process for an infinite number of walks. This establishes a starting point for analyzing the influence of stochastics on the clusters.

To this end, we construct a cycle-free graph based on the transition matrix of the walk process. The edges of the similarity graph form the nodes of the new graph, and its edges are determined by the predecessor-successor relation of the similarity graph’s edges. We then use a combination of shortest and longest path algorithms to calculate the highest possible position of each node pair in the walks, which determines the cluster composition. To give an idea of the potential results of such an analysis, we show an exemplary comparison with single linkage clustering. In a local view, the clusters are very similar; a global view, however, reveals differences in the order in which clusters are linked. This is because restricted random walk clustering has only a local perspective of the similarity matrix, while single linkage takes the whole matrix into account.
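The edge-graph idea can be illustrated with a short Python sketch. It is a simplification under stated assumptions rather than the paper's procedure: the abstract does not spell out the construction, so the sketch treats each directed edge (u, v) of the similarity graph as a node, takes (t, u) as a predecessor of (u, v) whenever its similarity is strictly lower, and uses only a longest-path recursion (the shortest-path part mentioned above is omitted). The names `max_positions` and `longest` and the toy data are hypothetical.

```python
from functools import lru_cache

def max_positions(sim):
    """Highest step number at which each node pair can occur in any walk.

    sim[u][v] is a symmetric similarity (assumed layout). The derived graph
    is cycle-free because similarity strictly increases along every
    predecessor-successor link, so the recursion below terminates.
    """
    @lru_cache(maxsize=None)
    def longest(u, v):
        # Longest chain of admissible edges ending with the step u -> v.
        s = sim[u][v]
        preds = [longest(t, u) for t, st in sim[u].items() if st < s]
        return 1 + max(preds, default=0)

    result = {}
    for u in sim:
        for v in sim[u]:
            pair = frozenset((u, v))          # both directions count for the pair
            result[pair] = max(result.get(pair, 0), longest(u, v))
    return result

# Toy similarity graph (illustrative values only).
sim = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "C": 2, "D": 5},
    "C": {"A": 4, "B": 2, "D": 3},
    "D": {"B": 5, "C": 3},
}

for pair, pos in sorted(max_positions(sim).items(), key=lambda kv: kv[1]):
    print(sorted(pair), pos)
```

In this toy graph the pair B, D can appear as late as step 4, via the walk A-B, B-C, C-D, D-B, whereas the pair A, B can only ever appear at step 1.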

Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Markus Franke
    • 1
  • Andreas Geyer-Schulz
    • 1
  1. Institute for Information Systems and Management, Universität Karlsruhe (TH), Karlsruhe, Germany
