Advances in Data Analysis, pp 51-58

# A Method for Analyzing the Asymptotic Behavior of the Walk Process in Restricted Random Walk Cluster Algorithm

## Abstract

The Restricted Random Walk clustering algorithm executes a series of random walks on a similarity or distance graph such that, in the course of a walk, only edges with growing similarity are visited. Cluster construction follows the idea that the later a pair of nodes occurs in a walk, the higher their similarity and thus their tendency to belong to the same cluster.
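The walk rule described above can be illustrated with a minimal sketch. It assumes the similarity graph is given as a symmetric nested dictionary `sim[u][v]`, that the first edge is chosen uniformly among all edges at the start node, and that each later edge is chosen uniformly among the edges strictly more similar than the previous one. The function name `restricted_random_walk` and the toy graph are hypothetical and not taken from the paper.

```python
import random


def restricted_random_walk(sim, start, rng):
    """Return the edges (u, v) visited by one walk from `start`, in order.

    Assumption: `sim[u][v]` is a symmetric similarity; higher means more similar.
    """
    walk = []
    current, last_sim = start, float("-inf")
    while True:
        # Only edges strictly more similar than the previous edge are admissible.
        candidates = sorted(v for v, s in sim[current].items() if s > last_sim)
        if not candidates:
            return walk  # no admissible edge left: the walk ends
        nxt = rng.choice(candidates)
        walk.append((current, nxt))
        last_sim = sim[current][nxt]
        current = nxt


# Hypothetical toy similarity graph with three nodes.
sim = {
    "a": {"b": 1, "c": 3},
    "b": {"a": 1, "c": 2},
    "c": {"a": 3, "b": 2},
}

walk = restricted_random_walk(sim, "a", random.Random(0))
print(walk)
```

Because the similarity must strictly increase, every walk terminates after finitely many steps; pairs that appear late in a walk are those joined by comparatively high similarities.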

The resulting clusters show a degree of stochastic variation that depends on the number of walks. In this paper, we analyze the asymptotic behavior of this stochastic process for an infinite number of walks. This establishes a starting point for analyzing the influence of stochastics on the clusters.

To this end, we construct a cycle-free graph based on the transition matrix of the walk process. The edges of the similarity graph form the nodes of the new graph, and its edges are determined by the predecessor-successor relation of the similarity graph’s edges. We then use a combination of shortest and longest path algorithms to calculate the highest possible position of each node pair in the walks, which determines the cluster composition. To give an idea of the potential results of such an analysis, we show an exemplary comparison with single linkage clustering. Locally, the clusters are very similar; globally, however, the order in which clusters are linked differs. This is because restricted random walk clustering has only a local perspective of the similarity matrix, while single linkage takes the whole matrix into account.
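The graph construction can be sketched as follows. This is an illustration under stated assumptions, not the procedure of the paper: it treats each directed edge `(u, v)` of the similarity graph as a node, links it to every edge `(v, w)` with strictly higher similarity, and computes only a longest-path quantity, namely the largest step number at which an edge can occur in any walk. How the paper combines this with shortest paths is not reproduced here. The names `build_edge_graph` and `highest_positions` and the toy graph are hypothetical.

```python
from functools import lru_cache


def build_edge_graph(sim):
    """Map each directed edge (u, v) to its admissible successor edges (v, w)."""
    return {
        (u, v): [(v, w) for w, s_vw in sim[v].items() if s_vw > s_uv]
        for u in sim
        for v, s_uv in sim[u].items()
    }


def highest_positions(sim):
    """Largest step number at which each edge can occur in any walk."""
    successors = build_edge_graph(sim)
    predecessors = {edge: [] for edge in successors}
    for edge, succs in successors.items():
        for nxt in succs:
            predecessors[nxt].append(edge)

    # The graph is cycle-free because similarity strictly increases along
    # every link, so this memoized recursion always terminates.
    @lru_cache(maxsize=None)
    def position(edge):
        return 1 + max((position(p) for p in predecessors[edge]), default=0)

    return {edge: position(edge) for edge in successors}


# Hypothetical toy similarity graph with three nodes.
sim = {
    "a": {"b": 1, "c": 3},
    "b": {"a": 1, "c": 2},
    "c": {"a": 3, "b": 2},
}

positions = highest_positions(sim)
for edge, pos in sorted(positions.items()):
    print(edge, pos)
```

The recursion is chosen for brevity; for large graphs an iterative pass over the edges sorted by similarity would avoid deep call stacks.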

