Visualizing Very Large Graphs Using Clustering Neighborhoods
This paper presents a method for visualizing large graphs, such as a collection of Web pages, in a two-dimensional space. The main contribution is a change of representation that makes the data easier to handle. The method consists of three major steps: (1) First, we transform the graph into a sparse matrix, in which each vertex of the graph is represented by one sparse vector. Each sparse vector has non-zero components for the vertices that are close to the vertex it represents. (2) Next, we perform hierarchical clustering (e.g., hierarchical K-Means) on the set of sparse vectors, which yields a hierarchy of clusters. (3) In the last step, we map the hierarchy of clusters into a two-dimensional space in such a way that more similar clusters appear close together in the picture. The effect of the whole procedure is that each vertex is assigned unique X and Y coordinates, so that vertices, or groups of vertices on several levels of the hierarchy, that are more strongly connected in the graph are placed closer together in the picture. The method is particularly useful for power-law distributed graphs. We show applications of the method to real-world examples: visualization of an institution collaboration graph and of a cross-sell recommendation graph.
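The three steps above can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions made only for illustration: neighborhoods are found by breadth-first search up to a hypothetical `radius`, weights fall off as `decay ** distance`, the hierarchy is built by repeated two-way K-Means splits under cosine similarity, and the 2D mapping recursively divides the unit square between the two halves of each split. All function and parameter names are hypothetical.

```python
import math
import random
from collections import deque

def neighborhood_vectors(adj, radius=2, decay=0.5):
    """Step 1: one sparse vector (a dict) per vertex, non-zero for nearby vertices.
    The weight decay ** distance (and weight 1 for the vertex itself) is an assumption."""
    vectors = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            if dist[u] == radius:
                continue  # do not expand beyond the neighborhood radius
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        vectors[source] = {v: decay ** d for v, d in dist.items()}
    return vectors

def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * b.get(k, 0.0) for k, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def centroid(vectors, members):
    """Component-wise mean of the members' sparse vectors."""
    total = {}
    for m in members:
        for k, w in vectors[m].items():
            total[k] = total.get(k, 0.0) + w
    return {k: w / len(members) for k, w in total.items()}

def bisect(vectors, members, rng, iterations=10):
    """Step 2 building block: a 2-means split using cosine similarity."""
    centers = [vectors[s] for s in rng.sample(members, 2)]
    parts = ([], [])
    for _ in range(iterations):
        parts = ([], [])
        for m in members:
            s0 = cosine(vectors[m], centers[0])
            s1 = cosine(vectors[m], centers[1])
            parts[0 if s0 >= s1 else 1].append(m)
        if not parts[0] or not parts[1]:
            break  # degenerate split; the caller falls back to halving
        centers = [centroid(vectors, p) for p in parts]
    return parts

def layout(vectors, members, box, rng, coords, depth=0):
    """Steps 2 and 3 together: split each cluster in two and divide its
    rectangle between the halves, so members of one cluster stay close."""
    if not members:
        return
    x0, y0, x1, y1 = box
    if len(members) == 1:
        coords[members[0]] = ((x0 + x1) / 2, (y0 + y1) / 2)
        return
    left, right = bisect(vectors, members, rng)
    if not left or not right:
        half = len(members) // 2
        left, right = members[:half], members[half:]
    share = len(left) / len(members)  # area proportional to cluster size
    if depth % 2 == 0:  # alternate between vertical and horizontal cuts
        xm = x0 + (x1 - x0) * share
        boxes = ((x0, y0, xm, y1), (xm, y0, x1, y1))
    else:
        ym = y0 + (y1 - y0) * share
        boxes = ((x0, y0, x1, ym), (x0, ym, x1, y1))
    layout(vectors, left, boxes[0], rng, coords, depth + 1)
    layout(vectors, right, boxes[1], rng, coords, depth + 1)

def visualize(adj, seed=0):
    """Return a dict mapping each vertex to (x, y) in the unit square."""
    rng = random.Random(seed)
    coords = {}
    layout(neighborhood_vectors(adj), list(adj), (0.0, 0.0, 1.0, 1.0), rng, coords)
    return coords

# Usage: two triangles joined by a single bridge edge (2-3).
graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
for vertex, (x, y) in sorted(visualize(graph).items()):
    print(vertex, round(x, 3), round(y, 3))
```

Dividing the rectangle recursively is only one simple way to keep members of the same cluster near each other; the abstract does not specify how the hierarchy is mapped to the plane, so this choice should be read as a placeholder for the paper's actual mapping.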
Keywords: Hierarchical Cluster, Sparse Matrix, Graph Transformation, Cosine Similarity, Original Graph