Abstract
Cluster analysis is the division of a dataset into subsets of related objects; these subsets are usually disjoint. Clustering algorithms vary considerably. Some represent the dataset as a graph and use graph-based properties to generate the clusters, yet many graph properties remain unexplored as the basis for a clustering algorithm. In graph theory, a subgraph of a graph is distance-preserving if the distance (length of a shortest path) between every pair of vertices in the subgraph is the same as the corresponding distance in the original graph. In this paper, we consider the question of finding proper distance-preserving subgraphs, and the problem of partitioning a simple graph into an arbitrary number of distance-preserving subgraphs for clustering purposes. We then present a clustering algorithm called DP-Cluster, based on the notion of distance-preserving subgraphs. We also introduce relaxation values into the distance-preserving subgraph-finding heuristic embedded in DP-Cluster, and investigate this and other variations of the algorithm. Because the analysis of social networks makes considerable use of graph theory, we evaluate the performance of DP-Cluster on two real-world social network datasets.
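To make the definition of a distance-preserving subgraph concrete, the following is a minimal sketch of a checker for it. It is not the DP-Cluster algorithm or its heuristic, which the abstract does not specify. It assumes an unweighted, undirected simple graph stored as an adjacency dictionary, and it assumes the subgraph is the one induced by a chosen vertex set. The function names `bfs_distances` and `is_distance_preserving` are hypothetical, introduced only for this illustration.

```python
from collections import deque


def bfs_distances(adj, source, allowed=None):
    """Hop distances from `source` by breadth-first search.

    If `allowed` is given, the search only visits vertices in that set,
    which yields distances inside the induced subgraph.
    """
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist and (allowed is None or v in allowed):
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def is_distance_preserving(adj, vertices):
    """Return True if the subgraph induced by `vertices` keeps every
    pairwise distance equal to the distance in the whole graph."""
    vertices = set(vertices)
    for u in vertices:
        full = bfs_distances(adj, u)                        # distances in the original graph
        induced = bfs_distances(adj, u, allowed=vertices)   # distances inside the subgraph
        for v in vertices:
            # Unreachable vertices map to None on both sides and compare equal.
            if induced.get(v) != full.get(v):
                return False
    return True


# Illustrative graph (an assumption, not from the chapter): a cycle on six vertices.
cycle = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}

short_arc = is_distance_preserving(cycle, {0, 1, 2, 3})
long_arc = is_distance_preserving(cycle, {0, 1, 2, 3, 4})
```

On the six-vertex cycle, the arc on vertices 0 to 3 is distance-preserving because no shorter route exists around the other side, whereas the arc on vertices 0 to 4 is not: vertices 0 and 4 are four steps apart inside the arc but only two steps apart in the full cycle.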
Copyright information
© 2013 Springer-Verlag Wien
About this chapter
Cite this chapter
Nussbaum, R., Esfahanian, AH., Tan, PN. (2013). Clustering Social Networks Using Distance-Preserving Subgraphs. In: Özyer, T., Rokne, J., Wagner, G., Reuser, A. (eds) The Influence of Technology on Social Network Analysis and Mining. Lecture Notes in Social Networks, vol 6. Springer, Vienna. https://doi.org/10.1007/978-3-7091-1346-2_14
DOI: https://doi.org/10.1007/978-3-7091-1346-2_14
Publisher Name: Springer, Vienna
Print ISBN: 978-3-7091-1345-5
Online ISBN: 978-3-7091-1346-2
eBook Packages: Computer Science (R0)