Abstract
In a clustering problem one has to partition a set of elements into homogeneous and well-separated subsets. From a graph-theoretic point of view, a cluster graph is a vertex-disjoint union of cliques. The clustering problem is the task of making the fewest changes to the edge set of an input graph so that it becomes a cluster graph. We study the complexity of three variants of the problem. In the Cluster Completion variant, edges can only be added. In Cluster Deletion, edges can only be deleted. In Cluster Editing, both edge additions and edge deletions are allowed. We also study these variants when the desired solution must contain a prespecified number of clusters.
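To make the definition concrete, the following is a minimal Python sketch of a test for whether a graph is a cluster graph. It is an illustration only and is not taken from the paper; the function names, the vertex labelling `0..n-1`, and the edge-list representation are assumptions, and the input is assumed to be a simple undirected graph without self-loops. The sketch uses the fact that a connected component on k vertices is a clique exactly when it contains k(k-1)/2 edges.

```python
from collections import defaultdict


def connected_components(n, edges):
    """Return the connected components of an undirected graph on vertices 0..n-1."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen = set()
    components = []
    for start in range(n):
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        component = []
        while stack:  # iterative depth-first search
            x = stack.pop()
            component.append(x)
            for y in adj[x]:
                if y not in seen:
                    seen.add(y)
                    stack.append(y)
        components.append(component)
    return components


def is_cluster_graph(n, edges):
    """True iff every connected component is a clique (assumes no self-loops)."""
    distinct_edges = {frozenset(e) for e in edges}  # ignore duplicates and direction
    # A component with k vertices has at most k*(k-1)/2 edges, so the totals
    # match only when every component attains that bound, i.e. is a clique.
    max_edges = sum(len(c) * (len(c) - 1) // 2 for c in connected_components(n, edges))
    return len(distinct_edges) == max_edges
```

For example, a triangle together with a disjoint edge is a cluster graph, whereas a path on three vertices is not, because its two endpoints are not adjacent.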
We show that Cluster Editing is NP-complete, Cluster Deletion is NP-hard to approximate to within some constant factor, and Cluster Completion is polynomial. When the desired solution must contain exactly p clusters, we show that Cluster Editing is NP-complete for every p ≥ 2; Cluster Deletion is polynomial for p = 2 but NP-complete for p > 2; and Cluster Completion is polynomial for any p. We also give a constant factor approximation algorithm for Cluster Editing when p = 2.
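The polynomial bound for unconstrained Cluster Completion can be illustrated with a short Python sketch. It rests on a standard observation, not necessarily the argument used in the paper: adding edges can never separate two connected vertices, so each connected component must end up inside a single clique, and turning every component into a clique is therefore optimal. The function name and the input representation (vertices `0..n-1`, an edge list, no self-loops) are assumptions, and the sketch does not cover the variant with exactly p clusters.

```python
def completion_cost(n, edges):
    """Minimum number of edge additions that turn the graph into a cluster graph.

    Assumes a simple undirected graph on vertices 0..n-1 with no self-loops.
    """
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv  # merge the two components

    # Count the vertices in each connected component.
    sizes = {}
    for v in range(n):
        root = find(v)
        sizes[root] = sizes.get(root, 0) + 1

    # Edges needed for every component to be a clique, minus edges already present.
    needed = sum(k * (k - 1) // 2 for k in sizes.values())
    present = len({frozenset(e) for e in edges})
    return needed - present
```

On a path with three vertices the function returns 1, corresponding to the single edge that joins the two endpoints; on a graph that is already a cluster graph it returns 0.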
Supported in part by the Israel Science Foundation (grant number 565/99).
Supported by an Eshkol fellowship from the Ministry of Science, Israel.
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Shamir, R., Sharan, R., Tsur, D. (2002). Cluster Graph Modification Problems. In: Goos, G., Hartmanis, J., van Leeuwen, J., Kučera, L. (eds) Graph-Theoretic Concepts in Computer Science. WG 2002. Lecture Notes in Computer Science, vol 2573. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36379-3_33
DOI: https://doi.org/10.1007/3-540-36379-3_33
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-00331-1
Online ISBN: 978-3-540-36379-8
eBook Packages: Springer Book Archive