Abstract
Motivated by the fact that distances between data points in many real-world clustering instances are often based on heuristic measures, Bilu and Linial [6] proposed analyzing objective based clustering problems under the assumption that the optimum clustering to the objective is preserved under small multiplicative perturbations to distances between points. In this paper, we provide several results within this framework. For separable center-based objectives, we present an algorithm that can optimally cluster instances resilient to \((1 + \sqrt{2})\)-factor perturbations, solving an open problem of Awasthi et al. [2]. For the k-median objective, we additionally give algorithms for a weaker, relaxed, and more realistic assumption in which we allow the optimal solution to change in a small fraction of the points after perturbation. We also provide positive results for min-sum clustering which is a generally much harder objective than k-median (and also non-center-based). Our algorithms are based on new linkage criteria that may be of independent interest.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
References
Arya, V., Garg, N., Khandekar, R., Meyerson, A., Munagala, K., Pandit, V.: Local search heuristics for k-median and facility location problems. SIAM J. Comput. 33(3) (2004)
Awasthi, P., Blum, A., Sheffet, O.: Center-based clustering under perturbation stability. Inf. Process. Lett. 112(1-2), 49–54 (2012)
Balcan, M.F., Gupta, P.: Robust hierarchical clustering. In: COLT (2010)
Balcan, M.F., Liang, Y.: Clustering under Perturbation Resilience. CoRR, abs/1112.0826 (2011)
Bartal, Y., Charikar, M., Raz, D.: Approximating min-sum -clustering in metric spaces. In: STOC (2001)
Bilu, Y., Linial, N.: Are stable instances easy? In: Innovations in Computer Science (2010)
Charikar, M., Guha, S., Tardos, É., Shmoys, D.B.: A constant-factor approximation algorithm for the k-median problem. J. Comput. Syst. Sci. 65(1) (2002)
de la Vega, W.F., Karpinski, M., Kenyon, C., Rabani, Y.: Approximation schemes for clustering problems. In: STOC (2003)
Jain, K., Mahdian, M., Saberi, A.: A new greedy approach for facility location problems. In: STOC (2002)
Reyzin, L.: Data stability in clustering: A closer look. CoRR, abs/1107.2379 (2011)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Balcan, M.F., Liang, Y. (2012). Clustering under Perturbation Resilience. In: Czumaj, A., Mehlhorn, K., Pitts, A., Wattenhofer, R. (eds) Automata, Languages, and Programming. ICALP 2012. Lecture Notes in Computer Science, vol 7391. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-31594-7_6
Download citation
DOI: https://doi.org/10.1007/978-3-642-31594-7_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-31593-0
Online ISBN: 978-3-642-31594-7
eBook Packages: Computer ScienceComputer Science (R0)