Abstract
The Cluster Editing problem asks to transform a graph into a disjoint union of cliques using a minimum number of edge modifications, that is, edge insertions and deletions. Although the problem has been proven NP-complete several times, it has attracted much research from both the theoretical and the applied side. The problem has been the inspiration for numerous algorithms in bioinformatics that aim at clustering entities such as genes, proteins, phenotypes, or patients. In this paper, we review exact and heuristic methods that have been proposed for the Cluster Editing problem, as well as applications of these algorithms to biological problems.
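To make the problem statement concrete, the following is a minimal brute-force sketch in Python. It is not one of the exact or heuristic methods reviewed in the chapter: it simply enumerates every partition of the vertex set and counts the edge insertions and deletions needed to reach the corresponding disjoint union of cliques. The function names (`editing_cost`, `partitions`, `cluster_editing_bruteforce`) and the vertex numbering `0..n-1` are assumptions made for illustration, and the enumeration is only feasible for very small graphs.

```python
from itertools import combinations


def editing_cost(n, edges, clusters):
    """Count edge modifications needed to turn the graph into the
    disjoint union of cliques described by `clusters`."""
    edge_set = {frozenset(e) for e in edges}
    label = {}
    for idx, cluster in enumerate(clusters):
        for v in cluster:
            label[v] = idx
    cost = 0
    for u, v in combinations(range(n), 2):
        same_cluster = label[u] == label[v]
        has_edge = frozenset((u, v)) in edge_set
        # A missing edge inside a cluster must be inserted;
        # an edge between two clusters must be deleted.
        if same_cluster != has_edge:
            cost += 1
    return cost


def partitions(items):
    """Yield every set partition of the list `items`."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in partitions(rest):
        # `first` forms a new block of its own ...
        yield [[first]] + part
        # ... or joins one of the existing blocks.
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]


def cluster_editing_bruteforce(n, edges):
    """Return (minimum cost, one optimal clustering) by exhaustive search."""
    best = None
    for part in partitions(list(range(n))):
        cost = editing_cost(n, edges, part)
        if best is None or cost < best[0]:
            best = (cost, part)
    return best


if __name__ == "__main__":
    # A path on three vertices needs exactly one modification:
    # delete one of its edges, or insert the missing edge 0-2.
    print(cluster_editing_bruteforce(3, [(0, 1), (1, 2)])[0])
```

The number of partitions grows super-exponentially with the number of vertices, which is why the literature turns to parameterized algorithms, integer programming, and heuristics for instances of practical size.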
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Böcker, S., Baumbach, J. (2013). Cluster Editing. In: Bonizzoni, P., Brattka, V., Löwe, B. (eds) The Nature of Computation. Logic, Algorithms, Applications. CiE 2013. Lecture Notes in Computer Science, vol 7921. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-39053-1_5
DOI: https://doi.org/10.1007/978-3-642-39053-1_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-39052-4
Online ISBN: 978-3-642-39053-1
eBook Packages: Computer Science, Computer Science (R0)