Improved Approximation Algorithms for Bipartite Correlation Clustering

  • Nir Ailon
  • Noa Avigdor-Elgrabli
  • Edo Liberty
  • Anke van Zuylen
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6942)

Abstract

In this work we study the problem of Bipartite Correlation Clustering (BCC), a natural bipartite counterpart of the well-studied Correlation Clustering (CC) problem. Given a bipartite graph, the objective of BCC is to generate a set of vertex-disjoint bi-cliques (clusters) whose edge set minimizes the symmetric difference with the edge set of the input graph. The best known approximation algorithm for BCC, due to Amit (2004), guarantees an approximation ratio of 11.
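To make the objective concrete, the following is a minimal sketch of how the cost of a candidate clustering could be evaluated. The function name `bcc_cost`, the representation of a cluster as a pair of left and right vertex sets, and the convention that unclustered vertices count as singletons are assumptions made for illustration; they are not taken from the paper.

```python
from itertools import product


def bcc_cost(left, right, edges, clusters):
    """Count left-right pairs on which a clustering disagrees with the graph.

    `clusters` is a list of (left_subset, right_subset) pairs (hypothetical
    representation). A pair (l, r) is an error if it is an edge split across
    clusters, or a non-edge placed inside the same cluster.
    """
    edge_set = set(edges)
    cluster_of = {}
    for idx, (left_part, right_part) in enumerate(clusters):
        for v in left_part:
            cluster_of[("L", v)] = idx
        for v in right_part:
            cluster_of[("R", v)] = idx

    cost = 0
    for l, r in product(left, right):
        cl = cluster_of.get(("L", l))
        cr = cluster_of.get(("R", r))
        # Vertices missing from every cluster are treated as singletons.
        together = cl is not None and cl == cr
        if together != ((l, r) in edge_set):
            cost += 1
    return cost


left = ["a", "b"]
right = ["x", "y"]
edges = [("a", "x"), ("a", "y"), ("b", "x")]

one_cluster = [({"a", "b"}, {"x", "y"})]
print(bcc_cost(left, right, edges, one_cluster))
```

In this toy instance the single bi-clique contains the non-edge (b, y), so the clustering pays for exactly one pair.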

In this paper we present two algorithms. The first is an improved 4-approximation algorithm. However, like the previous approximation algorithm, it requires solving a large convex problem which becomes prohibitive even for modestly sized tasks.

The second algorithm, and our main contribution, is a simple randomized combinatorial algorithm. It also achieves an expected 4-approximation factor, is trivial to implement, and is highly scalable. The analysis extends a method developed by Ailon, Charikar and Newman in 2008, where a randomized pivoting algorithm was analyzed for obtaining a 3-approximation algorithm for CC. For analyzing our algorithm for BCC, considerably more sophisticated arguments are required in order to take advantage of the bipartite structure.
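For orientation, the randomized pivoting scheme for ordinary (non-bipartite) CC mentioned above can be sketched as follows: repeatedly pick a uniformly random remaining vertex as the pivot, cluster it with its remaining positive neighbors, and remove that cluster. This is only a sketch of the CC pivoting idea, not the bipartite algorithm of this paper, whose details are not given in the abstract. The function name `pivot_cluster` and the input format are assumptions.

```python
import random


def pivot_cluster(vertices, positive_edges, rng=None):
    """Randomized pivoting for correlation clustering (illustrative sketch).

    `positive_edges` lists the pairs labeled "similar"; every other pair is
    treated as "dissimilar". Vertices are assumed to be sortable.
    """
    rng = rng or random.Random()
    neighbors = {v: set() for v in vertices}
    for u, v in positive_edges:
        neighbors[u].add(v)
        neighbors[v].add(u)

    remaining = set(vertices)
    clusters = []
    while remaining:
        # Sorting gives a reproducible order for a seeded generator.
        pivot = rng.choice(sorted(remaining))
        cluster = {pivot} | (neighbors[pivot] & remaining)
        clusters.append(cluster)
        remaining -= cluster
    return clusters


triangles = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6)]
result = pivot_cluster([1, 2, 3, 4, 5, 6], triangles, random.Random(0))
print(sorted(sorted(c) for c in result))
```

On two disjoint triangles every choice of pivot recovers the two triangles as clusters, so the output does not depend on the random seed.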

Whether it is possible to achieve (or beat) the 4-approximation factor using a scalable and deterministic algorithm remains an open problem.

Keywords

Bipartite Graph; Input Graph; Approximation Guarantee; Correlation Cluster; Output Cluster
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

References

  1. Guo, J., Hüffner, F., Komusiewicz, C., Zhang, Y.: Improved Algorithms for Bicluster Editing. In: Agrawal, M., Du, D.-Z., Duan, Z., Li, A. (eds.) TAMC 2008. LNCS, vol. 4978, pp. 445–456. Springer, Heidelberg (2008)
  2. Symeonidis, P., Nanopoulos, A., Papadopoulos, A., Manolopoulos, Y.: Nearest-biclusters collaborative filtering (2006)
  3. Amit, N.: The bicluster graph editing problem (2004)
  4. Madeira, S.C., Oliveira, A.L.: Biclustering algorithms for biological data analysis: A survey. IEEE/ACM Trans. Comput. Biol. Bioinformatics 1, 24–45 (2004)
  5. Cheng, Y., Church, G.M.: Biclustering of expression data. In: Proceedings of the Eighth International Conference on Intelligent Systems for Molecular Biology, pp. 93–103. AAAI Press, Menlo Park (2000)
  6. Bansal, N., Blum, A., Chawla, S.: Correlation clustering. Machine Learning 56, 89–113 (2004)
  7. Fern, X.Z., Brodley, C.E.: Solving cluster ensemble problems by bipartite graph partitioning. In: Proceedings of the Twenty-First International Conference on Machine Learning, ICML 2004, p. 36. ACM, New York (2004)
  8. Zha, H., He, X., Ding, C., Simon, H., Gu, M.: Bipartite graph partitioning and data clustering. In: Proceedings of the Tenth International Conference on Information and Knowledge Management, CIKM 2001, pp. 25–32. ACM, New York (2001)
  9. Demaine, E.D., Emanuel, D., Fiat, A., Immorlica, N.: Correlation clustering in general weighted graphs. Theoretical Computer Science (2006)
  10. Charikar, M., Guruswami, V., Wirth, A.: Clustering with qualitative information. J. Comput. Syst. Sci. 71(3), 360–383 (2005)
  11. Ailon, N., Charikar, M., Newman, A.: Aggregating inconsistent information: Ranking and clustering. J. ACM 55(5), 1–27 (2008)
  12. Ailon, N., Liberty, E.: Correlation Clustering Revisited: The True Cost of Error Minimization Problems. In: Albers, S., Marchetti-Spaccamela, A., Matias, Y., Nikoletseas, S., Thomas, W. (eds.) ICALP 2009. LNCS, vol. 5555, pp. 24–36. Springer, Heidelberg (2009)
  13. van Zuylen, A., Williamson, D.P.: Deterministic pivoting algorithms for constrained ranking and clustering problems. Math. Oper. Res. 34(3), 594–620 (2009); Preliminary version appeared in SODA 2007 (with Rajneesh Hegde and Kamal Jain)
  14. Giotis, I., Guruswami, V.: Correlation clustering with a fixed number of clusters. In: Proceedings of the Seventeenth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pp. 1167–1176. ACM, New York (2006)
  15. Karpinski, M., Schudy, W.: Linear time approximation schemes for the Gale-Berlekamp game and related minimization problems. CoRR, abs/0811.3244 (2008)

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Nir Ailon (1)
  • Noa Avigdor-Elgrabli (1)
  • Edo Liberty (2)
  • Anke van Zuylen (3)
  1. Technion, Haifa, Israel
  2. Yahoo! Research, Haifa, Israel
  3. Max-Planck Institut für Informatik, Saarbrücken, Germany
