New heuristics for the Bicluster Editing Problem

  • S.I.: CLAIO 2014
Annals of Operations Research

Abstract

The NP-hard Bicluster Editing Problem (BEP) consists of editing a minimum number of edges of an input bipartite graph G in order to transform it into a vertex-disjoint union of complete bipartite subgraphs. Editing an edge consists of either adding it to the graph or deleting it from the graph. Applications of the BEP include data mining and analysis of gene expression data. In this work, we generate and analyze random bipartite instances for the BEP to perform empirical tests. A new reduction rule for the problem is proposed, based on the concept of critical independent sets, providing an effective reduction in the size of the instances. We also propose a set of heuristics using concepts of the metaheuristics ILS, VNS, and GRASP, including a constructive heuristic based on analyzing vertex neighborhoods, three local search procedures, and an auxiliary data structure to speed up the local search. Computational experiments show that our heuristics outperform other methods from the literature with respect to both solution quality and computational time.
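
To make the objective concrete, the following minimal sketch counts the edits a candidate solution would require under the definition above: one addition for every missing edge inside a bicluster and one deletion for every edge running between two biclusters. The names `editing_cost` and `random_bipartite`, the label-dictionary representation, and the toy graph are illustrative assumptions, not the authors' data structures or heuristics. The generator draws each possible edge independently with probability `p`, which is one common random model assumed here for illustration; the abstract does not state which model the authors used.

```python
import random
from itertools import product


def editing_cost(left, right, edges, label):
    """Count the edge edits needed to turn the bipartite graph into the
    bicluster graph described by `label` (vertex -> bicluster id)."""
    edge_set = set(edges)
    cost = 0
    for u, v in product(left, right):
        same_bicluster = label[u] == label[v]
        present = (u, v) in edge_set
        # Missing edge inside a bicluster -> one addition;
        # existing edge between two biclusters -> one deletion.
        if same_bicluster != present:
            cost += 1
    return cost


def random_bipartite(n_left, n_right, p, seed=None):
    """Hypothetical instance generator: each of the n_left * n_right
    possible edges is included independently with probability p."""
    rng = random.Random(seed)
    left = [f"u{i}" for i in range(n_left)]
    right = [f"v{j}" for j in range(n_right)]
    edges = [(u, v) for u, v in product(left, right) if rng.random() < p]
    return left, right, edges


# Toy instance: two intended biclusters, {u0, u1, v0, v1} and {u2, v2}.
left = ["u0", "u1", "u2"]
right = ["v0", "v1", "v2"]
edges = [("u0", "v0"), ("u0", "v1"), ("u1", "v0"), ("u1", "v2"), ("u2", "v2")]
label = {"u0": 0, "u1": 0, "v0": 0, "v1": 0, "u2": 1, "v2": 1}

# Edge (u1, v1) must be added and edge (u1, v2) must be deleted.
print(editing_cost(left, right, edges, label))  # prints 2
```

This function only evaluates a given partition; finding the partition that minimizes the count is the NP-hard part that the heuristics in the article address.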

Author information

Corresponding author

Correspondence to Gilberto F. de Sousa Filho.

About this article

Cite this article

de Sousa Filho, G.F., Júnior, T.L.B., Cabral, L.A.F. et al. New heuristics for the Bicluster Editing Problem. Ann Oper Res 258, 781–814 (2017). https://doi.org/10.1007/s10479-016-2261-x

  • DOI: https://doi.org/10.1007/s10479-016-2261-x
