Solving Biclustering with a GRASP-Like Metaheuristic: Two Case-Studies on Gene Expression Analysis

  • Angelo Facchiano
  • Paola Festa
  • Anna Marabotti
  • Luciano Milanesi
  • Francesco Musacchia
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7548)

Abstract

The explosion of ‘‘omics’’ data over the past few decades has generated an increasing need of efficiently analyzing high-dimensional gene expression data in several different and heterogenous contexts, such as for example in information retrieval, knowledge discovery, and data mining. For this reason, biclustering, or simultaneous clustering of both genes and conditions has generated considerable interest over the past few decades. Unfortunately, the problem of locating the most significant bicluster has been shown to be NP-complete. We have designed and implemented a GRASP-like heuristic algorithm to efficiently find good solutions in reasonable running times, and to overcome the inner intractability of the problem from a computational point of view.

Experimental results on two datasets of expression data are promising indicating that this algorithm is able to find significant biclusters, especially from a biological point of view.

Keywords

Biclustering gene expression analysis GRASP combinatorial optimization approximate solutions 

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. 1.
    Hartigan, J.: Direct clustering of a data matrix. J. Am. Stat. Assoc. 67, 123–127 (1972)CrossRefGoogle Scholar
  2. 2.
    Cheng, Y., Church, G.M.: Biclustering of expression data. In: Altman, R., Bailey, T., Bourne, P., Gribskov, M., Lengauer, T., Shindyalov, I. (eds.) Proceedings of the 8th International Conference on Intelligent Systems for Molecular Biology (ISMB 2000), pp. 93–103 (2000)Google Scholar
  3. 3.
    Madeira, S., Oliveira, A.: Biclustering algorithms for biological data analysis: A survey. IEEE/ACM Trans. Comput. Biol. Bioinform. 1, 24–45 (2004)CrossRefGoogle Scholar
  4. 4.
    Tanay, A., Sharan, R., Shamir, R.: Discovering statistically significant biclusters in gene expression data. Bioinformatics 18(suppl. 1), S136–S144 (2002)CrossRefGoogle Scholar
  5. 5.
    Wang, H., Wang, W., Yang, J., Yu, P.: Clustering by pattern similarity in large data sets. In: Proc. 2002 ACM SIGMOD Int’l Conf. Management of Data, pp. 394–405 (2002)Google Scholar
  6. 6.
    Getz, G., Levine, E., Domany, E.: Coupled two-way clustering analysis of gene microarray data. Proc. Natl. Acad. Sci. USA 97 22, 12079–12084 (2000)CrossRefGoogle Scholar
  7. 7.
    Tang, C., Zhang, L., Zhang, I., Ramanathan, M.: Interrelated two-way clustering: An unsupervised approach for gene expression data analysis. In: Proc. Second IEEE Int’l Symp. Bioinformatics and Bioeng., pp. 41–48 (2001)Google Scholar
  8. 8.
    Duffy, D., Quiroz, A.: A permutation based algorithm for block clustering. J. Classif. 8, 65–91 (1991)MathSciNetCrossRefGoogle Scholar
  9. 9.
    Cho, H., Dhillon, I., Guan, Y., Sra, S.: Minimum Sum-Squared Residue Co-clustering of Gene Expression Data. In: Berry, M., Dayal, U. (eds.) Proceedings of the 4th SIAM Int’l Conf. Data Mining (2004)Google Scholar
  10. 10.
    Yang, J., Wang, W., Wang, H., Yu, P.: δ-clusters: Capturing subspace correlation in a large data set. In: Proc. 18th IEEE Int’l Conf. Data Eng., pp. 517–528 (2002)Google Scholar
  11. 11.
    Yang, J., Wang, W., Wang, H., Yu, P.: Enhanced biclustering on expression data. In: Proc. Third IEEE Conf. Bioinformatics and Bioeng., pp. 321–327 (2003)Google Scholar
  12. 12.
    Klugar, Y., Basri, R., Chang, J., Gerstein, M.: Spectral biclustering of microarray data: Coclustering genes and conditions. Genome Res. 13, 703–716 (2003)CrossRefGoogle Scholar
  13. 13.
    Segal, E., Taskar, B., Gasch, A., Friedman, N., Koller, D.: Rich probabilistic models for gene expression. Bioinformatics 17(suppl. 1), S243–S252 (2001)CrossRefGoogle Scholar
  14. 14.
    Sheng, Q., Moreau, Y., Moor, B.D.: Biclustering microarray data by gibbs sampling. Bioinformatics 19(suppl. 2), ii196–ii205 (2003)Google Scholar
  15. 15.
    Manjunath Aradhya, V.N., Masulli, F., Rovetta, S.: A Novel Approach for Biclustering Gene Expression Data Using Modular Singular Value Decomposition. In: Masulli, F., Peterson, L.E., Tagliaferri, R. (eds.) CIBB 2009. LNCS, vol. 6160, pp. 254–265. Springer, Heidelberg (2010)CrossRefGoogle Scholar
  16. 16.
    Bryan, K., Cunningham, P., Bolshakova, N.: Application of simulated annealing to the biclustering of gene expression data. IEEE Trans. Inf. Technol. Biomed. 10(3), 519–525 (2006)CrossRefGoogle Scholar
  17. 17.
    Mitra, S., Banka, H.: Multi-objective evolutionary biclustering of gene expression data. Pattern Recogn. 39, 2464–2477 (2006)MATHCrossRefGoogle Scholar
  18. 18.
    Dharan, S., Nair, A.: Biclustering of gene expression data using reactive greedy randomized adaptive search procedure. BMC Bioinformatics 10(suppl. 1), S27 (2009)CrossRefGoogle Scholar
  19. 19.
    Tanay, A., Sharan, R., Shamir, R.: Biclustering Algorithms: A Survey. In: Aluru, S. (ed.) Handbook of Computational Molecular Biology. Computer and Information Science Series. S. Chapman & Hall/CRC (2005)Google Scholar
  20. 20.
    Peeters, R.: The maximum edge biclique problem is NP-Complete. Discrete Appl. Math. 131(3), 651–654 (2003)MathSciNetMATHCrossRefGoogle Scholar
  21. 21.
    Feo, T., Resende, M.: A probabilistic heuristic for a computationally difficult set covering problem. Oper. Res. Lett. 8, 67–71 (1989)MathSciNetMATHCrossRefGoogle Scholar
  22. 22.
    Feo, T., Resende, M.: Greedy randomized adaptive search procedures. J. Global Optim. 6, 109–133 (1995)MathSciNetMATHCrossRefGoogle Scholar
  23. 23.
    Festa, P., Resende, M.: GRASP: An annotated bibliography. In: Ribeiro, C., Hansen, P. (eds.) Essays and Surveys on Metaheuristics, pp. 325–367. Kluwer Academic Publishers (2002)Google Scholar
  24. 24.
    Festa, P., Resende, M.: An annotated bibliography of GRASP – Part I: Algorithms. International Transactions in Operational Research 16(1), 1–24 (2009)MathSciNetMATHCrossRefGoogle Scholar
  25. 25.
    Festa, P., Resende, M.: An annotated bibliography of GRASP – Part II: Applications. International Transactions in Operational Research 16(2), 131–172 (2009)MathSciNetMATHCrossRefGoogle Scholar
  26. 26.
    Prais, M., Ribeiro, C.: Reactive GRASP: An application to a matrix decomposition problem in TDMA traffic assignment. INFORMS J. Comput. 12, 164–176 (2000)MathSciNetMATHCrossRefGoogle Scholar
  27. 27.
    Binato, S., Oliveira, G.: A Reactive GRASP for transmission network expansion planning. In: Ribeiro, C., Hansen, P. (eds.) Essays and Surveys on Metaheuristics, pp. 81–100. Kluwer Academic Publishers (2002)Google Scholar
  28. 28.
    Delmaire, H., Díaz, J., Fernández, E., Ortega, M.: Reactive GRASP and tabu search based heuristics for the single source capacitated plant location problem. INFOR 37, 194–225 (1999)Google Scholar
  29. 29.
    Tavazoie, S., Hughes, J., Campbell, M.J., Cho, R.J., Church, G.M.: Systematic determination of genetic network architecture. Nat. Genet. 22, 281–285 (1999)CrossRefGoogle Scholar
  30. 30.
    Alizadeh, A., Eisen, M., Davis, R., Ma, C., Lossos, I., Rosenwald, A., Boldrick, J., Sabet, H., Tran, T., Yu, X., Powell, J., Yang, L., Marti, G., Moore, T., Hudson, J., Lu, L., Lewis, D., Tibshirani, R., Sherlock, G., Chan, W., Greiner, T., Weisenburger, D., Armitage, J., Warnke, R., Levy, R., Wilson, W., Grever, M., Byrd, J., Botstein, D., Brown, P., Staudt, L.: Distinct types of diffuse large b-cell lymphoma identified by gene expression profiling. Nature 403, 503–511 (2000)CrossRefGoogle Scholar
  31. 31.
  32. 32.
    Mi, H., Dong, Q., Muruganujan, A., Gaudet, P., Lewis, S., Thomas, P.: PANTHER version 7: improved phylogenetic trees, orthologs and collaboration with the gene ontology consortium. Nucleic Acids Res. 38, D204–D210 (2010)CrossRefGoogle Scholar
  33. 33.
    Frinhani, R.M.D., Silva, R.M.A., Mateus, G.R., Festa, P., Resende, M.G.C.: GRASP with Path-Relinking for Data Clustering: A Case Study for Biological Data. In: Pardalos, P.M., Rebennack, S. (eds.) SEA 2011. LNCS, vol. 6630, pp. 410–420. Springer, Heidelberg (2011)CrossRefGoogle Scholar
  34. 34.
    Laguna, M., Martí, R.: GRASP and path relinking for 2-layer straight line crossing minimization. INFORMS J. Comput. 11, 44–52 (1999)MATHCrossRefGoogle Scholar
  35. 35.
    Festa, P., Pardalos, P., Resende, M., Ribeiro, C.: Randomized heuristics for the MAX-CUT problem. Optim. Methods Softw. 7, 1033–1058 (2002)MathSciNetCrossRefGoogle Scholar
  36. 36.
    Mladenović, N., Hansen, P.: Variable neighborhood search. Comput. Oper. Res. 24, 1097–1100 (1997)MathSciNetMATHCrossRefGoogle Scholar

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Angelo Facchiano
    • 1
  • Paola Festa
    • 2
  • Anna Marabotti
    • 3
  • Luciano Milanesi
    • 3
  • Francesco Musacchia
    • 2
  1. 1.Istituto di Scienze dell’Alimentazione - CNRItaly
  2. 2.University of Napoli ‘‘Federico II’’Italy
  3. 3.Istituto di Tecnologie Biomediche - CNRItaly

Personalised recommendations