Possibilistic Approach to Biclustering: An Application to Oligonucleotide Microarray Data Analysis

  • Maurizio Filippone
  • Francesco Masulli
  • Stefano Rovetta
  • Sushmita Mitra
  • Haider Banka
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4210)

Abstract

Identifying genes that behave similarly with respect to different conditions is an important research objective that has recently been tackled with biclustering techniques. In this paper we introduce a new approach to the biclustering problem using the Possibilistic Clustering paradigm. The proposed Possibilistic Biclustering algorithm finds one bicluster at a time, assigning a membership to the bicluster for each gene and for each condition. The biclustering problem, in which one seeks to maximize the size of the bicluster while minimizing the residual, is formulated as the optimization of a suitable functional. We applied the algorithm to the Yeast database, obtaining fast convergence and good-quality solutions. We discuss the effects of parameter tuning and the sensitivity of the method to parameter values. Comparisons with other methods from the literature are also presented.
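The abstract does not define the residual. As a rough illustration, the sketch below assumes it refers to the mean squared residue commonly used in the biclustering literature (introduced by Cheng and Church), computed here for a crisp bicluster given as lists of row and column indices. The function name `mean_squared_residue` and the toy matrix are hypothetical, and the code does not reproduce the possibilistic memberships or the functional of the proposed algorithm.

```python
def mean_squared_residue(matrix, rows, cols):
    """Mean squared residue of the submatrix selected by rows and cols.

    Assumption: this is the Cheng and Church style residue, not
    necessarily the exact quantity optimized in the paper.
    """
    n, m = len(rows), len(cols)
    sub = [[matrix[i][j] for j in cols] for i in rows]
    row_means = [sum(r) / m for r in sub]
    col_means = [sum(sub[i][j] for i in range(n)) / n for j in range(m)]
    overall = sum(row_means) / n
    total = 0.0
    for i in range(n):
        for j in range(m):
            # Deviation of each entry from a purely additive row + column model.
            residue = sub[i][j] - row_means[i] - col_means[j] + overall
            total += residue ** 2
    return total / (n * m)


# Toy expression matrix (made up): genes 0-2 follow an additive
# pattern on conditions 0-2, while the remaining entries do not.
data = [
    [1.0, 2.0, 3.0, 9.0],
    [2.0, 3.0, 4.0, 0.0],
    [4.0, 5.0, 6.0, 5.0],
    [7.0, 1.0, 8.0, 2.0],
]
coherent = mean_squared_residue(data, [0, 1, 2], [0, 1, 2])
whole = mean_squared_residue(data, [0, 1, 2, 3], [0, 1, 2, 3])
```

Under this assumption, a small coherent bicluster has a residue near zero, while enlarging it to the whole matrix raises the residue. That is the trade-off between bicluster size and residual that the abstract describes.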

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Maurizio Filippone (1)
  • Francesco Masulli (1)
  • Stefano Rovetta (1)
  • Sushmita Mitra (2, 3)
  • Haider Banka (2)

  1. DISI, Dept. Computer and Information Sciences, University of Genova and CNISM, Genova, Italy
  2. Center for Soft Computing: A National Facility, Indian Statistical Institute, Kolkata, India
  3. Machine Intelligence Unit, Indian Statistical Institute, Kolkata, India
