CMSB 2006: Computational Methods in Systems Biology, pp. 312–322
Possibilistic Approach to Biclustering: An Application to Oligonucleotide Microarray Data Analysis
Abstract
Identifying genes that behave similarly across different conditions is an important research objective that has recently been tackled with biclustering techniques. In this paper we introduce a new approach to the biclustering problem based on the Possibilistic Clustering paradigm. The proposed Possibilistic Biclustering algorithm finds one bicluster at a time, assigning to each gene and to each condition a membership to the bicluster. The biclustering problem, in which one seeks to maximize the size of the bicluster while minimizing the residual, is cast as the optimization of a suitable functional. We applied the algorithm to the Yeast database, obtaining fast convergence and good-quality solutions. We discuss the effects of parameter tuning and the sensitivity of the method to parameter values. Comparisons with other methods from the literature are also presented.
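To make the trade-off between bicluster size and residual concrete, the following is a minimal sketch in Python. It assumes that "residual" refers to the mean squared residue of Cheng and Church, a common choice in the biclustering literature; the abstract itself does not give the formula. The function name `mean_squared_residue` and the crisp (0/1) row and column selections are illustrative assumptions, and the possibilistic memberships used by the proposed algorithm are not modeled here.

```python
from typing import Sequence


def mean_squared_residue(
    data: Sequence[Sequence[float]],
    rows: Sequence[int],
    cols: Sequence[int],
) -> float:
    """Mean squared residue of the crisp bicluster (rows x cols).

    Assumption: this is the Cheng-Church residue, used here only to
    illustrate the "residual" mentioned in the abstract.
    """
    n, m = len(rows), len(cols)
    if n == 0 or m == 0:
        raise ValueError("bicluster must contain at least one row and one column")

    # Mean of each selected row over the selected columns.
    row_mean = {i: sum(data[i][j] for j in cols) / m for i in rows}
    # Mean of each selected column over the selected rows.
    col_mean = {j: sum(data[i][j] for i in rows) / n for j in cols}
    # Mean over the whole bicluster.
    total_mean = sum(row_mean.values()) / n

    # Average squared deviation from a purely additive row + column model.
    return sum(
        (data[i][j] - row_mean[i] - col_mean[j] + total_mean) ** 2
        for i in rows
        for j in cols
    ) / (n * m)


if __name__ == "__main__":
    # Rows differ only by a constant shift, so the additive model fits exactly.
    additive = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [4.0, 5.0, 6.0]]
    print(mean_squared_residue(additive, rows=[0, 1, 2], cols=[0, 1, 2]))
```

Under this definition, a bicluster whose rows differ only by additive shifts has a residue of zero, so a good solution covers as many genes and conditions as possible (size `len(rows) * len(cols)`) while keeping this value small.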
Keywords
Gene Expression Data; Probabilistic Constraint; Picard Iteration; Possibilistic Approach; Frequent Pattern Mining Algorithm