Abstract
Clustering is one of the most popular approaches to gene expression data analysis. A clustering method is typically used to partition genes according to the similarity of their expression under different conditions. However, some genes often behave similarly under only a subset of conditions, and their behavior is uncorrelated over the remaining conditions. Because traditional clustering methods fail to identify such gene groups, the biclustering paradigm was recently introduced to overcome this limitation. In contrast to traditional clustering, a biclustering method produces biclusters, each of which identifies a set of genes and a set of conditions under which these genes behave similarly. In practice, the boundary of a bicluster is usually fuzzy, because genes and conditions can belong to multiple biclusters at the same time, with different membership degrees. To the best of our knowledge, however, no existing method can discover fuzzy value-coherent biclusters. In this paper, (i) we propose a new fuzzy bicluster model for value-coherent biclusters; (ii) based on this model, we define an objective function whose minimum characterizes good fuzzy value-coherent biclusters; and (iii) we propose a genetic-algorithm-based method, the Genetic Fuzzy Biclustering Algorithm (GFBA), to identify fuzzy value-coherent biclusters. Our experiments show that GFBA converges to the global optimum very efficiently.
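To make the notion of a fuzzy value-coherent bicluster concrete, the sketch below scores a candidate bicluster with a membership-weighted mean squared residue. This is an illustrative assumption, not the objective function defined in the paper: it takes the well-known mean squared residue of Cheng and Church and weights each cell by hypothetical row and column membership degrees in [0, 1]. The function name `fuzzy_msr` and the toy matrix are likewise made up for illustration.

```python
def fuzzy_msr(data, row_mu, col_mu):
    """Membership-weighted mean squared residue of one candidate bicluster.

    data   : list of rows (genes) by columns (conditions)
    row_mu : membership degree of each gene, in [0, 1]       (assumption)
    col_mu : membership degree of each condition, in [0, 1]  (assumption)
    A score of 0 means the weighted submatrix is perfectly additive.
    """
    row_w = sum(row_mu)
    col_w = sum(col_mu)
    if row_w == 0 or col_w == 0:
        raise ValueError("a bicluster needs non-zero total membership")

    n_rows, n_cols = len(data), len(data[0])

    # Weighted row means, column means, and overall mean.
    row_mean = [sum(col_mu[j] * data[i][j] for j in range(n_cols)) / col_w
                for i in range(n_rows)]
    col_mean = [sum(row_mu[i] * data[i][j] for i in range(n_rows)) / row_w
                for j in range(n_cols)]
    all_mean = sum(row_mu[i] * row_mean[i] for i in range(n_rows)) / row_w

    # Weighted average of the squared residues.
    total = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            residue = data[i][j] - row_mean[i] - col_mean[j] + all_mean
            total += row_mu[i] * col_mu[j] * residue * residue
    return total / (row_w * col_w)


# Toy expression matrix: genes 0 and 1 follow the same additive pattern
# over conditions 0..2, while gene 2 and condition 3 do not fit.
data = [
    [1.0, 2.0, 3.0, 9.0],
    [2.0, 3.0, 4.0, 0.0],
    [7.0, 1.0, 8.0, 5.0],
]

coherent = fuzzy_msr(data, [1, 1, 0], [1, 1, 1, 0])   # crisp, coherent block
whole = fuzzy_msr(data, [1, 1, 1], [1, 1, 1, 1])      # entire matrix
partial = fuzzy_msr(data, [1, 1, 0.2], [1, 1, 1, 0])  # gene 2 weakly included
```

With memberships restricted to 0 and 1, the score reduces to the ordinary mean squared residue of the selected submatrix. A search procedure such as a genetic algorithm could, under these assumptions, evolve the membership vectors to minimize a score of this kind.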
This work was partially supported by the Agricultural Experiment Station at the University of the District of Columbia (Project No.: DC-0LIANG; Accession No.: 0203877).
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Fei, X., Lu, S., Pop, H.F., Liang, L.R. (2007). GFBA: A Biclustering Algorithm for Discovering Value-Coherent Biclusters. In: Măndoiu, I., Zelikovsky, A. (eds) Bioinformatics Research and Applications. ISBRA 2007. Lecture Notes in Computer Science, vol 4463. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-72031-7_1
DOI: https://doi.org/10.1007/978-3-540-72031-7_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-72030-0
Online ISBN: 978-3-540-72031-7