Journal of Signal Processing Systems

, Volume 50, Issue 3, pp 267–280

Discovering Biclusters by Iteratively Sorting with Weighted Correlation Coefficient in Gene Expression Data



We propose a framework for biclustering gene expression profiles. This framework applies dominant set approach to create sets of sorting vectors for the sorting of the rows in the data matrix. In this way, the coexpressed rows of gene expression vectors could be gathered. We iteratively sort and transpose the gene expression data matrix to gather the blocks of coexpressed subset. Weighted correlation coefficient is used to measure the similarity in the gene level and the condition level. Their weights are updated each time using the sorting vector of the previous iteration. In this way, the highly correlated bicluster is located at one corner of the rearranged gene expression data matrix. We applied our approach to synthetic data and three real gene expression data sets with encouraging results. Secondly, we propose ACV (average correlation value) to evaluate the homogeneity of a bicluster or a data matrix. This criterion conforms to the intuitive biological notion of coexpressed set of genes or samples and is compared with the mean squared residue score. ACV is found to be more appropriate for both additive models and multiplicative models.


biclustering gene expression data microarray data weighted correlation coefficient 


Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

Copyright information

© Springer Science+Business Media, LLC 2007

Authors and Affiliations

  1. 1.Department of Computer Science and EngineeringThe Chinese University of HongkongHong KongPeople’s Republic of China

Personalised recommendations