Identifying Different Types of Biclustering Patterns Using a Correlation-Based Dilated Biclusters Algorithm

  • Mahmoud Mounir
  • Mohamed Hamdy
  • Mohamed Essam Khalifa
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 921)


An essential step in the analysis of gene expression profiles is the identification of sets of co-regulated genes, or of genes that tend to be active under only a subset of experimental conditions or that participate in multiple cellular processes or functions. Biclustering is an unsupervised technique that goes beyond traditional clustering because it finds groups of genes and conditions simultaneously. In this paper, we propose a biclustering algorithm called Correlation-Based Dilated Biclusters (CBDB) to find sets of biclusters with correlated gene expression patterns. The algorithm proceeds in several phases: preprocessing; determination of elementary biclusters; a dilation phase based on a heuristic search that uses the Pearson correlation coefficient as a measure of coherency; a removal phase that excludes sets of genes and conditions showing a low level of coherency; and, finally, elimination of duplicated and overlapping biclusters. The approach showed reasonable results on both synthetic and real datasets compared with other correlation-based biclustering techniques.
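To make the idea of correlation-driven dilation concrete, the following is a minimal sketch and not the authors' CBDB implementation. It assumes that coherency is scored as the mean absolute pairwise Pearson correlation between the genes (rows) of a bicluster over its conditions (columns), and that dilation greedily adds any gene that keeps this score above a threshold. The function names, the use of the absolute value, the threshold of 0.9, and the toy matrix are all illustrative assumptions.

```python
import numpy as np


def mean_abs_pearson(data, rows, cols):
    """Mean absolute pairwise Pearson correlation between the selected rows,
    computed over the selected columns only (an assumed coherency score)."""
    if len(rows) < 2:
        return 1.0  # a single row is trivially coherent with itself
    sub = data[np.ix_(rows, cols)]
    corr = np.corrcoef(sub)  # each row of `sub` is treated as one variable
    upper = np.triu_indices(len(rows), k=1)  # each pair counted once
    return float(np.mean(np.abs(corr[upper])))


def dilate_rows(data, rows, cols, threshold=0.9):
    """Greedy row dilation (hypothetical): add a row only if the enlarged
    bicluster still scores at least `threshold`."""
    rows = list(rows)
    for g in range(data.shape[0]):
        if g in rows:
            continue
        candidate = rows + [g]
        if mean_abs_pearson(data, candidate, cols) >= threshold:
            rows = candidate
    return rows


# Toy expression matrix: rows are genes, columns are conditions.
base = np.array([1.0, 2.0, 4.0, 3.0, 5.0])
data = np.vstack([
    base,                       # gene 0: base pattern
    base + 10.0,                # gene 1: shifted pattern
    2.0 * base,                 # gene 2: scaled pattern
    [5.0, 1.0, 2.0, 6.0, 3.0],  # gene 3: unrelated pattern
])
cols = [0, 1, 2, 3, 4]

# Start from an elementary bicluster of genes 0 and 1 and dilate it.
result = dilate_rows(data, [0, 1], cols)
print(result)
```

In this toy example the shifted and scaled genes are perfectly correlated with the seed, so the scaled gene is added, while the unrelated gene pulls the score below the threshold and is left out. A fuller treatment would also dilate over columns and handle constant rows, for which the Pearson correlation is undefined.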


Keywords: Clustering, Biclustering, Microarrays, Gene expression profiles, Correlated patterns



Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  • Mahmoud Mounir (1, email author)
  • Mohamed Hamdy (1)
  • Mohamed Essam Khalifa (2)
  1. Information Systems Department, Faculty of Computer and Information Sciences, Ain Shams University, Cairo, Egypt
  2. Basic Sciences Department, Faculty of Computer and Information Sciences, Ain Shams University, Cairo, Egypt
