BicBioEC: biclustering in biomarker identification for ESCC

  • P. Kakati
  • D. K. Bhattacharyya (corresponding author)
  • J. K. Kalita
Original Article

Abstract

Analysis of gene expression patterns enables identification of significant genes related to a specific disease. We analyze gene expression data for esophageal squamous cell carcinoma (ESCC) using biclustering, gene–gene network topology and pathways to identify significant biomarkers. Biclustering is a clustering technique that extracts genes coexpressed over a subset of samples. We introduce a parallel and robust biclustering algorithm to identify shifted, scaled and shifted-and-scaled biclusters of high biological relevance. Additionally, we introduce a mapping algorithm to establish the module–bicluster relationship across control and disease stages, and a hub-gene identification method to support our analysis framework. The C-CUDA implementation of our biclustering algorithm makes the method attractive because of its speed and the accuracy of its results. Biomarkers such as CCNB1, CDK4, and KRT5 have been found to be closely associated with ESCC.
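
To make the three pattern types concrete, the following minimal Python sketch takes one gene's expression profile as a reference and classifies a second profile as shifted (y = x + b), scaled (y = a·x) or shifted-and-scaled (y = a·x + b). It is an illustration only: the function names, the tolerance and the toy values are assumptions, and the least-squares check is a stand-in, not the similarity measure or the algorithm used in the paper.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def pattern_type(x, y, tol=1e-9):
    """Classify y relative to x under an exact linear model y = a*x + b.

    Hypothetical helper for illustration; tol is an assumed tolerance.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    vx = sum((a - mx) ** 2 for a in x)
    if vx == 0:
        return "none"  # a constant reference profile carries no pattern
    # Least-squares estimates of the scaling factor a and the shift b.
    a = sum((p - mx) * (q - my) for p, q in zip(x, y)) / vx
    b = my - a * mx
    # Reject profiles that are not an (almost) exact linear image of x.
    if abs(a) < tol or any(abs(q - (a * p + b)) > tol for p, q in zip(x, y)):
        return "none"
    if abs(a - 1) < tol and abs(b) < tol:
        return "identical"
    if abs(a - 1) < tol:
        return "shifted"
    if abs(b) < tol:
        return "scaled"
    return "shifted-and-scaled"


# Toy profiles over four samples (made-up values, not data from the study).
base = [1.0, 2.0, 4.0, 3.0]
shifted = [v + 5 for v in base]
scaled = [2 * v for v in base]
both = [2 * v + 5 for v in base]
unrelated = [4.0, 1.0, 2.0, 9.0]

for name, profile in [("shifted", shifted), ("scaled", scaled),
                      ("both", both), ("unrelated", unrelated)]:
    print(name, pattern_type(base, profile), round(pearson(base, profile), 3))
```

All three pattern types are perfectly correlated with the reference profile, which is why correlation alone cannot tell them apart and the sketch inspects the fitted a and b instead.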

Keywords

Gene expression, Bicluster, Primary gene, Secondary gene, Biomarkers, SSSIM, GPU computing

Supplementary material

Supplementary material 1: 13721_2019_200_MOESM1_ESM.pdf (PDF, 100 KB)

Copyright information

© Springer-Verlag GmbH Austria, part of Springer Nature 2019

Authors and Affiliations

  • P. Kakati (1)
  • D. K. Bhattacharyya (1, corresponding author)
  • J. K. Kalita (2)
  1. Department of Computer Science and Engineering, Tezpur University, Tezpur, India
  2. Department of Computer Science, University of Colorado, Colorado Springs, USA
