Journal of Signal Processing Systems, Volume 79, Issue 2, pp 159–166

Application of the Bi-CoPaM Method to Five Escherichia Coli Datasets Generated under Various Biological Conditions

  • Basel Abu-Jamous
  • Rui Fa
  • David J. Roberts
  • Asoke K. Nandi


The growing volume of high-throughput biological datasets is prompting the information engineering and machine learning research communities to direct more studies towards designing and applying novel, specialised methods that tackle the problems specific to such datasets. The recently proposed binarisation of consensus partition matrices (Bi-CoPaM) method tackles the problem of scrutinising multiple gene expression microarray datasets to identify the subsets of genes which are consistently co-expressed across them. It yields clustering results which better reflect two biological facts: most of the genes in any cell are expected to be irrelevant to the specific context at hand, and many genes might participate in multiple processes. This is achieved by clustering the given set of genes while allowing each gene one of three outcomes: to be exclusively assigned to a single cluster, to be simultaneously assigned to multiple clusters, or not to be assigned to any cluster. In this study, we expand the scope of application of the Bi-CoPaM method by applying it, for the first time, to bacterial datasets, namely a set of five Escherichia coli datasets generated under different biological conditions, in order to identify the subsets of genes which are consistently co-expressed, i.e. well correlated with each other. We identify two clusters with such consistent co-expression; interestingly, the two clusters are themselves consistently negatively correlated with each other. The first cluster is enriched with genes participating in protein synthesis and DNA repair, while the second is enriched with genes involved in transport. Consequently, we draw biological hypotheses that relate some genes whose biological processes are currently unknown to their potential processes. These hypotheses can serve as pilots for focused future gene discovery studies.
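
The three assignment outcomes described above can be illustrated with a minimal sketch. It assumes that a consensus partition matrix holds, for each gene, a membership value per cluster, and that binarisation is done by a simple membership threshold. The threshold rule, the function names and the gene values below are illustrative assumptions, not the binarisation procedure or data used in the study.

```python
from typing import Dict, List


def binarise_by_threshold(
    consensus: Dict[str, List[float]], threshold: float
) -> Dict[str, List[int]]:
    """For each gene, list the clusters whose membership reaches the threshold.

    Assumption: a plain threshold stands in for the method's binarisation step.
    """
    return {
        gene: [k for k, m in enumerate(memberships) if m >= threshold]
        for gene, memberships in consensus.items()
    }


def describe(assigned: List[int]) -> str:
    """Name which of the three outcomes a gene ended up with."""
    if not assigned:
        return "unassigned"
    if len(assigned) == 1:
        return "exclusive"
    return "multiple"


# Hypothetical consensus memberships for four made-up genes and two clusters.
consensus = {
    "geneA": [0.9, 0.1],
    "geneB": [0.5, 0.5],
    "geneC": [0.2, 0.8],
    "geneD": [0.65, 0.35],
}

loose = binarise_by_threshold(consensus, 0.4)  # permissive: overlap is possible
tight = binarise_by_threshold(consensus, 0.7)  # strict: genes may be left out

for gene in consensus:
    print(gene, describe(loose[gene]), describe(tight[gene]))
```

In this toy example a permissive threshold lets a gene with evenly split membership belong to both clusters, while a strict threshold leaves ambiguous genes unassigned, which mirrors the idea that most genes may be irrelevant to the context being studied.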


Genome-wide analysis; Consistent co-expression; Bi-CoPaM; Escherichia coli; Multiple datasets analysis



This article summarises independent research funded by the National Institute for Health Research (NIHR) under its Programme Grants for Applied Research Programme (Grant Reference Number RP-PG-0310-1004). The views expressed are those of the author(s) and not necessarily those of the NHS, the NIHR or the Department of Health. A.K. Nandi would like to thank TEKES for its award of the Finland Distinguished Professorship.

Disclosure Declaration

No conflict of interest declared.



Copyright information

© Springer Science+Business Media New York 2014

Authors and Affiliations

  • Basel Abu-Jamous (1)
  • Rui Fa (1)
  • David J. Roberts (2, 3)
  • Asoke K. Nandi (1, 4)

  1. Department of Electronic and Computer Engineering, Brunel University, Uxbridge, UK
  2. National Health Service Blood and Transplant, Oxford, UK
  3. Radcliffe Department of Medicine, University of Oxford, John Radcliffe Hospital, Oxford, UK
  4. Department of Mathematical Information Technology, University of Jyväskylä, Jyväskylä, Finland
