Using Formal Concept Analysis for the Extraction of Groups of Co-expressed Genes

  • Mehdi Kaytoue-Uberall
  • Sébastien Duplessis
  • Amedeo Napoli
Part of the Communications in Computer and Information Science book series (CCIS, volume 14)


In this paper, we present a data-mining approach for gene expression matrices. The method aims to extract formal concepts, each representing a set of genes that show similar quantitative variations of expression in certain biological situations or environments. Formal Concept Analysis is used for both its data-mining and its information-representation capabilities. The method is structured in three steps: numerical data are turned into binary data, formal concepts are then extracted, and finally the concepts are filtered with a new formalism. The method has been applied to a gene expression dataset obtained from the fungal species Laccaria bicolor. The paper ends with a discussion and research perspectives.
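The three steps named in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not specify the scaling procedure or the filtering formalism, so a single fixed threshold stands in for the scaling, and a minimum-size constraint stands in for the filter. The toy matrix, the gene and situation names, and the functions `binarize`, `formal_concepts`, and `filter_concepts` are all hypothetical.

```python
from itertools import combinations


def binarize(matrix, threshold):
    """Step 1 (assumed scaling): keep a situation for a gene when its
    expression value reaches the threshold."""
    return {
        gene: {situation for situation, value in row.items() if value >= threshold}
        for gene, row in matrix.items()
    }


def formal_concepts(context):
    """Step 2: enumerate formal concepts (extent, intent) by closing every
    subset of situations. Exponential, so only suitable for toy inputs."""
    situations = sorted(set().union(*context.values()))
    concepts = set()
    for size in range(len(situations) + 1):
        for subset in combinations(situations, size):
            # Extent: genes that are expressed in every situation of the subset.
            extent = frozenset(g for g, s in context.items() if set(subset) <= s)
            # Intent: situations shared by every gene of the extent.
            if extent:
                intent = frozenset(set.intersection(*(context[g] for g in extent)))
            else:
                intent = frozenset(situations)
            concepts.add((extent, intent))
    return concepts


def filter_concepts(concepts, min_genes, min_situations):
    """Step 3 (stand-in filter, not the paper's formalism): keep concepts
    that are large enough on both sides."""
    return [
        (extent, intent)
        for extent, intent in concepts
        if len(extent) >= min_genes and len(intent) >= min_situations
    ]


# Hypothetical expression matrix: gene -> {situation: expression value}.
expression = {
    "g1": {"s1": 9.0, "s2": 8.5, "s3": 1.0},
    "g2": {"s1": 7.5, "s2": 9.1, "s3": 0.5},
    "g3": {"s1": 0.8, "s2": 7.0, "s3": 6.5},
}

context = binarize(expression, threshold=5.0)
concepts = formal_concepts(context)
selected = filter_concepts(concepts, min_genes=2, min_situations=2)

for extent, intent in selected:
    print(sorted(extent), sorted(intent))
```

On this toy input, the only concept that survives the filter groups genes g1 and g2, which are both expressed above the threshold in situations s1 and s2. A dedicated concept-mining algorithm would replace the brute-force enumeration on real matrices.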


Gene expression, formal concept analysis, scaling





Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Mehdi Kaytoue-Uberall (1)
  • Sébastien Duplessis (2)
  • Amedeo Napoli (1)
  1. LORIA, Vandoeuvre-lès-Nancy Cedex, France
  2. INRA UMR 1136 – Interactions Arbres/Microorganismes, Champenoux, France
