Toward the Discovery of Itemsets with Significant Variations in Gene Expression Matrices

  • Mehdi Kaytoue
  • Sébastien Duplessis
  • Amedeo Napoli
Conference paper
Part of the Studies in Classification, Data Analysis, and Knowledge Organization book series (STUDIES CLASS)

Abstract

Gene expression matrices are numerical tables that describe the level of expression of genes in different situations, characterizing their behaviour. Biologists are interested in identifying groups of genes presenting similar quantitative variations of expression. This paper presents new syntactic constraints for itemset mining in particular Boolean gene expression matrices. A two-dimensional gene expression profile representation is introduced and adapted to itemset mining, allowing one to control gene expression. Syntactic constraints are used to discover itemsets with significant expression variations from a large collection of gene expression profiles.
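
As an illustration only, the general idea of constrained itemset mining on a Boolean encoding of expression profiles can be sketched as follows. This is not the method of the paper: the toy data, the equal-width discretization, the (situation, level) item encoding, the brute-force enumeration, and the names `level`, `to_transactions`, `has_significant_variation` and `mine` are all assumptions made for this sketch. The constraint used here keeps an itemset only if it holds at most one item per situation and its expression levels differ by at least `min_gap`.

```python
from itertools import combinations

# Hypothetical toy data: expression values in [0, 1] for four genes
# measured in three situations (not taken from the paper).
EXPRESSION = {
    "g1": {"s1": 0.9, "s2": 0.1, "s3": 0.5},
    "g2": {"s1": 0.8, "s2": 0.2, "s3": 0.9},
    "g3": {"s1": 0.85, "s2": 0.15, "s3": 0.1},
    "g4": {"s1": 0.2, "s2": 0.25, "s3": 0.3},
}


def level(value, n_levels=3):
    """Map a value in [0, 1] to an integer level using equal-width bins."""
    return min(int(value * n_levels), n_levels - 1)


def to_transactions(expression, n_levels=3):
    """Boolean encoding: each gene becomes a set of (situation, level) items."""
    return {
        gene: frozenset((s, level(v, n_levels)) for s, v in profile.items())
        for gene, profile in expression.items()
    }


def has_significant_variation(itemset, min_gap):
    """Assumed syntactic constraint: levels in the itemset span at least min_gap."""
    levels = [lvl for _, lvl in itemset]
    return max(levels) - min(levels) >= min_gap


def mine(transactions, min_support=2, min_gap=2):
    """Brute-force search for frequent itemsets that satisfy the constraint."""
    items = sorted(set().union(*transactions.values()))
    results = {}
    for size in range(2, len(items) + 1):
        for candidate in combinations(items, size):
            situations = [s for s, _ in candidate]
            # At most one level per situation in an itemset.
            if len(set(situations)) != len(situations):
                continue
            if not has_significant_variation(candidate, min_gap):
                continue
            # Genes whose encoded profile contains every item of the candidate.
            support = sorted(
                g for g, t in transactions.items() if t.issuperset(candidate)
            )
            if len(support) >= min_support:
                results[candidate] = support
    return results


if __name__ == "__main__":
    for itemset, genes in mine(to_transactions(EXPRESSION)).items():
        print(itemset, genes)
```

The exhaustive enumeration is only workable on toy inputs; a real collection of profiles would call for a dedicated frequent or closed itemset miner with the constraint pushed into the search.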

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Mehdi Kaytoue (1)
  • Sébastien Duplessis (2)
  • Amedeo Napoli (1)

  1. LORIA, Campus Scientifique, Vandoeuvre-lès-Nancy, France
  2. Centre INRA Nancy, Champenoux, France
