Supporting Clinico-Genomic Knowledge Discovery: A Multi-strategy Data Mining Process

  • Alexandros Kanterakis
  • George Potamias
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3955)


We present a combined clinico-genomic knowledge discovery (CGKD) process for linking gene-expression (microarray) data with clinical patient data. The process follows a multi-strategy mining approach that integrates three distinct data-mining components: clustering (based on a discretized k-means approach), association rule mining, and feature selection for identifying discriminant genes. The proposed CGKD process is applied to a real-world gene-expression profiling study (clinical outcome of breast cancer patients). Assessment of the results demonstrates the rationality and reliability of the approach.
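To make the first two components more concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a median-based binarization of each gene's expression profile, a k-means-style clustering of the binarized genes using Hamming distance with majority-vote centroids, and a plain support/confidence computation for a rule linking a gene cluster's state to a clinical attribute. The function names (`discretize`, `kmodes`, `rule_stats`), the farthest-first initialization, and the toy data are all illustrative assumptions. The feature-selection component is not covered.

```python
from statistics import median


def discretize(profiles):
    """Binarize each gene across samples: 1 if above the gene's median, else 0.

    `profiles` maps a gene name to a list of expression values (one per sample).
    The median cut-off is an assumption made for this sketch.
    """
    out = {}
    for gene, values in profiles.items():
        cut = median(values)
        out[gene] = [1 if v > cut else 0 for v in values]
    return out


def hamming(a, b):
    """Number of positions at which two equal-length binary vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def kmodes(vectors, k, iters=10):
    """k-means-style clustering of binary vectors.

    Distance is Hamming distance; each centroid is the per-position
    majority vote of its members (ties resolved toward 1).
    """
    # Farthest-first initialization (an assumption, chosen to avoid
    # starting with duplicate centroids).
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        far = max(vectors, key=lambda v: min(hamming(v, c) for c in centroids))
        centroids.append(list(far))

    labels = [0] * len(vectors)
    for _ in range(iters):
        # Assignment step: nearest centroid by Hamming distance.
        labels = [
            min(range(k), key=lambda c: hamming(v, centroids[c]))
            for v in vectors
        ]
        # Update step: majority vote per sample position.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [
                    1 if 2 * sum(col) >= len(members) else 0
                    for col in zip(*members)
                ]
    return labels, centroids


def rule_stats(antecedent, consequent):
    """Support and confidence of the rule antecedent -> consequent.

    Both arguments are per-sample 0/1 (or boolean) lists of equal length.
    """
    n = len(antecedent)
    both = sum(1 for a, c in zip(antecedent, consequent) if a and c)
    ante = sum(1 for a in antecedent if a)
    support = both / n
    confidence = both / ante if ante else 0.0
    return support, confidence


if __name__ == "__main__":
    # Hypothetical toy data: 4 genes measured in 6 patients.
    expression = {
        "g1": [5.0, 6.0, 5.5, 1.0, 0.5, 1.2],
        "g2": [4.8, 5.9, 6.1, 0.9, 1.1, 0.7],
        "g3": [0.4, 0.8, 0.6, 3.9, 4.2, 4.0],
        "g4": [0.5, 0.7, 0.9, 4.1, 3.8, 4.4],
    }
    poor_outcome = [1, 1, 1, 0, 0, 1]  # hypothetical clinical attribute

    binary = discretize(expression)
    labels, centroids = kmodes(list(binary.values()), k=2)
    print(dict(zip(binary, labels)))
    # Rule: "cluster 0 is over-expressed" -> "poor outcome"
    print(rule_stats(centroids[0], poor_outcome))
```

In this sketch each cluster centroid acts as a discretized "meta-gene", so a rule over the centroid summarizes the behaviour of all genes in the cluster. Whether the paper forms its rules in this way is not stated in the abstract.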


Keywords: Association Rule, Association Rule Mining, Clinical Attribute, Discriminant Gene, Clinical Patient Data
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Alexandros Kanterakis (1)
  • George Potamias (1)
  1. Institute of Computer Science, Foundation for Research & Technology – Hellas (FORTH), Heraklion, Crete, Greece
