Mining Structural Databases: An Evolutionary Multi-Objective Conceptual Clustering Methodology
The increased availability of biological databases containing representations of complex objects permits access to vast amounts of data. Despite the recent renewed interest in knowledge-discovery techniques (data mining), there is a dearth of data analysis methods that help explain the represented objects and related systems through their most representative features and the relationships derived from those features (i.e., structural data). In this paper we propose a conceptual clustering methodology, termed EMO-CC (Evolutionary Multi-Objective Conceptual Clustering), that uses multi-objective and multi-modal optimization techniques based on evolutionary algorithms to uncover representative substructures in structural databases. EMO-CC also annotates the uncovered substructures and, based on these annotations, applies an unsupervised classification approach to retrieve new members of previously discovered substructures. We apply EMO-CC to the Gene Ontology database to recover interesting substructures that describe problems from different points of view, and we use them to explain immuno-inflammatory responses measured in terms of gene expression profiles derived from longitudinal blood expression profiles of human volunteers treated with intravenous endotoxin compared to placebo.
Keywords: Pareto Front; Structural Database; Pareto Optimal Front; Origin Recognition Complex; Conceptual Cluster
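The multi-objective idea behind the abstract can be illustrated with a minimal Pareto-front filter. This is only a sketch and not the EMO-CC algorithm itself: the abstract does not state which objectives EMO-CC optimizes, so the two scores below (imagined as something like the coverage and the specificity of a candidate substructure, both maximized) and the names `dominates` and `pareto_front` are assumptions made purely for illustration.

```python
from typing import List, Tuple

# Each candidate substructure is scored on two HYPOTHETICAL objectives,
# both to be maximized (the abstract does not name the real objectives).
Score = Tuple[float, float]


def dominates(a: Score, b: Score) -> bool:
    """True if a is at least as good as b on every objective and strictly better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scores: List[Score]) -> List[Score]:
    """Keep the candidates that no other candidate dominates."""
    return [s for s in scores if not any(dominates(t, s) for t in scores)]


# Made-up scores for five candidate substructures.
candidates = [(0.9, 0.2), (0.6, 0.6), (0.2, 0.9), (0.5, 0.5), (0.1, 0.1)]
front = pareto_front(candidates)
print(front)
```

Here the first three candidates trade one objective against the other, so none of them dominates another and all remain on the front, while the last two are dominated by `(0.6, 0.6)` and are discarded. An evolutionary multi-objective method would typically maintain and evolve such a non-dominated set rather than collapse the objectives into a single score.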