Abstract
We present a novel approach to clustering sets of protein sequences, based on Inductive Logic Programming (ILP). Preliminary results show that the proposed method produces understandable descriptions (explanations) of the clusters. Furthermore, it can be used as a knowledge elicitation tool to explain clusters proposed by other clustering approaches, such as standard phylogenetic programs.
This work has been partially supported by the project ILP-Web-Service (PTDC/EIA/70841/2006) and by Fundação para a Ciência e Tecnologia. Nuno A. Fonseca is funded by FCT grant SFRH/BPD/26737/2006.
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Fonseca, N.A., Costa, V.S., Camacho, R., Vieira, C., Vieira, J. (2009). Partitional Clustering of Protein Sequences – An Inductive Logic Programming Approach. In: Omatu, S., et al. Distributed Computing, Artificial Intelligence, Bioinformatics, Soft Computing, and Ambient Assisted Living. IWANN 2009. Lecture Notes in Computer Science, vol 5518. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02481-8_152
DOI: https://doi.org/10.1007/978-3-642-02481-8_152
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-02480-1
Online ISBN: 978-3-642-02481-8
eBook Packages: Computer Science (R0)