Genetic Programming for Predicting Protein Networks
Predicting functional associations between proteins remains one of the main unsolved problems in molecular biology. We apply Genetic Programming (GP) to this domain. GP evolves an expression, equivalent to a binary classifier, that predicts whether a given pair of proteins interacts. We take advantage of the flexibility of GP, in particular the possibility of defining new operations. In this paper, missing values are handled by if-unknown, a new operation that better matches the semantics of the domain data. In addition, to reduce solution size and computation time, we use the Tarpeian method, which controls the bloat effect of GP. The results obtained confirm the feasibility of using GP in this domain and show that the Tarpeian method improves both search efficiency and the interpretability of solutions.
Keywords: Protein interaction prediction, genetic programming, data integration, bioinformatics, evolutionary computation, machine learning, classification, control bloat
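The two mechanisms named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not give the exact form of if-unknown, so the two-argument form here (a value and a fallback) is an assumption, as is the use of `None` to mark a missing value. The feature names `coexpression` and `phylo_similarity`, the threshold of 1.0, and the default `p_kill=0.3` are hypothetical. The Tarpeian step follows the general idea of the method: an individual larger than the population's mean size is, with some probability, assigned the worst fitness without being evaluated.

```python
import random
from typing import Callable, Optional


def if_unknown(value: Optional[float], fallback: float) -> float:
    """Return `value` when it is known, otherwise `fallback`.

    Assumption: a missing evidence score is represented as None.
    """
    return fallback if value is None else value


def predict_interaction(coexpression: Optional[float],
                        phylo_similarity: Optional[float]) -> bool:
    """A hand-written stand-in for an evolved expression (hypothetical features).

    Each missing score contributes 0.0 instead of breaking the arithmetic.
    """
    score = if_unknown(coexpression, 0.0) + if_unknown(phylo_similarity, 0.0)
    return score > 1.0


def tarpeian_fitness(evaluate: Callable[[], float],
                     size: int,
                     mean_size: float,
                     rng: random.Random,
                     p_kill: float = 0.3,
                     worst: float = float("-inf")) -> float:
    """Tarpeian-style bloat control (sketch, higher fitness is better).

    An individual larger than the mean size is, with probability `p_kill`,
    given the worst fitness and never evaluated, which saves the cost of
    running it on the training data.
    """
    if size > mean_size and rng.random() < p_kill:
        return worst
    return evaluate()


if __name__ == "__main__":
    rng = random.Random(0)
    print(predict_interaction(0.8, None))
    print(tarpeian_fitness(lambda: 0.9, size=10, mean_size=20.0, rng=rng))
```

Passing `evaluate` as a callable rather than a precomputed number keeps the sketch faithful to the efficiency argument: penalized individuals are skipped before any evaluation takes place.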