Protein function prediction using guilty by association from interaction networks
Protein function prediction from sequence using the Gene Ontology (GO) classification is useful in many biological problems. It has recently attracted increasing interest, thanks in part to the Critical Assessment of Function Annotation (CAFA) challenge. In this paper, we introduce Guilty by Association on STRING (GAS), a tool to predict protein function exploiting protein–protein interaction networks without sequence similarity. The assumption is that whenever a protein interacts with other proteins, it is part of the same biological process and located in the same cellular compartment. GAS retrieves interaction partners of a query protein from the STRING database and measures enrichment of the associated functional annotations to generate a sorted list of putative functions. A performance evaluation based on CAFA metrics and a fair comparison with optimized BLAST similarity searches is provided. The consensus of GAS and BLAST is shown to improve overall performance. The PPI approach is shown to outperform similarity searches for biological process and cellular compartment GO predictions. Moreover, an analysis of the best practices to exploit protein–protein interaction networks is also provided.
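The enrichment step described in the abstract can be illustrated with a small sketch. It is not the GAS implementation. The abstract does not say which enrichment statistic is used, so a one-sided hypergeometric test is assumed here. Retrieval of partners from STRING is not shown, and the function names (`hypergeom_sf`, `rank_functions`) and the toy protein and term identifiers are hypothetical.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) for X ~ Hypergeometric(N items, K annotated, n drawn)."""
    total = comb(N, n)
    upper = min(K, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total


def rank_functions(partners, annotations, background):
    """Rank GO terms by enrichment among the interaction partners.

    partners:    protein ids interacting with the query (assumed already
                 retrieved from an interaction database)
    annotations: dict mapping protein id -> set of GO terms
    background:  list of all protein ids considered
    Returns a list of (term, p_value), most enriched first.
    """
    bg = set(background)
    partners = [p for p in set(partners) if p in bg]
    N, n = len(bg), len(partners)

    # How many background proteins carry each term.
    bg_counts = {}
    for protein in bg:
        for term in annotations.get(protein, ()):
            bg_counts[term] = bg_counts.get(term, 0) + 1

    # How many interaction partners carry each term.
    partner_counts = {}
    for protein in partners:
        for term in annotations.get(protein, ()):
            partner_counts[term] = partner_counts.get(term, 0) + 1

    scored = [
        (term, hypergeom_sf(k, N, bg_counts[term], n))
        for term, k in partner_counts.items()
    ]
    # Lowest p-value first; ties broken by term name for a stable order.
    return sorted(scored, key=lambda item: (item[1], item[0]))


# Toy data (hypothetical identifiers, not real annotations).
toy_annotations = {
    "P1": {"A"}, "P2": {"A"}, "P3": {"A", "B"}, "P4": {"B"},
    "P5": {"B"}, "P6": {"B"}, "P7": {"B"}, "P8": {"C"},
}
toy_background = list(toy_annotations)
ranked = rank_functions(["P1", "P2", "P3"], toy_annotations, toy_background)
for term, p_value in ranked:
    print(term, round(p_value, 4))
```

In this toy case, term A is carried by all three partners but by only three of the eight background proteins, so it ranks ahead of the more common term B. A real pipeline would also need to propagate annotations up the GO hierarchy and correct for multiple testing, neither of which is shown here.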
Keywords: Protein function, Protein interaction network, Gene ontology, CAFA, Protein sequence
The authors are grateful to members of the BioComputing UP lab for insightful discussions. This project was funded by FIRB Futuro in Ricerca grant RBFR08ZSXY, University of Padua grant CPDR123473, and AIRC grant MFAG12740 to S.T. D.P. is funded by FIRC Fondazione Italiana per la Ricerca sul Cancro project no. 16621.
Compliance with ethical standards
Conflict of interest
The authors declare that they have no conflict of interest.