Correlated Protein Function Prediction with Robust Feature Selection

  • Conference paper
Bio-inspired Computing: Theories and Applications (BIC-TA 2019)

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 1160))

Abstract

Determining the functional roles of proteins is vital for understanding life at the molecular level and has significant biomedical and pharmaceutical implications. With the development of high-throughput techniques, enormous amounts of protein-protein interaction (PPI) data have been collected, providing an important and feasible basis for protein function prediction. Accordingly, many approaches assign biological functions to proteins directly from PPI networks. However, because the topology of real PPI networks is extremely complex, global optimization or clustering on these networks is difficult and time-consuming. In addition, biological functions are often highly correlated, so the functions assigned to a protein are not independent of one another. To address these challenges, we propose a two-stage function annotation method with robust feature selection. First, we transform the network into low-dimensional node representations via manifold learning. Then, we integrate functional correlation into a multi-label linear regression framework and introduce a robust sparse penalty to perform function assignment and representative feature selection simultaneously. For optimization, we design an efficient iterative algorithm whose subproblems each have a closed-form solution. Extensive experiments against baseline methods on Saccharomyces cerevisiae data demonstrate the effectiveness of the proposed approach.
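
The two-stage pipeline described in the abstract can be illustrated with a minimal sketch. The abstract does not give the objective function, so everything specific here is an assumption rather than the authors' formulation: the manifold learning step is stood in for by a Laplacian-eigenmaps style spectral embedding, functional correlation is taken to be cosine similarity between label columns and is injected by smoothing the label matrix, and the robust sparse penalty is assumed to be an L2,1 norm solved by iterative reweighting, where each iteration is a closed-form linear solve. The function names, the toy random graph, and the parameter values are all hypothetical.

```python
import numpy as np

def spectral_embedding(A, dim):
    """Embed nodes of a symmetric adjacency matrix (assumed stand-in for manifold learning)."""
    deg = A.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(deg, 1e-12))
    # Symmetric normalized Laplacian: I - D^{-1/2} A D^{-1/2}
    L = np.eye(A.shape[0]) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L)          # eigenvalues returned in ascending order
    return vecs[:, 1:dim + 1]            # skip the first (trivial) eigenvector

def label_correlation(Y):
    """Cosine similarity between function (label) columns; an assumed correlation measure."""
    norms = np.maximum(np.linalg.norm(Y, axis=0), 1e-12)
    return (Y.T @ Y) / np.outer(norms, norms)

def l21_regression(X, Y, beta=0.1, n_iter=50, eps=1e-8):
    """Minimize ||XW - Y||_F^2 + beta * ||W||_{2,1} by iterative reweighting."""
    D = np.eye(X.shape[1])
    XtX, XtY = X.T @ X, X.T @ Y
    for _ in range(n_iter):
        # Closed-form update of W for the current reweighting matrix D
        W = np.linalg.solve(XtX + beta * D, XtY)
        # D_ii = 1 / (2 * ||w_i||_2); eps keeps the update finite for zero rows
        row_norms = np.sqrt((W ** 2).sum(axis=1) + eps)
        D = np.diag(1.0 / (2.0 * row_norms))
    return W

# Toy data only: a random undirected graph and random binary annotations.
rng = np.random.default_rng(0)
n, k, dim = 30, 4, 5
A = (rng.random((n, n)) < 0.2).astype(float)
A = np.triu(A, 1)
A = A + A.T
Y = (rng.random((n, k)) < 0.3).astype(float)

X = spectral_embedding(A, dim)               # stage 1: node representations
C = label_correlation(Y)
Y_corr = Y @ C                               # assumed way to inject functional correlation
W = l21_regression(X, Y_corr)                # stage 2: regression with row-sparse weights
scores = X @ W                               # per-protein, per-function scores
feature_importance = np.linalg.norm(W, axis=1)  # row norms rank embedding features
```

Under an L2,1 penalty, rows of `W` with small norm correspond to embedding dimensions that contribute little to any function, which is how one regression can perform assignment and feature selection together.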

Acknowledgements

This work was supported by the Key Natural Science Project of Anhui Provincial Education Department (KJ2018A0023), the Guangdong Province Science and Technology Plan Projects (2017B010110011), the Anhui Key Research and Development Plan (1804a09020101), the National Basic Research Program (973 Program) of China (2015CB351705) and the National Natural Science Foundation of China (61906002, 61402002, 61876002 and 61860206004).

Author information

Corresponding author

Correspondence to Zhuanlian Ding.

Copyright information

© 2020 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Sun, D., Sun, H., Wu, H., Liang, H., Ding, Z. (2020). Correlated Protein Function Prediction with Robust Feature Selection. In: Pan, L., Liang, J., Qu, B. (eds) Bio-inspired Computing: Theories and Applications. BIC-TA 2019. Communications in Computer and Information Science, vol 1160. Springer, Singapore. https://doi.org/10.1007/978-981-15-3415-7_1

  • DOI: https://doi.org/10.1007/978-981-15-3415-7_1

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-15-3414-0

  • Online ISBN: 978-981-15-3415-7

  • eBook Packages: Computer Science (R0)
