A Neural Procedure for Gene Function Prediction

Part of the Smart Innovation, Systems and Technologies book series (SIST, volume 19)

Abstract

The graph classification problem consists in extending a partial node labeling of a weighted graph to all of its nodes. In many real-world contexts, such as Gene Function Prediction, the partial labeling is unbalanced: positive labels are far fewer than negative ones. In this paper we present a new neural algorithm for predicting labels in the presence of label imbalance. The algorithm is based on a family of Hopfield networks described by two continuous parameters and one discrete parameter, and it consists of two main steps: 1) the network parameters are learnt through a cost-sensitive optimization procedure based on local search; 2) a suitable Hopfield network restricted to the unlabeled nodes is considered and simulated. The equilibrium point reached induces the classification of the unlabeled nodes. An experimental analysis on real-world unbalanced data in the context of genome-wide prediction of gene functions shows the effectiveness of the proposed approach.
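
As a rough illustration of step 2, the following minimal sketch simulates a Hopfield network restricted to the unlabeled nodes, with the labeled nodes clamped to their known values. It is not the authors' implementation: the activation values (+1 and -1), the single `threshold` parameter, the all-negative initial state, and the function name `simulate_restricted_hopfield` are assumptions made for the example. The paper's actual parameterization and its cost-sensitive learning step are not reproduced here.

```python
def simulate_restricted_hopfield(weights, labels, threshold=0.0, max_sweeps=100):
    """Asynchronous Hopfield dynamics on the unlabeled nodes only.

    weights: symmetric matrix (list of lists) with zero diagonal.
    labels:  list with +1 (positive), -1 (negative), or None (unlabeled).
    Returns a dict mapping each unlabeled node index to its predicted label.
    """
    n = len(labels)
    unlabeled = [i for i in range(n) if labels[i] is None]
    # Labeled nodes stay clamped; unlabeled nodes start as negative (assumption).
    state = [lab if lab is not None else -1 for lab in labels]
    for _ in range(max_sweeps):
        changed = False
        for i in unlabeled:
            # Weighted input from all neighbours, labeled and unlabeled.
            field = sum(weights[i][j] * state[j] for j in range(n) if j != i)
            new_value = 1 if field - threshold > 0 else -1
            if new_value != state[i]:
                state[i] = new_value
                changed = True
        if not changed:
            break  # equilibrium: no unlabeled node wants to flip
    return {i: state[i] for i in unlabeled}


# Toy graph (hypothetical data): node 0 is positive, node 1 is negative,
# nodes 2 and 3 are unlabeled.
W = [
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
    [1.0, 0.0, 0.0, 0.2],
    [0.0, 1.0, 0.2, 0.0],
]
labels = [1, -1, None, None]
result = simulate_restricted_hopfield(W, labels)
```

In this toy graph, node 2 is strongly tied to the positive node and node 3 to the negative node, so the equilibrium assigns them opposite labels. With symmetric weights and a zero diagonal, asynchronous updates of this kind reach a fixed point after finitely many flips.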

Keywords

Neural Network, Hopfield Network, Gene Function Prediction

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  1. Dipartimento di Scienze dell’Informazione, Università degli Studi di Milano, Milano, Italy
