Bayesian Learning for Feed-Forward Neural Network with Application to Proteomic Data: The Glycosylation Sites Detection of the Epidermal Growth Factor-Like Proteins Associated with Cancer as a Case Study

  • Alireza Shaneh
  • Gregory Butler
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4013)

Abstract

Neural networks have been applied to several problems in proteomics; however, the design and use of a neural network depend on the nature of the problem and the dataset studied. The Bayesian framework is a consistent learning paradigm that lets a feed-forward neural network infer knowledge from experimental data. Bayesian regularization automates the learning process by pruning the unnecessary weights of a feed-forward neural network. This paper demonstrates the technique and applies it, as a case study, to detecting glycosylation sites in epidermal growth factor-like repeat proteins involved in cancer. After applying the Bayesian framework, the number of network parameters decreased by 47.62%. Model performance increased by more than 34.92% compared with the One Step Secant method. Bayesian learning also produced more consistent outcomes than the One Step Secant method; however, it is computationally complex and slow, and the role of prior knowledge and its correlation with model selection should be studied further.
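The abstract does not give the update equations, so the following is only a minimal sketch of how Bayesian regularization can estimate the number of parameters a model effectively uses. It assumes MacKay's evidence framework applied to a linear-in-weights model as a stand-in for the feed-forward network; the function name `evidence_regularization`, the synthetic data, and all parameter values are hypothetical and are not taken from the paper.

```python
import numpy as np


def evidence_regularization(Phi, t, alpha=1.0, beta=1.0, n_iter=50):
    """Re-estimate the weight-decay (alpha) and noise (beta) hyperparameters.

    Assumption: a linear model t ~ Phi @ w with a Gaussian prior on w,
    used here in place of a feed-forward network for illustration only.
    Returns the posterior mean weights, alpha, beta, and gamma, the
    effective number of well-determined parameters.
    """
    n_samples, n_weights = Phi.shape
    gram = Phi.T @ Phi
    # Eigenvalues of the (unscaled) data term of the Hessian.
    eigvals = np.clip(np.linalg.eigvalsh(gram), 0.0, None)

    def posterior(alpha, beta):
        A = beta * gram + alpha * np.eye(n_weights)   # posterior precision
        m = beta * np.linalg.solve(A, Phi.T @ t)      # posterior mean
        lam = beta * eigvals
        gamma = np.sum(lam / (lam + alpha))           # effective parameters
        return m, gamma

    for _ in range(n_iter):
        m, gamma = posterior(alpha, beta)
        alpha = gamma / (m @ m)
        beta = (n_samples - gamma) / np.sum((t - Phi @ m) ** 2)

    m, gamma = posterior(alpha, beta)
    return m, alpha, beta, gamma


def make_demo_data(seed=0, n_samples=200):
    """Synthetic data: 10 inputs, 5 of which nearly duplicate the other 5."""
    rng = np.random.default_rng(seed)
    base = rng.normal(size=(n_samples, 5))
    Phi = np.hstack([base, base + 1e-3 * rng.normal(size=base.shape)])
    w_true = np.array([1.5, -2.0, 0.5, 0.0, 0.0])
    t = base @ w_true + 0.1 * rng.normal(size=n_samples)
    return Phi, t


if __name__ == "__main__":
    Phi, t = make_demo_data()
    m, alpha, beta, gamma = evidence_regularization(Phi, t)
    print(f"nominal parameters: {Phi.shape[1]}, effective parameters: {gamma:.2f}")
```

In this sketch, gamma never exceeds the nominal number of weights. Directions in weight space that the data constrain poorly contribute little to it, which is the sense in which the regularizer identifies unnecessary weights.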

Keywords

Glycosylation Site · Hide Unit · Proteomic Data · Bayesian Learning · Window Frame
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Alireza Shaneh (1)
  • Gregory Butler (1)
  1. Research Laboratory for Bioinformatics Technology, Department of Computer Science and Software Engineering, Concordia University, Montreal, Quebec, Canada
