Design of String Kernel to Predict Protein Functional Sites Using Kernel-Based Classifiers

Abstract

The prediction of functional sites in proteins is an important problem in bioinformatics. It is a key issue in protein function studies and, hence, in drug design.

Author information

Corresponding author

Correspondence to Pradipta Maji.

Copyright information

© 2014 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Maji, P., Paul, S. (2014). Design of String Kernel to Predict Protein Functional Sites Using Kernel-Based Classifiers. In: Scalable Pattern Recognition Algorithms. Springer, Cham. https://doi.org/10.1007/978-3-319-05630-2_3

  • DOI: https://doi.org/10.1007/978-3-319-05630-2_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-05629-6

  • Online ISBN: 978-3-319-05630-2

  • eBook Packages: Computer Science, Computer Science (R0)
