Design of String Kernel to Predict Protein Functional Sites Using Kernel-Based Classifiers

Maji, Pradipta; Paul, Sushmita

doi:10.1007/978-3-319-05630-2_3

Chapter
First Online: 01 January 2014

Abstract

The prediction of functional sites in proteins is an important problem in bioinformatics. It is a key issue in protein function studies and, hence, in drug design.

Author information

Authors and Affiliations

Indian Statistical Institute, Kolkata, West Bengal, India
Pradipta Maji & Sushmita Paul

Corresponding author

Correspondence to Pradipta Maji.

About this chapter

Cite this chapter

Maji, P., Paul, S. (2014). Design of String Kernel to Predict Protein Functional Sites Using Kernel-Based Classifiers. In: Scalable Pattern Recognition Algorithms. Springer, Cham. https://doi.org/10.1007/978-3-319-05630-2_3

DOI: https://doi.org/10.1007/978-3-319-05630-2_3
Published: 20 March 2014
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-05629-6
Online ISBN: 978-3-319-05630-2
eBook Packages: Computer Science, Computer Science (R0)
