Abstract
Predicting Calmodulin-binding (CaM-binding) proteins is important in biology and biochemistry because Calmodulin binds and regulates a multitude of protein targets that affect different cellular processes. Short linear motifs (SLiMs) have been used effectively as features for analyzing protein-protein interactions, but their properties have not yet been used to predict CaM-binding proteins. In this study, we propose a new method for predicting CaM-binding proteins that uses, as features, both the total and the average scores of SLiMs in protein sequences, computed with a new scoring method that we call Sliding Window Scoring (SWS). A dataset of 194 manually curated human CaM-binding proteins and 193 mitochondrial proteins was obtained and used to test the proposed model. Multiple EM for Motif Elucidation (MEME) was used to obtain new motifs from the positive and negative datasets individually (the SM approach) and from the combined positive and negative datasets (the CM approach). The wrapper criterion with Random Forest was then applied for feature selection (FS), followed by classification with different algorithms such as k-nearest neighbor (k-NN), support vector machine (SVM), and Random Forest (RF), in a 3-fold cross-validation setup. The proposed method shows promising prediction results and demonstrates that the information contained in SLiMs is highly relevant for predicting CaM-binding proteins.
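The abstract does not define how SWS computes its scores, so the following is only an illustrative sketch of one way a total and an average motif score could be derived from a protein sequence. It assumes a motif is represented as a list of per-position residue scores (similar in spirit to a position-specific matrix such as those produced by motif discovery tools), slides a window of the motif's width along the sequence, and sums and averages the window scores. The names `window_scores`, `motif_features`, and `toy_motif`, as well as the scoring rule itself, are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple

# Hypothetical motif representation: one dict per motif position,
# mapping a residue letter to its score at that position.
Motif = List[Dict[str, float]]


def window_scores(sequence: str, motif: Motif, default: float = 0.0) -> List[float]:
    """Score every window of len(motif) residues along the sequence."""
    width = len(motif)
    scores = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        # Residues absent from a motif column contribute the default score.
        scores.append(sum(col.get(res, default) for col, res in zip(motif, window)))
    return scores


def motif_features(sequence: str, motif: Motif) -> Tuple[float, float]:
    """Return (total, average) of the window scores for one motif.

    Sequences shorter than the motif yield (0.0, 0.0).
    """
    scores = window_scores(sequence, motif)
    if not scores:
        return 0.0, 0.0
    total = sum(scores)
    return total, total / len(scores)


# Toy three-position motif, invented for illustration only.
toy_motif = [{"L": 1.0, "I": 0.5}, {"K": 1.0, "R": 1.0}, {"W": 2.0}]
total, average = motif_features("ALKWLRW", toy_motif)
```

Under these assumptions, each motif contributes two numbers per protein, and concatenating them over all motifs would give a feature vector that could be passed to a feature selection step and a classifier.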
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Li, Y., Maleki, M., Carruthers, N.J., Rueda, L., Stemmer, P.M., Ngom, A. (2017). Prediction of Calmodulin-Binding Proteins Using Short-Linear Motifs. In: Rojas, I., Ortuño, F. (eds) Bioinformatics and Biomedical Engineering. IWBBIO 2017. Lecture Notes in Computer Science, vol 10209. Springer, Cham. https://doi.org/10.1007/978-3-319-56154-7_11
DOI: https://doi.org/10.1007/978-3-319-56154-7_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-56153-0
Online ISBN: 978-3-319-56154-7
eBook Packages: Computer Science, Computer Science (R0)