Prediction of Calmodulin-Binding Proteins Using Short-Linear Motifs

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10209)

Abstract

Prediction of calmodulin-binding (CaM-binding) proteins plays an important role in biology and biochemistry, because calmodulin binds and regulates a multitude of protein targets affecting different cellular processes. Short linear motifs (SLiMs) have been used effectively as features for analyzing protein-protein interactions, but their properties have not been used in the prediction of CaM-binding proteins. In this study, we propose a new method for predicting CaM-binding proteins that uses, as features, both the total and average scores of SLiMs in protein sequences, computed with a new scoring method that we call Sliding Window Scoring (SWS). A dataset of 194 manually curated human CaM-binding proteins and 193 mitochondrial proteins was obtained and used to test the proposed model. Multiple EM for Motif Elicitation (MEME) was used to obtain new motifs from each of the positive and negative datasets individually (the SM approach) and from the combined negative and positive datasets (the CM approach). Feature selection (FS) with a wrapper criterion based on Random Forest was then applied, followed by classification using k-nearest neighbor (k-NN), support vector machine (SVM), and Random Forest (RF) classifiers in a 3-fold cross-validation setup. The proposed method shows promising prediction results and demonstrates that the information contained in SLiMs is highly relevant for the prediction of CaM-binding proteins.
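The abstract does not define SWS in detail, so the following is only a minimal sketch of what a sliding-window total/average motif score could look like, not the authors' implementation. It assumes a motif is represented as a position-specific scoring matrix (one residue-to-score mapping per motif position), that each window's score is the sum of its per-position scores, and that residues missing from a column score 0. The names `window_scores`, `motif_features`, and the toy matrix are hypothetical.

```python
from typing import Dict, List, Tuple

# Assumed motif representation: one {residue: score} dict per motif position.
PSSM = List[Dict[str, float]]


def window_scores(sequence: str, pssm: PSSM, default: float = 0.0) -> List[float]:
    """Score every window of len(pssm) residues along the sequence."""
    width = len(pssm)
    scores = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        # Sum the per-position scores; unknown residues get `default`.
        scores.append(sum(col.get(res, default) for col, res in zip(pssm, window)))
    return scores


def motif_features(sequence: str, pssm: PSSM) -> Tuple[float, float]:
    """Return (total score, average score per window) for one motif."""
    scores = window_scores(sequence, pssm)
    if not scores:  # sequence shorter than the motif
        return 0.0, 0.0
    total = sum(scores)
    return total, total / len(scores)


if __name__ == "__main__":
    toy_motif = [{"A": 1.0}, {"K": 2.0}]  # hypothetical two-position motif
    print(motif_features("AKAK", toy_motif))  # (6.0, 2.0)
```

Under these assumptions, each motif contributes two features per protein (total and average), and the resulting feature vectors could then be passed to feature selection and a classifier as the abstract describes.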

Keywords

Calmodulin-binding proteins; Short-linear motifs; Sliding window scoring; Classification; Protein interaction


Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. School of Computer Science, University of Windsor, Windsor, Canada
  2. Institute of Environmental Health Sciences, Wayne State University, Detroit, USA
