An Empirical Analysis of Instance-Based Transfer Learning Approach on Protease Substrate Cleavage Site Prediction

  • Deepak Singh
  • Dilip Singh Sisodia
  • Pradeep Singh
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 748)


Classical machine learning algorithms assume that the labeled training data come from the same domain as the data to be predicted. Transfer learning, in contrast, uses knowledge acquired from auxiliary domains to improve prediction in a current domain whose data distribution differs. Over the last few decades, a significant amount of work has been done on domain adaptation and cross-domain knowledge transfer in bioinformatics. Computational methods for classifying protease cleavage sites are important for inhibitor and drug design. Matrix metalloproteases (MMPs) are one such family of proteases, with a crucial role in disease processes. However, computational prediction of MMP substrate cleavage sites remains challenging because very few experimentally verified sites are available. The objective of this paper is to explore cross-domain learning for the classification of protease substrate cleavage sites, so that the scarcity of cleavage sites in one domain can be compensated by knowledge from other available domains. To this end, we apply the TrAdaBoost algorithm and two of its variants, dynamic TrAdaBoost and multisource TrAdaBoost, to the MMP dataset available at PROSPER. The robustness and suitability of the TrAdaBoost algorithms for substrate site identification are validated through rigorous experiments that compare the performance of the learners. The experimental results demonstrate the potential of the dynamic TrAdaBoost algorithm on the protease dataset, where it outperforms both the basic TrAdaBoost algorithm and the multisource variant.
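To illustrate the instance-reweighting idea behind TrAdaBoost, the following is a minimal sketch of a single weight update as it is commonly described for the basic algorithm. It is not the authors' implementation. The function name `tradaboost_reweight`, its arguments, and the clamping of the error are assumptions made for this example. Source-domain instances that the weak learner misclassifies are down-weighted, while misclassified target-domain instances are up-weighted.

```python
import math


def tradaboost_reweight(weights, mistakes, n_source, n_rounds):
    """One TrAdaBoost-style weight update (illustrative sketch).

    weights  : current instance weights, source instances first, then target
    mistakes : 1 if the current weak learner misclassified the instance, else 0
    n_source : number of source-domain instances at the front of the lists
    n_rounds : total number of boosting rounds
    """
    target_w = weights[n_source:]
    target_m = mistakes[n_source:]

    # The weighted error is measured on the target-domain instances only.
    eps = sum(w * m for w, m in zip(target_w, target_m)) / sum(target_w)
    # Clamp so that beta_t stays positive and below 1 (assumption of this sketch).
    eps = min(max(eps, 1e-10), 0.499)

    beta_t = eps / (1.0 - eps)
    # Fixed shrink factor for the source domain.
    beta_src = 1.0 / (1.0 + math.sqrt(2.0 * math.log(n_source) / n_rounds))

    new_weights = []
    for i, (w, m) in enumerate(zip(weights, mistakes)):
        if i < n_source:
            # Misclassified source instances lose weight.
            new_weights.append(w * beta_src ** m)
        else:
            # Misclassified target instances gain weight.
            new_weights.append(w * beta_t ** (-m))
    return new_weights, eps


# Two source instances followed by four target instances.
new_w, err = tradaboost_reweight(
    weights=[1.0] * 6,
    mistakes=[1, 0, 1, 0, 0, 0],
    n_source=2,
    n_rounds=10,
)
```

In a full training loop the weights would be normalized before fitting each weak learner, and the final prediction would combine the learners from the later rounds. Both steps are omitted here for brevity.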


Keywords: Instance-based learning; Matrix metalloprotease (MMP); Transfer AdaBoost



Copyright information

© Springer Nature Singapore Pte Ltd. 2019

Authors and Affiliations

  • Deepak Singh (1)
  • Dilip Singh Sisodia (1)
  • Pradeep Singh (1)
  1. National Institute of Technology, Raipur, Raipur, India
