Identification of DNA-Binding Proteins via Fuzzy Multiple Kernel Model and Sequence Information

  • Conference paper
Intelligent Computing Theories and Application (ICIC 2019)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11644)

Abstract

DNA-binding proteins are the molecular basis for understanding the basic processes of life, and many diseases are associated with them. DNA-binding proteins are mainly detected through biochemical experiments, which are time-consuming and extremely expensive. Many computational methods based on machine learning (ML) algorithms have therefore been developed to detect them. In this study, we propose a novel model for identifying DNA-binding proteins via a Fuzzy Multiple Kernel Support Vector Machine. Multiple sequence and evolutionary features are extracted, and each feature set is used to construct its own kernel. These kernels are then integrated by a Multiple Kernel Learning (MKL) algorithm. Finally, a Fuzzy Support Vector Machine (FSVM) is employed to build an effective DNA-binding protein predictor. The accuracy of our model is 82.98% on PDB1075 (a benchmark data set of DNA-binding proteins) and 81.70% on PDB186 (an independent test set), which is comparable to previous outstanding methods.
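
The pipeline summarized in the abstract (one kernel per feature set, kernel combination, then a fuzzy SVM) can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not specify the kernel type, the MKL weighting rule, or the membership function, so the choices below are assumptions: RBF kernels, weights proportional to uncentered kernel-target alignment, memberships that decay with kernel-space distance to the class centre, and scikit-learn's `sample_weight` (which rescales the penalty `C` per sample) standing in for the fuzzy memberships. The data are synthetic, and the names `X_seq`, `X_evo`, `combine_kernels`, and `fuzzy_memberships` are hypothetical.

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.svm import SVC


def kernel_target_alignment(K, y):
    """Uncentered alignment between kernel K and the ideal kernel y y^T."""
    Y = np.outer(y, y)
    return np.sum(K * Y) / (np.linalg.norm(K) * np.linalg.norm(Y))


def combine_kernels(kernels, y):
    """Assumed MKL heuristic: weight each kernel by its alignment score."""
    scores = np.array([max(kernel_target_alignment(K, y), 0.0) for K in kernels])
    if scores.sum() == 0.0:
        scores = np.ones(len(kernels))  # fall back to uniform weights
    weights = scores / scores.sum()
    K = sum(w * Km for w, Km in zip(weights, kernels))
    return K, weights


def fuzzy_memberships(K, y, floor=0.1):
    """Assumed membership: 1 at the class centre, `floor` at the farthest point."""
    m = np.empty(len(y))
    for label in np.unique(y):
        idx = np.where(y == label)[0]
        Kc = K[np.ix_(idx, idx)]
        # Squared feature-space distance of each sample to its class mean.
        d2 = np.diag(Kc) - 2.0 * Kc.mean(axis=1) + Kc.mean()
        d = np.sqrt(np.maximum(d2, 0.0))
        m[idx] = 1.0 - (1.0 - floor) * d / (d.max() + 1e-12)
    return m


# Synthetic stand-ins for sequence-based and evolutionary feature sets.
rng = np.random.default_rng(0)
y = np.repeat([1, -1], 40)
X_seq = rng.normal(size=(80, 5)) + 0.8 * y[:, None]
X_evo = rng.normal(size=(80, 8)) + 0.5 * y[:, None]

kernels = [rbf_kernel(X_seq), rbf_kernel(X_evo)]
K, weights = combine_kernels(kernels, y)
m = fuzzy_memberships(K, y)

# sample_weight scales C per sample, so low-membership points are penalized less.
clf = SVC(kernel="precomputed", C=1.0).fit(K, y, sample_weight=m)
train_acc = clf.score(K, y)
```

To predict on new proteins under this sketch, one would compute each kernel between the test and training samples, combine them with the same `weights`, and pass the resulting matrix to `clf.predict`.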

Acknowledgments

This work is supported by a grant from the National Science Foundation of China (NSFC 61772362).

Author information

Correspondence to Fei Guo.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Ding, Y., Tang, J., Guo, F. (2019). Identification of DNA-Binding Proteins via Fuzzy Multiple Kernel Model and Sequence Information. In: Huang, DS., Jo, KH., Huang, ZK. (eds) Intelligent Computing Theories and Application. ICIC 2019. Lecture Notes in Computer Science, vol 11644. Springer, Cham. https://doi.org/10.1007/978-3-030-26969-2_45

  • DOI: https://doi.org/10.1007/978-3-030-26969-2_45

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-26968-5

  • Online ISBN: 978-3-030-26969-2

  • eBook Packages: Computer Science, Computer Science (R0)