Predicting the Subcellular Localization of Proteins with Multiple Sites Based on Multiple Features Fusion

  • Xumi Qu
  • Yuehui Chen
  • Shanping Qiao
  • Dong Wang
  • Qing Zhao
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8590)

Abstract

Predicting the subcellular localization of proteins is an important task in bioinformatics: it provides clues for studying protein function and for targeted drug discovery. Traditional experimental techniques for determining subcellular locations are mostly costly and time-consuming. Over the last two decades, many machine learning algorithms and subcellular location predictors have been developed to address this problem. However, most of them can only handle proteins with a single location. As experimental techniques advance, more and more proteins with two or more subcellular locations are being found, and studying them matters all the more because they have useful implications for both basic biological research and drug discovery. Improving prediction accuracy requires extracting more feature information. In this paper, we fuse several feature extraction methods to extract feature information simultaneously and use the multi-label k-nearest neighbors (ML-KNN) algorithm to predict protein subcellular locations. The best overall accuracy we obtained is 66.1568% on dataset s1, which was used in constructing Gpos-mPLoc, and 59.9206% on dataset s2, which was used in constructing Virus-mPLoc.
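The general shape of such a pipeline can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the paper: the abstract does not say how the features are fused, so plain concatenation is assumed here, and whole-sequence plus N-terminal amino acid composition stand in for the paper's actual feature sets. The names `amino_acid_composition`, `fused_features`, `neighbor_label_counts` and `MLKNN`, the Euclidean distance, and the defaults `k=3` and `smooth=1.0` are all hypothetical choices. The classifier follows the published ML-KNN rule: estimate label priors and neighbor-count likelihoods on the training set, then assign each label by a maximum a posteriori decision.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(seq):
    """Return the 20-dimensional residue frequency vector of a sequence."""
    counts = np.array([seq.count(a) for a in AMINO_ACIDS], dtype=float)
    return counts / max(len(seq), 1)


def fused_features(seq, n_terminal=20):
    """Assumed fusion rule: concatenate whole-sequence and N-terminal composition."""
    return np.concatenate([
        amino_acid_composition(seq),
        amino_acid_composition(seq[:n_terminal]),
    ])


def neighbor_label_counts(X_query, X_train, Y_train, k, exclude_self=False):
    """For each query row, count how many of its k nearest neighbors carry each label."""
    dist = np.linalg.norm(X_query[:, None, :] - X_train[None, :, :], axis=2)
    if exclude_self:  # only meaningful when X_query is X_train
        np.fill_diagonal(dist, np.inf)
    nearest = np.argsort(dist, axis=1)[:, :k]
    return Y_train[nearest].sum(axis=1)  # shape (n_query, n_labels)


class MLKNN:
    """Sketch of multi-label k-nearest neighbors with Laplace smoothing."""

    def __init__(self, k=3, smooth=1.0):
        self.k = k
        self.smooth = smooth

    def fit(self, X, Y):
        self.X = np.asarray(X, dtype=float)
        self.Y = np.asarray(Y, dtype=int)  # binary matrix, one column per location
        m, q = self.Y.shape
        s, k = self.smooth, self.k
        # Prior probability that a protein carries each label.
        self.prior1 = (s + self.Y.sum(axis=0)) / (2 * s + m)
        # Neighbor label counts for every training protein (itself excluded).
        counts = neighbor_label_counts(self.X, self.X, self.Y, k, exclude_self=True)
        c1 = np.zeros((q, k + 1))  # proteins WITH the label, by neighbor count
        c0 = np.zeros((q, k + 1))  # proteins WITHOUT the label, by neighbor count
        for label in range(q):
            for j in range(k + 1):
                hit = counts[:, label] == j
                c1[label, j] = np.sum(hit & (self.Y[:, label] == 1))
                c0[label, j] = np.sum(hit & (self.Y[:, label] == 0))
        # Likelihood of seeing j labelled neighbors given label present / absent.
        self.like1 = (s + c1) / (s * (k + 1) + c1.sum(axis=1, keepdims=True))
        self.like0 = (s + c0) / (s * (k + 1) + c0.sum(axis=1, keepdims=True))
        return self

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        counts = neighbor_label_counts(X, self.X, self.Y, self.k)
        labels = np.arange(self.Y.shape[1])
        # Maximum a posteriori decision, made independently for each label.
        p1 = self.prior1 * self.like1[labels, counts]
        p0 = (1.0 - self.prior1) * self.like0[labels, counts]
        return (p1 > p0).astype(int)
```

In use, each sequence would be mapped to a row with `fused_features`, the rows stacked into a matrix `X`, and the known locations encoded as a binary matrix `Y` before calling `MLKNN(k=3).fit(X, Y).predict(X_new)`. Because every label is decided separately, a protein can be assigned zero, one, or several locations.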

Keywords

  • N-terminal signals
  • Pseudo amino acid composition
  • Physicochemical properties
  • Amino acid index distribution
  • Multi-label k nearest neighbor

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Xumi Qu (1, 2)
  • Yuehui Chen (1)
  • Shanping Qiao (1, 2)
  • Dong Wang (1, 2)
  • Qing Zhao (1, 2)

  1. The School of Information Science and Engineering, University of Jinan, Jinan, China
  2. Shandong Provincial Key Laboratory of Network Based Intelligent Computing, Jinan, China
