Abstract
Predicting the sub-cellular localization of proteins is an important task in bioinformatics: it provides clues to protein function and supports targeted drug discovery. Traditional experimental techniques for determining protein sub-cellular locations are mostly costly and time-consuming. Over the last two decades, many machine learning algorithms and sub-cellular location predictors have been developed to address this problem. However, most of them can only handle proteins with a single location. As techniques have advanced, more and more proteins with two or more sub-cellular locations have been found, and studying them is especially significant because of their implications for both basic biological research and drug discovery. Improving prediction accuracy requires extracting more feature information. In this paper, we fuse several feature extraction methods to extract feature information simultaneously and use the multi-label k-nearest neighbors (ML-KNN) algorithm to predict protein sub-cellular locations. The best overall accuracy we obtained is 66.1568% on dataset s1, which was used in constructing Gpos-mPLoc, and 59.9206% on dataset s2, which was used in constructing Virus-mPLoc.
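The pipeline summarized above (fuse several sequence descriptors, then classify with ML-KNN) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not name the descriptors or the fusion scheme, so amino acid composition, dipeptide composition, and fusion by concatenation are assumptions chosen for the example, as are the helper names, the toy data, and the settings k = 3 and smoothing s = 1. The classifier follows the usual ML-KNN recipe of smoothed label priors plus neighbour-count likelihoods combined by a maximum a posteriori rule.

```python
from math import dist  # Euclidean distance, Python 3.8+

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def amino_acid_composition(seq):
    """Illustrative descriptor 1: fraction of each of the 20 standard residues."""
    n = max(len(seq), 1)
    return [seq.count(a) / n for a in AMINO_ACIDS]

def dipeptide_composition(seq):
    """Illustrative descriptor 2: fraction of each of the 400 adjacent residue pairs."""
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    n = max(len(pairs), 1)
    return [pairs.count(a + b) / n for a in AMINO_ACIDS for b in AMINO_ACIDS]

def fuse(*vectors):
    """Fusion by plain concatenation (an assumption; other schemes are possible)."""
    return [v for vec in vectors for v in vec]

def nearest(X, query, k, skip=None):
    """Indices of the k training vectors closest to query, optionally skipping one."""
    order = sorted((i for i in range(len(X)) if i != skip),
                   key=lambda i: dist(X[i], query))
    return order[:k]

class MLKNN:
    def __init__(self, k=3, s=1.0):
        self.k, self.s = k, s  # neighbourhood size and smoothing constant

    def fit(self, X, Y):
        """X: feature vectors; Y: 0/1 label vectors, one entry per location."""
        self.X, self.Y = X, Y
        n, q, k, s = len(X), len(Y[0]), self.k, self.s
        # Smoothed prior probability that each label is present.
        self.prior = [(s + sum(y[l] for y in Y)) / (2 * s + n) for l in range(q)]
        # has[l][j]: training items WITH label l whose k neighbours include
        # exactly j items carrying label l; lacks[l][j]: same for items WITHOUT it.
        has = [[0] * (k + 1) for _ in range(q)]
        lacks = [[0] * (k + 1) for _ in range(q)]
        for i in range(n):
            nbrs = nearest(X, X[i], k, skip=i)
            for l in range(q):
                j = sum(Y[m][l] for m in nbrs)
                (has if Y[i][l] else lacks)[l][j] += 1
        smooth = lambda row: [(s + c) / (s * (k + 1) + sum(row)) for c in row]
        self.like_has = [smooth(row) for row in has]
        self.like_lacks = [smooth(row) for row in lacks]
        return self

    def predict(self, x):
        """Assign label l when its posterior beats the posterior of its absence."""
        nbrs = nearest(self.X, x, self.k)
        out = []
        for l, p in enumerate(self.prior):
            j = sum(self.Y[m][l] for m in nbrs)
            out.append(int(p * self.like_has[l][j] > (1 - p) * self.like_lacks[l][j]))
        return out

# Fused descriptor for a made-up sequence: 20 + 400 = 420 dimensions.
fused = fuse(amino_acid_composition("ACDA"), dipeptide_composition("ACDA"))

# Toy 2-D data standing in for fused vectors: three clusters whose members
# carry location A only, location B only, or both locations.
X = [(0, 0), (0, 1), (1, 0), (1, 1),
     (10, 10), (10, 11), (11, 10), (11, 11),
     (0, 10), (0, 11), (1, 10), (1, 11)]
Y = [[1, 0]] * 4 + [[0, 1]] * 4 + [[1, 1]] * 4
model = MLKNN(k=3).fit(X, Y)
predictions = [model.predict(p) for p in [(0.5, 0.5), (10.5, 10.5), (0.5, 10.5)]]
print(predictions)  # [[1, 0], [0, 1], [1, 1]]
```

The third query shows the point of a multi-label method: a protein whose neighbours carry both labels is assigned both locations, rather than being forced into a single class. In practice the toy points would be replaced by fused vectors such as `fused`, and k and s would be tuned on the benchmark data.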
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Qu, X., Chen, Y., Qiao, S., Wang, D., Zhao, Q. (2014). Predicting the Subcellular Localization of Proteins with Multiple Sites Based on Multiple Features Fusion. In: Huang, DS., Han, K., Gromiha, M. (eds) Intelligent Computing in Bioinformatics. ICIC 2014. Lecture Notes in Computer Science, vol 8590. Springer, Cham. https://doi.org/10.1007/978-3-319-09330-7_53
DOI: https://doi.org/10.1007/978-3-319-09330-7_53
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-09329-1
Online ISBN: 978-3-319-09330-7
eBook Packages: Computer Science, Computer Science (R0)