Abstract
As a growing number of genomes have been sequenced in recent years, there is an urgent need for reliable methods that predict protein subcellular localization to support further functional studies. However, many well-known prediction methods are based on amino acid composition and cannot use sequence-order information. Here we propose a novel method, named moment descriptor (MD), which captures sequence-order information in a protein sequence without requiring the physicochemical properties of amino acids. The method first constructs three types of moment descriptors and then applies multi-class SVMs to Chou's dataset. Resubstitution, jackknife, and independent tests show that MD outperforms other methods based on various extensions of amino acid composition. Moreover, the three multi-class SVMs perform similarly, differing mainly in training time.
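To illustrate why amino acid composition alone loses sequence-order information, the following minimal sketch compares it with a simple position-based moment feature. The abstract does not define the three moment descriptors, so `position_moment` is a hypothetical stand-in chosen only for illustration, not the authors' formulation. The toy sequences are likewise assumptions.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(seq):
    """Fraction of each amino acid in seq; residue order is ignored."""
    counts = Counter(seq)
    return [counts[a] / len(seq) for a in AMINO_ACIDS]


def position_moment(seq, order=1):
    """Hypothetical order-sensitive feature (an assumption, not the paper's MD):
    for each amino acid, the mean of its normalized positions raised to
    `order`, or 0.0 if the residue does not occur."""
    n = len(seq)
    sums = {a: 0.0 for a in AMINO_ACIDS}
    counts = Counter()
    for i, a in enumerate(seq, start=1):
        if a in sums:
            sums[a] += (i / n) ** order
            counts[a] += 1
    return [sums[a] / counts[a] if counts[a] else 0.0 for a in AMINO_ACIDS]


# Two toy sequences with identical composition but reversed residue order.
seq_a = "AAKKLL"
seq_b = "LLKKAA"

same_composition = composition(seq_a) == composition(seq_b)
same_moment = position_moment(seq_a) == position_moment(seq_b)
```

Here the two sequences share one composition vector, while the position-based feature separates them. Neither feature uses any physicochemical property of the residues, which is the property the abstract claims for MD.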
Keywords
- Support Vector Machine
- Amino Acid Composition
- Directed Acyclic Graph
- Feature Extraction Method
- Protein Subcellular Localization
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Shi, J., Zhang, S., Liang, Y., Pan, Q. (2006). Prediction of Protein Subcellular Localizations Using Moment Descriptors and Support Vector Machine. In: Rajapakse, J.C., Wong, L., Acharya, R. (eds) Pattern Recognition in Bioinformatics. PRIB 2006. Lecture Notes in Computer Science, vol 4146. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11818564_12
DOI: https://doi.org/10.1007/11818564_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-37446-6
Online ISBN: 978-3-540-37447-3
eBook Packages: Computer Science, Computer Science (R0)