Using Decision Templates to Predict Subcellular Localization of Protein

  • Jianyu Shi
  • Shaowu Zhang
  • Quan Pan
  • Yanning Zhang
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4774)

Abstract

Theoretical and computational methods for predicting protein subcellular localization continue to be proposed and refined. Many representations of protein sequences have been proposed, which raises a new problem: how to combine them to improve prediction. One available solution is to concatenate multiple representations into a single larger feature vector, but this approach is prone to numerical error when feature values differ greatly in magnitude, and the resulting high-dimensional vector inherently imposes a heavy computational burden. In this paper we present a novel method based on decision templates (DT) to address these problems. First, a protein sequence is represented as three new types of feature vectors. Then, each feature vector is used as the input to its own SVM classifier. Finally, the outputs of these classifiers are aggregated by decision templates. The results demonstrate that DT outperforms other methods of subcellular localization prediction.
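The aggregation step can be illustrated with a minimal sketch of decision-template fusion. It assumes each of the three classifiers emits one support value per class, so that a protein yields a 3 x c "decision profile"; the template of a class is the mean profile of its training proteins, and a new protein is assigned to the class whose template is nearest. The function names (fit_templates, predict), the squared Euclidean distance, the class labels, and all numbers below are illustrative assumptions, not the authors' implementation or data.

```python
from collections import defaultdict


def fit_templates(profiles, labels):
    """Average the decision profiles of each class into one template per class.

    profiles: list of decision profiles, each a list of rows
              (one row per classifier, one column per class).
    labels:   true class label of each training sample.
    """
    sums = {}
    counts = defaultdict(int)
    for dp, y in zip(profiles, labels):
        if y not in sums:
            sums[y] = [[0.0] * len(row) for row in dp]
        for i, row in enumerate(dp):
            for j, value in enumerate(row):
                sums[y][i][j] += value
        counts[y] += 1
    return {y: [[v / counts[y] for v in row] for row in s]
            for y, s in sums.items()}


def predict(templates, dp):
    """Return the class whose template is closest to the profile dp."""
    def sqdist(a, b):
        # Squared Euclidean distance over all entries of the two matrices.
        return sum((x - y) ** 2
                   for ra, rb in zip(a, b)
                   for x, y in zip(ra, rb))
    return min(templates, key=lambda y: sqdist(templates[y], dp))


# Toy data (hypothetical): 3 classifiers, 2 classes, 2 samples per class.
train_profiles = [
    [[0.9, 0.1], [0.8, 0.2], [0.6, 0.4]],
    [[0.7, 0.3], [0.8, 0.2], [0.4, 0.6]],
    [[0.2, 0.8], [0.3, 0.7], [0.5, 0.5]],
    [[0.4, 0.6], [0.1, 0.9], [0.3, 0.7]],
]
train_labels = ["nuclear", "nuclear", "cytoplasmic", "cytoplasmic"]

templates = fit_templates(train_profiles, train_labels)
query = [[0.6, 0.4], [0.7, 0.3], [0.45, 0.55]]
print(predict(templates, query))
```

Because every classifier sees only its own feature vector, the representations never have to share a scale or be concatenated; only the small profile matrices are compared at fusion time. Other similarity measures could replace the squared Euclidean distance used here.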

Keywords

decision templates; subcellular localization prediction; multi-scale energy; moment descriptor; amino acid composition distribution; support vector machines

Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Jianyu Shi (1)
  • Shaowu Zhang (2)
  • Quan Pan (2)
  • Yanning Zhang (1)
  1. School of Computer Science and Engineering
  2. School of Automation, Northwestern Polytechnical University, 710072 Xi'an, China
