Protein Targeting Protocols pp 429-466

Part of the Methods in Molecular Biology™ book series (MIMB, volume 390)

Computational Prediction of Subcellular Localization

  • Kenta Nakai
  • Paul Horton

It is widely recognized that much of the information for determining the final subcellular localization of proteins is found in their amino acid sequences. Thus, the prediction of protein localization sites is of both theoretical and practical interest. In most cases, the prediction has been attempted in two ways: one is based on the knowledge of experimentally characterized targeting signals, while the other utilizes the statistical differences of general sequence characteristics, such as amino acid composition, between localization sites. Both approaches have limitations, and it is advisable to compare the results of several prediction methods that differ in their underlying principles as well as in their training data. Recently, increased proteomic analyses of localization sites have provided new data to assess the current status of predictive methods. In this chapter, we discuss these issues and close with an example illustrating the use of the WoLF PSORT web server for localization prediction.

Key Words

Subcellular localization; signal peptide; sequence analysis

References

  1. 1.
    Petrey, D. and Honig, B. (2005) Protein structure prediction: inroads to biology.Mol.Cell 20, 811–819.PubMedCrossRefGoogle Scholar
  2. 2.
    Chandonia, J. M. and Brenner, S. E. (2006) The impact of structural genomics: expectations and outcomes.Science 311, 347–351.PubMedCrossRefGoogle Scholar
  3. 3.
    Nakao, M., Barrero, R. A., Mukai, Y., Motono, C., Suwa, M., and Nakai, K. (2005) Large-scale analysis of human alternative protein isoforms: pattern classification and correlation with subcellular localization signals.Nucleic Acids Res. 33, 2355–2363.PubMedCrossRefGoogle Scholar
  4. 4.
    Heazlewood, J. L., Tonti-Filippini, J. S., Gout, A. M., Day, D. A., Whelan, J., and Millar, A. H. (2004) Experimental analysis of the Arabidopsis mitochondrial proteome highlights signaling and regulatory components, provides assessment of targeting prediction programs, and indicates plant-specific mitochondrial proteins.Plant Cell 16, 241–256.PubMedCrossRefGoogle Scholar
  5. 5.
    Wu, L. F., Chanal, A., and Rodrigue, A. (2000) Membrane targeting and translocation of bacterial hydrogenases.Arch. Microbiol. 173, 319–324.PubMedCrossRefGoogle Scholar
  6. 6.
    Margeot, A., Blugeon, C., Sylvestre, J., Vialette, S., Jacq, C., and Corral-Debrinski, M. (2002) In Saccharomyces cerevisiae, ATP2 mRNA sorting to the vicinity of mitochondria is essential for respiratory function.EMBO J. 21, 6893–6904.PubMedCrossRefGoogle Scholar
  7. 7.
    Muslin, A. J. and Xing, H. (2000) 14-3-3 proteins: regulation of subcellular localization by molecular interference.Cell Signal 12, 703–709.PubMedCrossRefGoogle Scholar
  8. 8.
    Nakai, K. (2000) Protein sorting signals and prediction of subcellular localization.Adv. Protein Chem. 54, 277–344.PubMedCrossRefGoogle Scholar
  9. 9.
    Nakai, K. (2001) Review: prediction of in vivo fates of proteins in the era of genomics and proteomics.J. Struct. Biol. 134,103–116.PubMedCrossRefGoogle Scholar
  10. 10.
    Emanuelsson, O. and von Heijne, G. (2001) Prediction of organellar targeting signals.Biochim. Biophys. Acta 1541, 114–119.PubMedCrossRefGoogle Scholar
  11. 11.
    Emanuelsson, O. (2002) Predicting protein subcellular localisation from amino acid sequence information.Brief Bioinform. 3, 361–376.PubMedCrossRefGoogle Scholar
  12. 12.
    Donnes, P. and Hoglund, A. (2004) Predicting protein subcellular localization: past, present, and future.Genom. Proteom. Bioinform. 2, 209–215.Google Scholar
  13. Horton, P., Mukai, Y., and Nakai, K. (2004) Protein subcellular localization prediction, inPractical Bioinformatician(Wong, L., ed.), World Scientific Publishing Co., pp. 193–216.Google Scholar
  14. 14.
    Schneider, G. and Fechner, U. (2004) Advances in the prediction of protein targeting signals.Proteomics 4, 1571–1580.PubMedCrossRefGoogle Scholar
  15. 15.
    Nakai, K. (2002) Signal peptides, inCell-Penetrating Peptides: Processes and Applications(Langel, U., ed.), CRC Press, Boca Raton, FL, pp. 295–324.Google Scholar
  16. 16.
    Halic, M. and Beckmann, R. (2005) The signal recognition particle and its interactions during protein targeting.Curr. Opin. Struct. Biol. 15, 116–125.PubMedCrossRefGoogle Scholar
  17. 17.
    von Heijne, G. (1983) Patterns of amino acids near signal-sequence cleavage sites.Eur. J. Biochem. 133, 17–21.CrossRefGoogle Scholar
  18. 18.
    Chou, K. C. (2002) Prediction of protein signal sequences.Curr. Protein Pept. Sci. 3, 615–622.PubMedCrossRefGoogle Scholar
  19. 19.
    von Heijne, G. (1986) A new method for predicting signal sequence cleavage sites.Nucleic Acids Res. 14, 4683–4690.CrossRefGoogle Scholar
  20. 20.
    Nielsen, H., Engelbrecht, J., Brunak, S., and von Heijne, G. (1997) Identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites.Protein Eng. 10, 1–6.PubMedCrossRefGoogle Scholar
  21. 21.
    Bendtsen, J. D., Nielsen, H., von Heijne, G., and Brunak, S. (2004) Improved prediction of signal peptides: SignalP 3.0.J. Mol. Biol. 340, 783–795.PubMedCrossRefGoogle Scholar
  22. 22.
    Hiller, K., Grote, A., Scheer, M., Munch, R., and Jahn, D. (2004) PrediSi: prediction of signal peptides and their cleavage positions.Nucleic Acids Res. 32, W375–379.PubMedCrossRefGoogle Scholar
  23. 23.
    von Heijne, G. (1998) Life and death of a signal peptide.Nature 396, 111, 113.Google Scholar
  24. 24.
    Kall, L., Krogh, A., and Sonnhammer, E. L. (2004) A combined transmembrane topology and signal peptide prediction method.J. Mol. Biol. 338, 1027–1036.PubMedCrossRefGoogle Scholar
  25. 25.
    Yuan, Z., Davis, M. J., Zhang, F., and Teasdale, R.D. (2003) Computational differentiation of N-terminal signal peptides and transmembrane helices.Biochem. Biophys. Res.Commun. 312, 1278–1283.PubMedCrossRefGoogle Scholar
  26. 26.
    Chen, Y., Yu, P., Luo, J., and Jiang, Y. (2003) Secreted protein prediction system combining CJ-SPHMM, TMHMM, and PSORT.Mamm. Genome 14, 859–865.PubMedCrossRefGoogle Scholar
  27. 27.
    Grimmond, S. M., Miranda, K. C., Yuan, Z., Davis, M. J., et al. (2003) The mouse secretome: functional classification of the proteins secreted into the extracellular environment.Genome Res. 13, 1350–1359.PubMedCrossRefGoogle Scholar
  28. 28.
    Bendtsen, J. D., Jensen, L. J., Blom, N., Von Heijne, G., and Brunak, S. (2004) Feature-based prediction of non-classical and leaderless protein secretion.Protein Eng. Des Sel. 17, 49–356.CrossRefGoogle Scholar
  29. 29.
    Bendtsen, J.D., Kiemer, L., Fausboll, A., and Brunak, S. (2005) Non-classical protein secretion in bacteria.BMC Microbiol. 5, 58.PubMedCrossRefGoogle Scholar
  30. 30.
    Martoglio, B. and Dobberstein, B. (1998) Signal sequences: more than just greasy peptides.Trends Cell Biol. 8, 410–415.PubMedCrossRefGoogle Scholar
  31. 31.
    von Heijne, G. (1989) The structure of signal peptides from bacterial lipoproteins.Protein Eng. 2, 531–534.CrossRefGoogle Scholar
  32. 32.
    Juncker, A. S., Willenbrock, H., Von Heijne, G., Brunak, S., Nielsen, H., and Krogh, A. (2003) Prediction of lipoprotein signal peptides in Gram-negative bacteria.Protein Sci. 12, 1652–1662.PubMedCrossRefGoogle Scholar
  33. 33.
    Gonnet, P., Rudd, K. E., and Lisacek, F. (2004) Fine-tuning the prediction of sequences cleaved by signal peptidase II: a curated set of proven and predicted lipoproteins of Escherichia coli K-12.Proteomics 4, 1597–1613.PubMedCrossRefGoogle Scholar
  34. 34.
    Setubal, J. C., Reis, M., Matsunaga, J., and Haake, D. A. (2006) Lipoprotein computational prediction in spirochaetal genomes.Microbiol.ogy 152, 113–121.Google Scholar
  35. 35.
    Berks, B. C., Palmer, T., and Sargent, F. (2005) Protein targeting by the bacterial twin-arginine translocation (Tat) pathway.Curr. Opin. Microbiol. 8, 174–181.PubMedCrossRefGoogle Scholar
  36. 36.
    Bendtsen, J. D., Nielsen, H., Widdick, D., Palmer, T., and Brunak, S. (2005) Prediction of twin-arginine signal peptides.BMC Bioinform. 6, 167.CrossRefGoogle Scholar
  37. 37.
    de Gier, J. W., and Luirink, J. (2001) Biogenesis of inner membrane proteins in Escherichia coli.Mol. Microbiol. 40, 314–322.PubMedCrossRefGoogle Scholar
  38. 38.
    Peabody, C. R., Chung, Y. J., Yen, M. R., Vidal-Ingigliardi, D., Pugsley, A. P., and Saier, M. H., Jr. (2003) Type II protein secretion and its relationship to bacterial type IV pili and archaeal flagella.Microbiology 149, 3051–3072.PubMedCrossRefGoogle Scholar
  39. 39.
    Koehler, C. M. (2004) New developments in mitochondrial assembly.Annu. Rev. Cell Dev. Biol. 20, 309–335.PubMedCrossRefGoogle Scholar
  40. 40.
    Taylor, R. D. and Pfanner, N. (2004) The protein import and assembly machinery of the mitochondrial outer membrane.Biochim. Biophys. Acta 1658, 37–43.PubMedCrossRefGoogle Scholar
  41. 41.
    Rapaport, D. (2003) Finding the right organelle. Targeting signals in mitochondrial outer-membrane proteins.EMBO Rep. 4, 948–952.PubMedCrossRefGoogle Scholar
  42. 42.
    Paschen, S. A., Neupert, W., and Rapaport, D. (2005) Biogenesis of beta-barrel membrane proteins of mitochondria.Trends Biochem. Sci. 30, 575–582.PubMedCrossRefGoogle Scholar
  43. 43.
    Andreoli, C., Prokisch, H.,Hortnagel, K., et al. (2004) MitoP2, an integrated database on mitochondrial proteins in yeast and man.Nucleic Acids Res. 32, D459–462.PubMedCrossRefGoogle Scholar
  44. 44.
    Prokisch, H.,Andreoli, C.,Ahting, U., et al. (2006) MitoP2: the mitochondrial proteome database–now including mouse data.Nucleic Acids Res. 34, D705–711.PubMedCrossRefGoogle Scholar
  45. 45.
    Heazlewood, J. L. and Millar, A. H. (2005) AMPDB: the Arabidopsis Mitochondrial Protein Database.Nucleic Acids Res. 33, D605–610.PubMedCrossRefGoogle Scholar
  46. 46.
    Mueller, J. C.,Andreoli, C.,Prokisch, H., and Meitinger, T. (2004) Mechanisms for multiple intracellular localization of human mitochondrial proteins.Mitochondrion 3, 315–325.PubMedCrossRefGoogle Scholar
  47. 47.
    Nakai, K. and Kanehisa, M. (1992) A knowledge base for predicting protein localization sites in eukaryotic cells.Genomics 14, 897–911.PubMedCrossRefGoogle Scholar
  48. 48.
    Emanuelsson, O.,Nielsen, H.,Brunak, S., and von Heijne, G. (2000) Predicting subcellular localization of proteins based on their N-terminal amino acid sequence.J. Mol. Biol. 300, 1005–1016.PubMedCrossRefGoogle Scholar
  49. 49.
    Small, I.,Peeters, N.,Legeai, F., and Lurin, C. (2004) Predotar: A tool for rapidly screening proteomes for N-terminal targeting sequences.Proteomics 4, 1581–1590.PubMedCrossRefGoogle Scholar
  50. 50.
    Guda, C.,Fahy, E., and Subramaniam, S. (2004) MITOPRED: a genome-scale method for prediction of nucleus-encoded mitochondrial proteins.Bioinformatics 20, 1785–1794.PubMedCrossRefGoogle Scholar
  51. 51.
    Guda, C.,Guda, P.,Fahy, E., and Subramaniam, S. (2004) MITOPRED: a web server for the prediction of mitochondrial proteins.Nucleic Acids Res. 32, W372–374.PubMedCrossRefGoogle Scholar
  52. 52.
    Kumar, M.,Verma, R., and Raghava, G.P. (2006) Prediction of mitochondrial proteins using support vector machine and hidden markov model.J. Biol. Chem. 281, 5357–5363.PubMedCrossRefGoogle Scholar
  53. 53.
    Claros, M. G. and Vincens, P. (1996) Computational method to predict mitochondrially imported proteins and their targeting sequences.Eur. J. Biochem. 241, 779–786.PubMedCrossRefGoogle Scholar
  54. 54.
    Cameron, J. M.,Hurd, T., and Robinson, B. H. (2005) Computational identification of human mitochondrial proteins based on homology to yeast mitochondrially targeted proteins.Bioinformatics 21, 1825–1830.PubMedCrossRefGoogle Scholar
  55. 55.
    Reumann, S.,Inoue, K., and Keegstra, K. (2005) Evolution of the general protein import pathway of plastids (review).Mol. Membr. Biol. 22, 73–86.PubMedCrossRefGoogle Scholar
  56. 56.
    Emanuelsson, O.,Nielsen, H., and von Heijne, G. (1999) ChloroP, a neural network-based method for predicting chloroplast transit peptides and their cleavage sites.Protein Sci. 8, 978–984.PubMedCrossRefGoogle Scholar
  57. 57.
    Schein, A. I .,Kissinger, J. C., and Ungar, L. H. (2001) Chloroplast transit peptide prediction: a peek inside the black box.Nucleic Acids Res. 29, E92.CrossRefGoogle Scholar
  58. 58.
    Bannai, H.,Tamada, Y., Maruyama, O., Nakai, K., and Miyano, S. (2002) Extensive feature detection of N-terminal protein sorting signals.Bioinformatics 18, 298–305.PubMedCrossRefGoogle Scholar
  59. 59.
    Richly, E. and Leister, D. (2004) An improved prediction of chloroplast proteins reveals diversities and commonalities in the chloroplast proteomes of Arabidopsis and rice.Gene 329, 11–16.PubMedCrossRefGoogle Scholar
  60. 60.
    Westerlund, I., Von Heijne, G., and Emanuelsson, O. (2003) LumenP–a neural network predictor for protein localization in the thylakoid lumen.Protein Sci. 12, 2360–2366.PubMedCrossRefGoogle Scholar
  61. 61.
    Christophe, D., Christophe-Hobertus, C., and Pichon, B. (2000) Nuclear targeting of proteins: how many different signals?Cell Signal 12, 337–341.PubMedCrossRefGoogle Scholar
  62. 62.
    Pemberton, L. F. and Paschal, B. M. (2005) Mechanisms of receptor-mediated nuclear import and nuclear export.Traffic 6, 187–198.PubMedCrossRefGoogle Scholar
  63. 63.
    Cokol, M., Nair, R., and Rost, B. (2000) Finding nuclear localization signals.EMBO Rep. 1, 411–415.PubMedCrossRefGoogle Scholar
  64. 64.
    Nair, R., Carter, P., and Rost, B. (2003) NLSdb: database of nuclear localization signals.Nucleic Acids Res. 31, 397–399.PubMedCrossRefGoogle Scholar
  65. 65.
    Heddad, A., Brameler, M., and MacCallum, R. M. (2004) Evolving regular expression-based sequence classifiers for protein nuclear localisation.Lecture Notes Computer Sci. 3005, 31–40.CrossRefGoogle Scholar
  66. 66.
    Kutay, U. and Guttinger, S. (2005) Leucine-rich nuclear-export signals: born to be weak.Trends Cell Biol. 15, 121–124.PubMedCrossRefGoogle Scholar
  67. 67.
    la Cour, T., Gupta, R., Rapacki, K., Skriver, K., Poulsen, F. M., and Brunak, S. (2003) NESbase version 1.0: a database of nuclear export signals. Nucleic Acids Res.31, 393–396.PubMedCrossRefGoogle Scholar
  68. 68.
    la Cour, T., Kiemer, L., Molgaard, A., Gupta, R., Skriver, K., and Brunak, S. (2004) Analysis and prediction of leucine-rich nuclear export signals.Protein Eng. Des. Sel. 17, 527–536.PubMedCrossRefGoogle Scholar
  69. 69.
    Baker, A. and Sparkes, I.A. (2005) Peroxisome protein import: some answers, more questions.Curr. Opin. Plant Biol. 8, 640–647.PubMedCrossRefGoogle Scholar
  70. 70.
    Neuberger, G., Maurer-Stroh, S., Eisenhaber, B., Hartig, A., and Eisenhaber, F. (2003) Motif refinement of the peroxisomal targeting signal 1 and evaluation of taxon-specific differences.J. Mol. Biol. 328, 567–579.PubMedCrossRefGoogle Scholar
  71. 71.
    Neuberger, G., Maurer-Stroh, S., Eisenhaber, B., Hartig, A., and Eisenhaber, F. (2003) Prediction of peroxisomal targeting signal 1 containing proteins from amino acid sequence.J. Mol. Biol. 328, 581–592.PubMedCrossRefGoogle Scholar
  72. 72.
    Emanuelsson, O., Elofsson, A., von Heijne, G., and Cristobal, S. (2003) In silico prediction of the peroxisomal proteome in fungi, plants and animals.J. Mol. Biol. 330, 443–456.PubMedCrossRefGoogle Scholar
  73. 73.
    Kurochkin, I. V., Nagashima, T., Konagaya, A., and Schonbach, C. (2005) Sequence-based discovery of the human and rodent peroxisomal proteome.Appl. Bioinform. 4, 93–104.CrossRefGoogle Scholar
  74. 74.
    Neuberger, G., Kunze, M., Eisenhaber, F., Berger, J., Hartig, A., and Brocard, C. (2004) Hidden localization motifs: naturally occurring peroxisomal targeting signals in non-peroxisomal proteins.Genome Biol. 5, R97.PubMedCrossRefGoogle Scholar
  75. 75.
    Petriv, O. I., Tang, L., Titorenko, V. I., and Rachubinski, R. A. (2004) A new definition for the consensus sequence of the peroxisome targeting signal type 2.J. Mol. Biol. 341, 119–134.PubMedCrossRefGoogle Scholar
  76. 76.
    Reumann, S. (2004) Specification of the peroxisome targeting signals type 1 and type 2 of plant peroxisomes by bioinformatics analyses.Plant Physiol. 135, 783–800.PubMedCrossRefGoogle Scholar
  77. 77.
    Reumann, S., Ma, C., Lemke, S., and Babujee, L. (2004) AraPerox. A database of putative Arabidopsis proteins from plant peroxisomes.Plant Physiol. 136, 2587–2608.PubMedCrossRefGoogle Scholar
  78. 78.
    Ton-That, H., Marraffini, L. A., and Schneewind, O. (2004) Protein sorting to the cell wall envelope of Gram-positive bacteria.Biochim. Biophys. Acta 1694, 269–278.PubMedCrossRefGoogle Scholar
  79. 79.
    Boekhorst, J., de Been, M. W., Kleerebezem, M., and Siezen, R .J. (2005) Genome-wide detection and analysis of cell wall-bound proteins with LPxTG-like sorting motifs.J. Bacteriol. 187, 4928–4934.PubMedCrossRefGoogle Scholar
  80. 80.
    Terashima, H., Fukuchi, S., Nakai, K., et al. (2002) Sequence-based approach for identification of cell wall proteins in Saccharomyces cerevisiae.Curr. Genet. 40, 311–316.PubMedCrossRefGoogle Scholar
  81. 81.
    Rodriguez-Boulan, E., and Musch, A. (2005) Protein sorting in the Golgi complex: shifting paradigms.Biochim. Biophys. Acta 1744, 455–464.PubMedCrossRefGoogle Scholar
  82. 82.
    Yuan, Z. and Teasdale, R.D. (2002) Prediction of Golgi Type II membrane proteins based on their transmembrane domains.Bioinformatics 18, 1109–1115.PubMedCrossRefGoogle Scholar
  83. 83.
    Eisenhaber, B., Eisenhaber, F., Maurer-Stroh, S., and Neuberger, G. (2004) Prediction of sequence signals for lipid post-translational modifications: insights from case studies.Proteomics 4, 1614–1625.PubMedCrossRefGoogle Scholar
  84. 84.
    Maurer-Stroh, S., Eisenhaber, B., and Eisenhaber, F. (2002) N-terminal N-myristoylation of proteins: refinement of the sequence motif and its taxon-specific differences.J. Mol. Biol. 317, 523–540.PubMedCrossRefGoogle Scholar
  85. 85.
    Eisenhaber, B., Maurer-Stroh, S., Novatchkova, M., Schneider, G., and Eisenhaber, F. (2003) Enzymes and auxiliary factors for GPI lipid anchor biosynthesis and post-translational transfer to proteins.Bioessays 25, 367–385.PubMedCrossRefGoogle Scholar
  86. 86.
    Maurer-Stroh, S., Eisenhaber, B., and Eisenhaber, F. (2002) N-Terminal N-myristoylation of proteins: prediction of substrate proteins from amino acid sequence.J. Mol. Biol. 317, 541–557.PubMedCrossRefGoogle Scholar
  87. 87.
    Eisenhaber, F., Eisenhaber, B., Kubina, W., et al. (2003) Prediction of lipid posttranslational modifications and localization signals from protein sequences: big-Pi, NMT and PTS1.Nucleic Acids Res. 31, 3631–3634.PubMedCrossRefGoogle Scholar
  88. 88.
    Eisenhaber, B., Wildpaner, M., Schultz, C. J., Borner, G. H., Dupree, P., and Eisenhaber, F. (2003) Glycosylphosphatidylinositol lipid anchoring of plant proteins. Sensitive prediction from sequence- and genome-wide studies for Arabidopsis and rice.Plant Physiol. 133, 1691–1701.PubMedCrossRefGoogle Scholar
  89. 89.
    Eisenhaber, B., Schneider, G., Wildpaner, M., and Eisenhaber, F. (2004) A sensitive predictor for potential GPI lipid modification sites in fungal protein sequences and its application to genome-wide studies for Aspergillus nidulans, Candida albicans, Neurospora crassa, Saccharomyces cerevisiae and Schizosaccharomyces pombe.J. Mol. Biol. 337, 243–253.PubMedCrossRefGoogle Scholar
  90. 90.
    Nishikawa, K. and Ooi, T. (1982) Correlation of the amino acid composition of a protein to its structural and biological characters.J. Biochem. (Tokyo) 91, 1821–1824.Google Scholar
  91. 91.
    Andrade, M. A., O’Donoghue, S. I., and Rost, B. (1998) Adaptation of protein surfaces to subcellular location.J. Mol. Biol. 276, 517–525.PubMedCrossRefGoogle Scholar
  92. 92.
    Nakashima, H. and Nishikawa, K. (1994) Discrimination of intracellular and extracellular proteins using amino acid composition and residue-pair frequencies.J. Mol. Biol. 238, 54–61.PubMedCrossRefGoogle Scholar
  93. 93.
    Cedano, J., Aloy, P., Perez-Pons, J. A., and Querol, E. (1997) Relation between amino acid composition and cellular location of proteins.J. Mol. Biol. 266, 594–600.PubMedCrossRefGoogle Scholar
  94. 94.
    Reinhardt, A. and Hubbard, T. (1998) Using neural networks for prediction of the subcellular location of proteins.Nucleic Acids Res. 26, 2230–2236.PubMedCrossRefGoogle Scholar
  95. 95.
    Yuan, Z. (1999) Prediction of protein subcellular locations using Markov chain models.FEBS Lett. 451, 23–26.PubMedCrossRefGoogle Scholar
  96. 96.
    Hua, S. and Sun, Z. (2001) Support vector machine approach for protein subcellular localization prediction.Bioinformatics 17, 721–728.PubMedCrossRefGoogle Scholar
  97. 97.
    Feng, Z. P. and Zhang, C. T. (2001) Prediction of the subcellular location of prokaryotic proteins based on the hydrophobicity index of amino acids.Int. J. Biol. MacroMol. 28, 255–261.PubMedCrossRefGoogle Scholar
  98. 98.
    Cai, Y. D., Liu, X. J., Xu, X. B., and Chou, K. C. (2000) Support vector machines for prediction of protein subcellular location.Mol. Cell Biol. Res. Commun. 4, 230–233.PubMedCrossRefGoogle Scholar
  99. Stapley, B. J., Kelley, L. A., and Sternberg, M. J. (2002) Predicting the sub-cellular location of proteins from text using support vector machines.Pac. Symp. Biocomput.374–385.Google Scholar
  100. 100.
    Park, K. J., and Kanehisa, M. (2003) Prediction of protein subcellular locations by support vector machines using compositions of amino acids and amino acid pairs.Bioinformatics 19, 1656–1663.PubMedCrossRefGoogle Scholar
  101. 101.
    Yu, C. S., Lin, C. J., and Hwang, J. K. (2004) Predicting subcellular localization of proteins for Gram-negative bacteria by support vector machines based on n-peptide compositions.Protein Sci. 13, 1402–1406.PubMedCrossRefGoogle Scholar
  102. 102.
    Bhasin, M. and Raghava, G.P. (2004) ESLpred: SVM-based method for subcellular localization of eukaryotic proteins using dipeptide composition and PSI-BLAST.Nucleic Acids Res. 32, W414–419.PubMedCrossRefGoogle Scholar
  103. 103.
    Bhasin, M., Garg, A., and Raghava, G. P. (2005) PSLpred: prediction of subcellular localization of bacterial proteins.Bioinformatics 21, 2522–2524.PubMedCrossRefGoogle Scholar
  104. 104.
    Garg, A., Bhasin, M., and Raghava, G. P. (2005) Support vector machine-based method for subcellular localization of human proteins using amino acid compositions, their order, and similarity search.J. Biol. Chem 280, 14427–14432.PubMedCrossRefGoogle Scholar
  105. 105.
    Gardy, J. L., Spencer, C., Wang, K., et al. (2003) PSORT-B: improving protein subcellular localization prediction for Gram-negative bacteria.Nucleic Acids Res.bi 31, 3613–3617.PubMedCrossRefGoogle Scholar
  106. Horton, P., Park, K. J., Kobayashi, T., and Nakai, K. (2006) Protein subcellular localization prediction with WoLF PSORT, in4th Asia-Pacific Bioinformatics Conference(T. Jiang, et al., eds.), Imperial College Press, London, pp. 39–48,Google Scholar
  107. 107.
    Nair, R. and Rost, B. (2002) Sequence conserved for subcellular localization.Protein Sci. 11, 2836–2847.PubMedCrossRefGoogle Scholar
  108. 108.
    Chou, K. C. and Cai, Y. D. (2002) Using functional domain composition and support vector machines for prediction of protein subcellular location.J. Biol. Chem. 277, 45765–45769.PubMedCrossRefGoogle Scholar
  109. 109.
    Cai, Y. D. and Chou, K. C. (2003) Nearest neighbour algorithm for predicting protein subcellular location by combining functional domain composition and pseudo-amino acid composition.Biochem. Biophys. Res.Commun.305, 407–411.PubMedCrossRefGoogle Scholar
  110. 110.
    Guda, C. and Subramaniam, S. (2005) pTARGET (corrected) a new method for predicting protein subcellular localization in eukaryotes.Bioinformatics 21, 3963–3969.PubMedCrossRefGoogle Scholar
  111. 111.
    Xie, D., Li, A., Wang, M., Fan, Z., and Feng, H. (2005) LOCSVMPSI: a web server for subcellular localization of eukaryotic proteins using SVM and profile of PSI-BLAST.Nucleic Acids Res. 33, W105–110.PubMedCrossRefGoogle Scholar
  112. 112.
    Gorlich, D. (1997) Nuclear protein import.Curr. Opin. Cell Biol. 9, 412–419.Google Scholar
  113. 113.
    Nair, R. and Rost, B. (2003) LOC3D: annotate sub-cellular localization for protein structures.Nucleic Acids Res. 31, 3337–3340.PubMedCrossRefGoogle Scholar
  114. 114.
    Boeckmann, B., Bairoch, A., Apweiler, R., et al. (2003) The SWISS-PROT protein knowledgebase and its supplement TrEMBL in 2003.Nucleic Acids Res. 31, 365–370.PubMedCrossRefGoogle Scholar
  115. 115.
    Eisenhaber, F. and Bork, P. (1999) Evaluation of human-readable annotation in biomolecular sequence databases with biological rule libraries.Bioinformatics 15, 528–535.PubMedCrossRefGoogle Scholar
  116. 116.
    Nair, R. and Rost, B. (2002) Inferring sub-cellular localization through automated lexical analysis.Bioinformatics 18 (Suppl. 1), S78–86.PubMedGoogle Scholar
  117. 117.
    Lu, Z., Szafron, D., Greiner, R., et al. (2004) Predicting subcellular localization of proteins using machine-learned classifiers.Bioinformatics 20, 547–556.PubMedCrossRefGoogle Scholar
  118. 118.
    Murphy, R. F., Boland, M. V., and Velliste, M. (2000) Towards a systematics for protein subcelluar location: quantitative description of protein localization patterns and automated analysis of fluorescence microscope images.Proc. Int. Conf. Intell. Syst. Mol. Biol. 8, 251–259.PubMedGoogle Scholar
  119. 119.
    Boland, M. V. and Murphy, R. F. (2001) A neural network classifier capable of recognizing the patterns of all major subcellular structures in fluorescence microscope images of HeLa cells.Bioinformatics 17, 1213–1223.PubMedCrossRefGoogle Scholar
  120. 120.
    Drawid, A. and Gerstein, M. (2000) A Bayesian system integrating expression data with sequence patterns for localizing proteins: comprehensive application to the yeast genome.J. Mol.Biol. 301, 1059–1075.PubMedCrossRefGoogle Scholar
  121. 121.
    Matsuda, S., Vert, J. P., Saigo, H., Ueda, N., Toh, H., and Akutsu, T. (2005) A novel representation of protein sequences for prediction of subcellular location using support vector machines.Protein Sci. 14, 2804–2813.PubMedCrossRefGoogle Scholar
  122. 122.
    Vapnik, V. (1998)Statistical Learning Theory, Wiley-Interscience, New York.Google Scholar
  123. 123.
    Cristianini, N. and Shawe-Taylor, J. (2000)An Introduction to Support Vector Machines, Cambridge University Press, Cambridge, UK.Google Scholar
  124. 124.
    Scholkopf, B. and Smola, A. J. (2002)Learning with Kernels, MIT Press, Cambridge, MA.Google Scholar
  125. 125.
    Joachims, T. (1999) Making large-scale SVM learning practical, inAdvances in Kernel Methods—Support Vector Learning(Scholkopf, B., Burges, C., and Smola, A., eds.), MIT Press, Cambridge, MA.Google Scholar
  126. 126.
    Chang, C.-C. and Lin, C.-J. (2001) LIBSVM: a library for support vector machines.Google Scholar
  127. 127.
    Duda, R. O., Hart, P. E., and Stork, D. G. (2000)Pattern Classification, 2nd ed., John Wiley & Sons, New York.Google Scholar
  128. 128.
    Horton, P. and Nakai, K. (1997) Better prediction of protein cellular localization sites with the k nearest neighbors classifier.Proc. Int. Conf. Intell. Syst. Mol.Biol. 5, 147–152.PubMedGoogle Scholar
  129. 129.
    Huang, Y. and Li, Y. (2004) Prediction of protein subcellular locations using fuzzy k-NN method.Bioinformatics 20, 21–28.PubMedCrossRefGoogle Scholar
  130. 130.
    Nakai, K. and Kanehisa, M. (1991) Expert system for predicting protein localization sites in gram-negative bacteria.Proteins 11, 95–110.PubMedCrossRefGoogle Scholar
  131. 131.
    Horton, P. and Nakai, K. (1996) A probabilistic classification system for predicting the cellular localization sites of proteins.Proc. Int. Conf. Intell. Syst. Mol. Biol. 4, 109–115.PubMedGoogle Scholar
  132. 132.
    Nair, R. and Rost, B. (2005) Mimicking cellular sorting improves prediction of subcellular localization.J. Mol. Biol. 348, 85–100.PubMedCrossRefGoogle Scholar
  133. 133.
    Goldfarb, D. S., Gariepy, J., Schoolnik, G., and Kornberg, R. D. (1986) Synthetic peptides as nuclear localization signals.Nature 322, 641–644.PubMedCrossRefGoogle Scholar
  134. 134.
    Klug, A. and Schwabe, J. W. (1995) Protein motifs 5. Zinc fingers.FASEB J. 9, 597–604.PubMedGoogle Scholar
  135. 135.
    Mingot, J. M., Espeso, E. A., Diez, E., and Penalva, M. A. (2001) Ambient pH signaling regulates nuclear localization of the Aspergillus nidulans PacC transcription factor.Mol. Cell Biol. 21, 1688–1699.PubMedCrossRefGoogle Scholar
  136. 136.
    LaCasse, E. C. and Lefebvre, Y. A. (1995) Nuclear localization signals overlap DNA- or RNA-binding domains in nucleic acid-binding proteins.Nucleic Acids Res. 23, 1647–1656.PubMedCrossRefGoogle Scholar
  137. 137.
    Lim, A. and Li, B. F. (1996) The nuclear targeting and nuclear retention properties of a human DNA repair protein O6-methylguanine-DNA methyltransferase are both required for its nuclear localization: the possible implications.EMBO J. 15, 4050–4060.PubMedGoogle Scholar
  138. 138.
    Ashburner, M., Ball, C. A., Blake, J. A., et al. (2000) Gene ontology: tool for the unification of biology. The Gene Ontology Consortium.Nat. Genet. 25, 25–29.PubMedCrossRefGoogle Scholar
  139. 139.
    The Treacher Collins Syndrome Collaborative Group. (1996) Positional cloning of a gene involved in the pathogenesis of Treacher Collins syndrome.Nat. Genet. 12, 130–136.CrossRefGoogle Scholar
  140. 140.
    Wise, C. A., Chiang, L. C., Paznekas, W. A., et al. (1997) TCOF1 gene encodes a putative nucleolar phosphoprotein that exhibits mutations in Treacher Collins Syndrome throughout its coding region.Proc. Natl. Acad. Sci. USA 94, 3110–3115.PubMedCrossRefGoogle Scholar
  141. 141.
    Winokur, S. T. and Shiang, R. (1998) The Treacher Collins syndrome (TCOF1) gene product, treacle, is targeted to the nucleolus by signals in its C-terminus.Hum. Mol.Genet. 7, 1947–1952.PubMedCrossRefGoogle Scholar
  142. 142.
    Marsh, K. L., Dixon, J., and Dixon, M. J. (1998) Mutations in the Treacher Collins syndrome gene lead to mislocalization of the nucleolar protein treacle.Hum. Mol.Genet. 7, 1795–1800.PubMedCrossRefGoogle Scholar
  143. 143.
    Isaac, C., Marsh, K. L., Paznekas, W. A., et al. (2000) Characterization of the nucleolar gene product, treacle, in Treacher Collins syndrome.Mol. Biol. Cell 11, 3061–3071.PubMedGoogle Scholar
  144. 144.
    Taagepera, S., McDonald, D., Loeb, J. E., et al. (1998) Nuclear-cytoplasmic shuttling of C-ABL tyrosine kinase.Proc. Natl. Acad. Sci. USA 95, 7457–7462.PubMedCrossRefGoogle Scholar
  145. 145.
    Antelmann, H., Tjalsma, H., Voigt, B., et al. (2001) A proteomic view on genome-based signal peptide predictions.Genome Res. 11, 1484–1502.PubMedCrossRefGoogle Scholar
  146. 146.
    Tjalsma, H., Antelmann, H., Jongbloed, J. D., et al. (2004) Proteomics of protein secretion by Bacillus subtilis: separating the "secrets" of the secretome.Microbiol. Mol. Biol. Rev. 68, 207–233.PubMedCrossRefGoogle Scholar
  147. 147.
    Tjalsma, H. and van Dijl, J.M. (2005) Proteomics-based consensus prediction of protein retention in a bacterial membrane.Proteomics 5,4472–4482.PubMedCrossRefGoogle Scholar
  148. Nakai, K. (1996) Refinement of the prediction methods of signal peptides for the genome analyses of Saccharomyces cerevisiae and Bacillus subtilis, inGenome Informatics Workshop(Akutsu, T., et al., eds.), Universal Academy Press, Tokyo, pp. 72–81.Google Scholar
  149. 149.
    Lewenza, S., Gardy, J. L., Brinkman, F. S., and Hancock, R. E. (2005) Genome-wide identification of Pseudomonas aeruginosa exported proteins using a consensus computational strategy combined with a laboratory-based PhoA fusion screen.Genome Res. 15, 321–329.PubMedCrossRefGoogle Scholar
  150. Rey, S., Gardy, J. L., and Brinkman, F. S. (2005) Assessing the precision of high-throughput computational and laboratory approaches for the genome-wide identification of protein subcellular localization in bacteria. BMC Genomics 6, 162.
  151. Warnock, D. E., Fahy, E., and Taylor, S. W. (2004) Identification of protein associations in organelles, using mass spectrometry-based proteomics. Mass Spectrom. Rev. 23, 259–280.
  152. Kumar, A., Agarwal, S., Heyman, J. A., et al. (2002) Subcellular localization of the yeast proteome. Genes Dev. 16, 707–719.
  153. Huh, W. K., Falvo, J. V., Gerke, L. C., et al. (2003) Global analysis of protein localization in budding yeast. Nature 425, 686–691.
  154. Clark, H. F., Gurney, A. L., Abaya, E., et al. (2003) The secreted protein discovery initiative (SPDI), a large-scale effort to identify novel human secreted and transmembrane proteins: a bioinformatics assessment. Genome Res. 13, 2265–2270.
  155. Millar, A. H., Heazlewood, J. L., Kristensen, B. K., Braun, H. P., and Moller, I. M. (2005) The plant mitochondrial proteome. Trends Plant Sci. 10, 36–43.
  156. Heazlewood, J. L., Tonti-Filippini, J., Verboom, R. E., and Millar, A. H. (2005) Combining experimental and predicted datasets for determination of the subcellular location of proteins in Arabidopsis. Plant Physiol. 139, 598–609.
  157. Schmitt, S., Prokisch, H., Schlunck, T., et al. (2006) Proteome analysis of mitochondrial outer membrane from Neurospora crassa. Proteomics 6, 72–80.
  158. Peltier, J. B., Emanuelsson, O., Kalume, D. E., et al. (2002) Central functions of the lumenal and peripheral thylakoid proteome of Arabidopsis determined by experimentation and genome-wide prediction. Plant Cell 14, 211–236.
  159. Friso, G., Giacomelli, L., Ytterberg, A. J., et al. (2004) In-depth analysis of the thylakoid membrane proteome of Arabidopsis thaliana chloroplasts: new proteins, new functions, and a plastid proteome database. Plant Cell 16, 478–499.
  160. Sun, Q., Emanuelsson, O., and van Wijk, K. J. (2004) Analysis of curated and predicted plastid subproteomes of Arabidopsis. Subcellular compartmentalization leads to distinctive proteome properties. Plant Physiol. 135, 723–734.
  161. Bayer, E. M., Bottrill, A. R., Walshaw, J., et al. (2006) Arabidopsis cell wall proteome defined using multidimensional protein identification technology. Proteomics 6, 301–311.
  162. Schwacke, R., Flugge, U. I., and Kunze, R. (2004) Plant membrane proteome databases. Plant Physiol. Biochem. 42, 1023–1034.
  163. Hwang, S. I., Lundgren, D. H., Mayya, V., et al. (2006) Systematic characterization of nuclear proteome from human T leukemia cells: a quantitative proteomic study during apoptosis by differential extraction and stable isotope labeling. Mol. Cell Proteomics 5, 1131–1145.
  164. Nair, R. and Rost, B. (2004) LOCnet and LOCtarget: sub-cellular localization for structural genomics targets. Nucleic Acids Res. 32, W517–521.
  165. Nakai, K. and Horton, P. (1999) PSORT: a program for detecting sorting signals in proteins and predicting their subcellular localization. Trends Biochem. Sci. 24, 34–36.
  166. Gardy, J. L., Laird, M. R., Chen, F., et al. (2005) PSORTb v.2.0: expanded prediction of bacterial protein subcellular localization and insights gained from comparative proteome analysis. Bioinformatics 21, 617–623.

Copyright information

© Humana Press Inc. 2007

Authors and Affiliations

  • Kenta Nakai
    • 1
  • Paul Horton
    • 2
  1. Laboratory of Functional Analysis in silico, Human Genome Center, The Institute of Medical Science, The University of Tokyo, Tokyo, Japan
  2. AIST Computational Biology Research Center, Tokyo, Japan
