Introduction

Knowing the location of a protein within its cellular environment is critical for understanding the regulatory mechanisms by which it is controlled. The accurate function of proteins and their interaction networks relies greatly on the proper localization of each protein component. A conventional method to identify protein–protein interactions at the single-cell level is to trace the mutual localization of proteins under physiological conditions (Relic et al. 1998; Surapureddi et al. 2000). Another common strategy in the study of regulation and interaction networks is to determine whether the localization of proteins is altered by the intentional disruption of the networks (Zuckerbraun et al. 2003). The aberrant translocation of proteins often correlates with pathological changes in cell physiology and accounts for the clinical manifestations of several genetic diseases such as primary hyperoxaluria (Danpure et al. 1993). A growing list of diseases caused by the improper localization of proteins makes protein translocation a promising target for the development of therapeutic agents (Besemer et al. 2005; Garrison et al. 2005).

Computational biologists have made extensive efforts to develop programs to predict the subcellular localization of proteins. Numerous software suites have been released in this field, based on various biological concepts and computational methods. Presently, four leading methods are commonly used. The first uses the overall protein amino acid composition. For example, SubLoc predicts protein localization based on the fact that proteins with different subcellular localizations usually have different amino acid compositions (Hua and Sun 2001). The second type of method utilizes known targeting sequences. One of the most important principles of the protein sorting mechanism is the existence of a targeting signal in the amino acid sequence that leads proteins to different organelles or out of the cell. Hence, several computational approaches focus on predicting the presence of certain targeting motifs in protein sequences, e.g. signal peptides (SPs), the mitochondrial targeting peptide (mTP), nuclear localization signals (NLS) and transmembrane alpha helices (Bannai et al. 2002; Claros and Vincens 1996; Emanuelsson 2002). A third approach uses sequence homology and/or motifs. For example, the Proteome Analyst Subcellular Localization Server (PA-SUB) utilizes keywords from the protein database SWISS-PROT and the annotation of homologous proteins (Lu et al. 2004). Finally, a combination of the information obtained from the three categories described above has been used in prediction tools such as WoLF-PSORT (updated version of PSORT II) and the most recent, SherLoc2 (Horton et al. 2007; Briesemeister et al. 2009).

Due to their automated and high-throughput nature, computational methods are appealing for the large-scale assignment of protein subcellular locations. Regardless of the algorithm used, however, computational predictions have always been based on available biological knowledge, which is far from complete. The enormous complexity of the protein sorting process, the existence of alternative transportation pathways and the lack of complete data for every organelle still limit the application of computational methods. For instance, very few current predictors can deal with multi-site localization of a protein, with the exception of WoLF-PSORT and Hum-mPLoc (Shen and Chou 2009).

Due to the uncertain effectiveness of the available methods, particularly on a random protein dataset, we performed a comparative analysis between experimentally obtained subcellular localization data for 52 human Chr.21 proteins (Hu et al. 2006) and in silico prediction results, with the aim of evaluating the reliability of the bioinformatics approaches. Nine leading computational programs were included in the analysis, mainly due to their variable prediction strategies and the user-friendly web services that they provide.

Materials and methods

The materials and methods for the experimental characterization of protein subcellular localizations were reported previously (Hu et al. 2006). The computational predictions were performed on the internet website interfaces provided by each prediction program. A positive prediction was counted if the program gave the same site as at least one of the experimentally determined localizations for a given protein. The web addresses of the prediction programs used in this study are as follows: SherLoc2: http://www-bs.informatik.uni-tuebingen.de/Services/SherLoc2; WoLF-PSORT: http://wolfpsort.org/; pTARGET: http://bioapps.rit.albany.edu/pTARGET/; ProtComp8: http://linux1.softberry.com/berry.phtml?topic=protcompan&group=programs&subgroup=proloc; PA-SUB v2.5: http://pasub.cs.ualberta.ca:8080/pa/Subcellular; MultiLoc2: http://www-bs.informatik.uni-tuebingen.de/Services/MultiLoc2/; ESLPred2: http://www.imtech.res.in/raghava/eslpred2/; BaCelLo: http://gpcr.biocomp.unibo.it/bacello/; SubLoc: http://www.bioinfo.tsinghua.edu.cn/SubLoc/.

Results

We divided the nine programs into two groups according to their prediction resolutions: low-resolution four-site prediction (nucleus, cytoplasm, mitochondrion and secretory pathway) and high-resolution organelle prediction that can further assign a secretory pathway protein to specific subcellular organelles such as the ER, Golgi apparatus, peroxisome and lysosome, as well as the plasma membrane and extracellular secretion. The prediction principles and capabilities of the nine programs are summarized in Table 1.

Table 1 Comparison of the protein localization prediction software programs used in the study

The prediction results for the 52 Chr.21 proteins are summarized in Tables 2 and 3; they were compared to the experimentally determined localization patterns described previously (Hu et al. 2006). If one of the actual localization sites of a protein was predicted by a program, we counted a full positive prediction. This means, for example, that a prediction of “extracellular/secretory” by a low-resolution predictor was counted as correct for plasma membrane, ER, Golgi and lysosomal proteins (in total, 15 proteins in this study). This loose criterion for the secretory pathway was not applied to the high-resolution predictors, which can classify proteins into specific organelle locations. For all of the predictors, however, a prediction of either “cytoplasm” or “nucleus” was counted as a full positive hit for the 12 Chr.21 proteins with “cyto-nuc” (cytoplasm and nucleus) dual localization. These rules substantially raised the overall success rates for all nine of the predictors, but they should have no impact on comparisons of the relative performances of predictors with the same resolution, as none of the nine predictors showed a dual-localization prediction for any of the 52 proteins tested in this study.
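The scoring rules above can be expressed compactly in code. The following Python sketch is only an illustration of those rules, not the procedure actually used to produce the tables; the site labels, the function names `is_positive` and `accuracy`, and the `"secretory pathway"` label are assumptions introduced here.

```python
# Illustrative site labels (assumed); the tables may use other abbreviations.
# Sites treated as part of the secretory pathway under the loose criterion.
SECRETORY_SITES = {"plasma membrane", "ER", "Golgi", "lysosome"}


def is_positive(predicted, experimental_sites, low_resolution):
    """Return True if a single-site prediction counts as a full positive hit."""
    sites = set(experimental_sites)
    # Exact match with any experimentally observed site. This also covers a
    # "cytoplasm" or "nucleus" call for a cyto-nuc dual-localized protein.
    if predicted in sites:
        return True
    # Loose criterion, low-resolution predictors only: a secretory-pathway
    # call matches any observed site within the pathway.
    if low_resolution and predicted == "secretory pathway":
        return bool(sites & SECRETORY_SITES)
    return False


def accuracy(hits):
    """Percentage of positive predictions in a list of booleans."""
    return 100.0 * sum(hits) / len(hits)


# A cyto-nuc protein predicted as nuclear counts as a hit for any predictor.
print(is_positive("nucleus", ["cytoplasm", "nucleus"], low_resolution=False))
# An ER protein called "secretory pathway" is a hit only at low resolution.
print(is_positive("secretory pathway", ["ER"], low_resolution=True))
```

Keeping the low-resolution relaxation behind an explicit flag makes it clear that the same prediction can be scored differently in the two resolution groups.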

Table 2 Comparison of experimental localization results for 52 Chr.21 proteins to in silico low-resolution predictions
Table 3 Comparison of experimental localization results for 52 Chr.21 proteins to in silico high-resolution predictions

The total number of positive predictions consistent with the experimental findings was tallied for each program; the percentage of prediction accuracy is shown next to the name of the prediction program in Figs. 1 and 2. Among the low-resolution predictors, the three recently published programs MultiLoc2, ESLPred2 and BaCelLo were found to have similar prediction accuracies, with 75% (MultiLoc2-LowReso, ESLPred2) and 71% (BaCelLo) agreement with the experimental data. A relatively low percentage of positive prediction, 60%, was observed for SubLoc, which was developed in 2001.

Fig. 1

Comparison of the prediction performances of five computational predictors with high resolution. Prediction performance varied among the different programs. SherLoc2 and WoLF-PSORT showed the highest agreement with the experimental results (indicated as Hek), at 83% and 75%, respectively, which was considerably better than pTARGET (60%), ProtComp8 (56%) and PA-SUB v2.5 (54%). Prediction accuracy was found to be associated with the specific localization site. Abbreviations: Nuc nucleus, Cyto cytoplasm, PM plasma membrane, ER endoplasmic reticulum, Lyso lysosome and endosome. *For the proteins with dual localization sites, all five of the predictors predicted only one site, but such predictions were still counted as full correct predictions

Fig. 2

Comparison of the prediction performances of four computational predictors with low resolution. The recently developed predictors were found to have similar prediction accuracies, with 75% (MultiLoc2-LowReso, ESLPred2) and 71% (BaCelLo) agreement with the experimental data (indicated as Hek). A relatively low percentage of positive prediction, 60%, was observed for SubLoc, which was developed in 2001. Prediction accuracy was found to be associated with the specific localization site. Abbreviations: Nuc nucleus, Cyto cytoplasm, Secr. path. secretory pathway protein (including plasma membrane, ER, Golgi and lysosomal proteins in this study)

The high-resolution predictors differed widely in accuracy. SherLoc2 and WoLF-PSORT displayed the highest accuracy, at 83% and 75%, respectively, which was considerably better than pTARGET (60%), ProtComp8 (56%) and PA-SUB v2.5 (54%). This variation in performance may originate from the different prediction methods that each program utilizes. The two best predictors in each resolution group (MultiLoc2 and ESLPred2, and SherLoc2 and WoLF-PSORT) share a common feature: they all combine a wide range of prediction methods based on amino acid composition, sorting signals and homology. This finding indicates that the combination of homology information with sequence-based prediction can greatly improve the accuracy of protein localization prediction. On the other hand, the low success rate of PA-SUB (54%) suggests that searching for the localization of homologs alone is not powerful enough to produce a reliable prediction. The main problem with an approach based only on homology is that the prediction results can be ambiguous if no homologous proteins with annotated localizations are available. In this study, the localization of 10 out of 52 proteins could not be predicted using PA-SUB. This incompleteness creates a significant challenge when using homolog-based programs for genome-wide predictions of protein localization.

To evaluate whether prediction performance was associated with the specific localization site, the prediction results were grouped into different categories based on the experimental localization results. The number of predictions consistent with the experimental data was counted for each localization category and is shown in Figs. 1, 2. For the low-resolution predictors, the localization sites appeared to be irrelevant to prediction performance; the only exception was SubLoc, which could only predict seven out of 16 cytoplasmic proteins, a much smaller number than obtained with the other three programs. The performance similarity of these programs seemed reasonable because about 30% of the test proteins fell into the secretory pathway category.
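The per-category tally described above amounts to a simple grouped count. The Python sketch below is one possible way to compute it, assuming each record pairs an experimental localization category with a flag that says whether a given predictor's call was scored as a hit; the function name `hits_by_category` and the toy records are hypothetical and are not data from this study.

```python
from collections import defaultdict


def hits_by_category(records):
    """Tally (hits, total) per experimental localization category.

    records: iterable of (experimental_category, is_hit) pairs for one predictor.
    """
    tally = defaultdict(lambda: [0, 0])  # category -> [hits, total]
    for category, hit in records:
        tally[category][1] += 1
        if hit:
            tally[category][0] += 1
    return {category: tuple(counts) for category, counts in tally.items()}


# Toy records for illustration only (not taken from Tables 2 and 3).
toy_records = [("Cyto", True), ("Cyto", False), ("Nuc", True), ("ER", False)]
for category, (hits, total) in hits_by_category(toy_records).items():
    print(f"{category}: {hits} out of {total}")
```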

For the high-resolution predictors, prediction accuracy was closely correlated with the localization site. For example, PA-SUB showed high accuracy in predicting cytoplasmic proteins (13 out of 16) but missed all 12 of the plasma membrane proteins, of which over 80% could be predicted by the other four predictors. ProtComp8 and pTARGET, on the other hand, tended to have lower accuracy in predicting cytoplasmic proteins, scoring below 40%. A different trend was observed for the prediction of ER proteins. Interestingly, in spite of the existence of a signal peptide (SP), the first and most extensively studied protein sorting signal, all five of the predictors tended to miss the proteins residing in the ER. Instead, the ER proteins (e.g., C21orf69 and TMPRSS3a) were often misclassified as extracellular secretory or plasma membrane proteins. This is very likely because most secretory and plasma membrane proteins also carry an SP in their amino acid sequences.

Discussion

The localization site-dependent performance shown by the different prediction programs may be attributable to the different prediction strategies utilized by each program and to the level of knowledge available about protein trafficking mechanisms. For example, the sequence and structure of the signal peptide (SP), a motif that directs proteins to the ER membrane, are better studied than those of nuclear localization signals (NLS), which facilitates the prediction of proteins destined for the ER-associated secretory pathway (e.g., ER, Golgi, plasma membrane, lysosome/endosome and secretory proteins). This contributes to the high accuracy of low-resolution predictors that do not distinguish between specific localization sites within the pathway. For the high-resolution predictors, however, it remains difficult to distinguish between the different organelles within the secretory pathway. Hence, further studies on protein targeting motifs and their underlying mechanisms should help to improve the accuracy of protein localization predictions.

The present results demonstrate that prediction performance varies between different programs and different localization categories. Consequently, it might be advisable to use multiple localization predictors that utilize different prediction methods. Moreover, special attention should be paid to the relative confidence scores assigned to the different localization sites. Generally, a large difference between the second best score and the best one implies a reliable prediction, whereas similar scores obtained for different locations may reflect the unreliability of the prediction or may indicate that the protein has multiple localization patterns. A good example of this in our study is the C21orf7 protein. The C21orf7 (TAK1-like) gene shares homology with the human TAK1 (TGF-beta activated kinase) gene, which plays a critical role in the TGF-beta signal transduction pathway. Even though it was classified as a cytoplasmic protein by most of the predictors, ESLPred2 predicted the nucleus as the most plausible localization site; moreover, WoLF-PSORT suggested a dual localization in the cytoplasm and nucleus with 19.8% probability, second to a 24% probability of localization in the cytoplasm alone. In our previous transfected-cell array experiments (Hu et al. 2006), the actual localization of this protein was found to be quite dynamic, with a distribution in both the cytoplasm and the nucleus.
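The score-margin heuristic described above can be sketched in a few lines of Python. This is a minimal illustration rather than part of any predictor's interface: the function names and the `min_margin` threshold of 10 percentage points are assumptions chosen for the example, and only the two WoLF-PSORT values for C21orf7 quoted above are used as input.

```python
def score_margin(scores):
    """Return (best_site, margin between the best and second-best score)."""
    if len(scores) < 2:
        raise ValueError("need at least two scored sites")
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    (best_site, best), (_, second) = ranked[0], ranked[1]
    return best_site, best - second


def is_ambiguous(scores, min_margin=10.0):
    """Flag a prediction whose top two scores are close together.

    min_margin is an arbitrary, assumed threshold for illustration.
    """
    _, margin = score_margin(scores)
    return margin < min_margin


# WoLF-PSORT scores for C21orf7 as quoted in the text (percent).
c21orf7 = {"cytoplasm": 24.0, "cyto-nuc": 19.8}
site, margin = score_margin(c21orf7)
print(site, round(margin, 1), is_ambiguous(c21orf7))
```

A small margin does not say which alternative is correct; it only marks the protein as a candidate for multiple localization or for experimental follow-up.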

In some cases the predictions may still be incorrect even though the majority of the predictors report the same localization. In this study the actual localization of several proteins was in disagreement with most of the predictions. For example, the WDR4 gene encodes a member of the WD-repeat protein family and is a candidate for some disorders mapped to 21q22.3 and for Down syndrome phenotypes (Michaud et al. 2000). Although BaCelLo and ESLPred2 predicted it to be a nuclear protein, the other seven programs predicted that it is either a cytoplasmic protein or exported from the cell. In the actual experiment, WDR4 proteins were found to reside in the nucleus, distributed within the nucleoplasm. The yeast homolog of WDR4, Trm82, has been previously reported to be required for 7-methylguanosine modification of tRNA (Alexandrov et al. 2002). Because this pre-tRNA processing is known to take place in the nucleoplasm before the resulting mature tRNAs are transported out to the cytoplasm (Lodish et al. 2000), Trm82 was expected to localize in the nucleus, especially in the nucleoplasm, as we observed for WDR4. Although the functional role of WDR4 in human cells has not been experimentally verified, Alexandrov et al. have found that WDR4, in a complex with METTL1, is required for the 7-methylguanosine modification of yeast tRNA (Alexandrov et al. 2002). In conjunction with our localization results, this finding suggests that human WDR4 performs a tRNA-processing function similar to that of its yeast homolog.

Taken together, despite the relatively small number of proteins analyzed in this study, our results indicate a generally lower prediction accuracy (54–83%) than that claimed for recently published predictors; for instance, ESLPred2 was reported to have an accuracy of over 90% (Garg and Raghava 2008). Nevertheless, SherLoc2, MultiLoc2, ESLPred2 and WoLF-PSORT performed considerably better than the other programs evaluated in our study. The predictors that showed the best performance were SherLoc2 and WoLF-PSORT. Both programs can carry out high-resolution predictions of at least nine subcellular localizations, which is an additional advantage beyond their high prediction accuracy. Their strong performance is likely related to the multi-dimensional biological information they integrate into their prediction strategies, ranging from amino acid composition and the presence of sorting signals and targeting motifs to homology profiles and Gene Ontology terms.

In summary, the differences in the accuracy of subcellular protein localization predictions presented in this study strongly suggest that the outcomes of in silico localization predictions should be treated with caution, and that it is always beneficial to compare the results provided by different prediction algorithms.