jEcho: an Evolved weight vector to CHaracterize the protein’s posttranslational modification mOtifs

Protein posttranslational modification (PTM) is a major mechanism for the dynamic regulation of protein function after polypeptide chains are translated from mRNA molecules. Compared with the costly and labor-intensive wet-laboratory characterization of PTMs, the computational detection of PTM residues has become a major complementary technique in recent years. Previous studies demonstrated that the positions flanking a PTM residue contribute unequally to its computational detection, but did not directly translate this observation into in silico PTM prediction. We propose a weight vector to represent the varying contributions of the PTM-flanking positions and use an evolutionary algorithm to optimize the vector. Even a simple nearest neighbor algorithm that incorporates the optimized weight vector outperforms the currently available algorithms. The algorithm is implemented as an easy-to-use computer program, jEcho version 1.0. The implementation language, Java, makes jEcho platform-independent and visually interactive. The predicted results may be directly exported as publication-quality images or text files. jEcho may be downloaded from http://www.healthinformaticslab.org/supp/.

Electronic supplementary material: The online version of this article (doi:10.1007/s12539-015-0260-2) contains supplementary material, which is available to authorized users.


Supplementary Table S1
The optimized weight vectors used in the program jEcho version 1.0.
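
The weight vectors above assign one weight to each PTM-flanking position. As a rough illustration of how such a vector could enter a nearest neighbor prediction and be tuned by an evolutionary search, the following Java sketch may help. It is not jEcho's code: the mismatch-based distance, the leave-one-out accuracy used as fitness, the (1+1) mutation scheme, the class and method names, and the toy peptides are all assumptions made for this example.

```java
import java.util.Arrays;
import java.util.Random;

/** Hypothetical sketch, not jEcho's code: position-weighted 1-nearest-neighbor on flanking peptides. */
final class WeightedPeptideSketch {

    // Toy 5-residue peptides centered on a serine; invented for illustration only.
    static final String[] PEPTIDES = {"AKSPL", "GRSPE", "LDSPK", "AKSGL", "GRSGE", "LDSGK"};
    static final boolean[] LABELS = {true, true, true, false, false, false};

    /** Assumed distance: sum of the weights at positions where the two peptides differ. */
    static double distance(String a, String b, double[] w) {
        if (a.length() != w.length || b.length() != w.length) {
            throw new IllegalArgumentException("peptides and weight vector must have equal length");
        }
        double d = 0.0;
        for (int i = 0; i < w.length; i++) {
            if (a.charAt(i) != b.charAt(i)) d += w[i];
        }
        return d;
    }

    /** Label of the nearest training peptide; index `skip` is ignored (pass -1 to ignore none). */
    static boolean predict(String query, String[] peptides, boolean[] labels, double[] w, int skip) {
        int best = -1;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int i = 0; i < peptides.length; i++) {
            if (i == skip) continue;
            double d = distance(query, peptides[i], w);
            if (d < bestDist) { bestDist = d; best = i; }
        }
        if (best < 0) throw new IllegalArgumentException("no training peptide available");
        return labels[best];
    }

    /** Assumed fitness of a weight vector: leave-one-out accuracy of the 1-nearest-neighbor rule. */
    static double fitness(String[] peptides, boolean[] labels, double[] w) {
        int correct = 0;
        for (int i = 0; i < peptides.length; i++) {
            if (predict(peptides[i], peptides, labels, w, i) == labels[i]) correct++;
        }
        return (double) correct / peptides.length;
    }

    /** (1+1) evolution: perturb one weight per generation and keep the child if it is no worse. */
    static double[] evolve(String[] peptides, boolean[] labels, int length, int generations, long seed) {
        Random rng = new Random(seed);
        double[] parent = new double[length];
        Arrays.fill(parent, 1.0); // start from uniform weights
        double parentFit = fitness(peptides, labels, parent);
        for (int g = 0; g < generations; g++) {
            double[] child = parent.clone();
            int pos = rng.nextInt(length);
            child[pos] = Math.max(0.0, child[pos] + 0.5 * rng.nextGaussian()); // weights stay non-negative
            double childFit = fitness(peptides, labels, child);
            if (childFit >= parentFit) { parent = child; parentFit = childFit; }
        }
        return parent;
    }

    public static void main(String[] args) {
        double[] uniform = {1, 1, 1, 1, 1};
        double[] evolved = evolve(PEPTIDES, LABELS, 5, 500, 42L);
        System.out.println("uniform fitness: " + fitness(PEPTIDES, LABELS, uniform));
        System.out.println("evolved weights: " + Arrays.toString(evolved));
        System.out.println("evolved fitness: " + fitness(PEPTIDES, LABELS, evolved));
    }
}
```

In this toy set, each peptide differs from a peptide of the opposite label only at the fourth position, so uniform weights misclassify every peptide under leave-one-out, while a sufficiently large weight at that position (for example {1, 1, 1, 4, 1}) classifies all of them correctly. Because a child replaces its parent only when it is at least as fit, the evolved vector is never less fit than the uniform starting point.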

jEcho's user guide
To make these models easy to use, we implement the Echo algorithm and its nine classification models as the computer program jEcho version 1.0 (Figure 1).
The programming environment is Eclipse version 4.2.1, with the Java Runtime Environment version 1.7.0_21. We do not rely on any recently introduced language features, so jEcho version 1.0 should run under most Java versions across different operating systems.
Following the suggested functionalities of a PTM prediction server/program [1], we design the user interface of jEcho so that all functions are available in one window, as shown in Figure 1 and Supplementary Figure S1 (a). The user may get a quick demo by clicking the "Example" button, which loads two protein sequences from Acetobacter pasteurianus IFO 3283-01 into the query sequence box in the top right corner of the window. To simplify the demo, the kinase MAPK8 is also selected automatically, as shown in Supplementary Figure S1 (b).
After the user clicks the "Submit" button, nine predicted phosphoserine/threonine residues are detected in the two protein sequences and listed in the result table box in the bottom right corner.
A number of productivity tools are provided to facilitate biologists' exploration of the predicted PTM residues. The "Sort" function helps the user investigate residues modified by different PTM types, as shown in Supplementary Figure S1 (c). The user may investigate all the PTM residues in a given protein by sorting the prediction result lines by the column "Query ID", and all the residues modified by a given PTM by clicking the column "Enzyme". The "residue locating" function quickly locates a candidate PTM residue and its flanking peptide when the user clicks the corresponding line of the prediction results, as shown in Supplementary Figure S1 (d). This function is useful when there are dozens of predicted PTM residues across multiple user-input sequences. The "PTM residue plotting" function shows the distribution of predicted PTM residues in the current protein sequence, as shown in Supplementary Figure S1 (e). The "PTM type searching" function helps the user quickly find the PTM types of interest, as shown in Supplementary Figure S1 (f).
After the PTM types and target protein sequences have been finalized, all the results in jEcho may be exported for further analysis or publication. The candidate PTM residues in the bottom right box may be exported as a TEXT file or a PDF file by clicking the "Export result" button. The results in the TEXT format may be used for large-scale association analysis. The distribution plot of candidate PTM residues in a protein may be saved as a vector image (SVG format) or a pixel image (JPG format). The SVG image may be processed and saved as a pixel image with publication-quality resolution (e.g. 300 dpi) by the computer programs Inkscape (free GPL2 license) or Adobe Illustrator (commercial license). The JPG format may be processed by the programs ImageMagick (free Apache 2.0 license) or Adobe Photoshop (commercial license).