Methods for the bioinformatic identification of bacterial lipoproteins encoded in the genomes of Gram-positive bacteria
Bacterial lipoproteins are a diverse and functionally important group of proteins that are amenable to bioinformatic analyses because of their unique signal peptide features. Here we have used a dataset of sequences of experimentally verified lipoproteins of Gram-positive bacteria to refine our previously described lipoprotein recognition pattern (G+LPP). Sequenced bacterial genomes can be screened for putative lipoproteins using the G+LPP pattern. The sequences identified can then be validated using online tools for lipoprotein sequence identification. We have used our protein sequence datasets to evaluate six online tools for efficacy of lipoprotein sequence identification. Our analyses demonstrate that LipoP (http://www.cbs.dtu.dk/services/LipoP/) performs best individually but that a consensus approach, incorporating outputs from predictors of general signal peptide properties, is most informative.
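The screen-then-validate workflow described above can be sketched in a few lines of Python. This is only a rough illustration under stated assumptions: the regular expression below is a simplified, hypothetical lipobox-like pattern (N-terminal Met, a positively charged residue, an uncharged hydrophobic stretch, then a short motif ending in Cys) and is not the published G+LPP pattern; the function names `screen` and `consensus` and the example sequences are invented for illustration, and the per-tool calls are hand-entered placeholders rather than real predictor output.

```python
import re

# ASSUMPTION: a simplified, illustrative lipobox-like pattern.
# It is NOT the G+LPP pattern from the paper.
ILLUSTRATIVE_PATTERN = re.compile(
    r"M.{0,13}[RK][^DERK]{6,20}[LVIAM][ASTG][AG]C"
)


def screen(proteins):
    """Return IDs whose N-terminus matches the illustrative pattern.

    `proteins` maps a sequence ID to a one-letter amino acid string.
    `re.match` anchors the pattern at the start of the sequence.
    """
    return [pid for pid, seq in proteins.items()
            if ILLUSTRATIVE_PATTERN.match(seq)]


def consensus(calls, min_votes=2):
    """Return True when at least `min_votes` tools call a lipoprotein.

    `calls` maps a tool name to a boolean that the user has recorded
    by hand from that tool's output (hypothetical input format).
    """
    return sum(bool(v) for v in calls.values()) >= min_votes


# Hypothetical example sequences (invented, not from the paper's dataset).
proteins = {
    "candidate_1": "MKKLLSTLAVLLLAGCSSNQ",
    "candidate_2": "MSEQNNTEMTFQIQRIYTKDISFEAPNAPHVFQ",
}
hits = screen(proteins)

# Placeholder calls for one pattern hit; real values would come from
# submitting the sequence to each online tool.
calls = {"LipoP": True, "LIPPRED": True, "signal_peptide_predictor": False}
accepted = consensus(calls)
```

Requiring agreement between at least two tools is an arbitrary threshold chosen for the sketch; the abstract states only that a consensus approach is most informative, not how the votes should be weighted.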
Keywords: Lipoproteins, Signal peptides, Bioinformatics, Genomics, Firmicutes, Actinobacteria
The authors thank Northumbria University for financial support from the ‘Research into Teaching’ programme.
- Nielsen H, Krogh A (1998) Prediction of signal peptides and signal anchors by a hidden Markov model. In: Proceedings of the sixth international conference on intelligent systems for molecular biology (ISMB 6), AAAI Press, Menlo Park, California, pp 122–130
- Sutcliffe IC, Harrington DJ (2002) Pattern searches for the identification of putative lipoprotein genes in Gram-positive bacterial genomes. Microbiology 148:2065–2077
- Sutcliffe IC, Russell RRB (1995) Lipoproteins of Gram-positive bacteria. J Bacteriol 177:1123–1128
- Sutcliffe IC, Tao L, Ferretti JJ, Russell RRB (1993) MsmE, a lipoprotein involved in sugar transport in Streptococcus mutans. J Bacteriol 175:1853–1855
- Taylor PD, Toseland CP, Attwood CK, Flower DR (2006) LIPPRED: a web server for accurate prediction of lipoprotein signal sequences and cleavage sites. Bioinformation 1:176–179