Codon usage and base composition in Rickettsia prowazekii
- Cite this article as:
- Andersson, S.G.E. & Sharp, P.M. J Mol Evol (1996) 42: 525. doi:10.1007/BF02352282
Codon usage and base composition in sequences from the A + T-rich genome of Rickettsia prowazekii, a member of the alpha Proteobacteria, have been investigated. Synonymous codon usage patterns are roughly similar among genes, even though the data set includes genes expected to be expressed at very different levels, indicating that translational selection has been ineffective in this species. However, multivariate statistical analysis differentiates genes according to their G + C contents at the first two codon positions. To study this variation, we have compared the amino acid composition patterns of 21 R. prowazekii proteins with that of a homologous set of proteins from Escherichia coli. The analysis shows that individual genes have been affected by biased mutation rates to very different extents: genes encoding proteins highly conserved among other species being the least affected. Overall, protein coding and intergenic spacer regions have G + C content values of 32.5% and 21.4%, respectively. Extrapolation from these values suggests that R. prowazekii has around 800 genes and that 60–70% of the genome may be coding.
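The two quantities discussed above, G + C content by codon position and the coding fraction extrapolated from the coding and spacer G + C values, can be illustrated with a minimal Python sketch. This is not necessarily the authors' procedure: it assumes a simple two-component mixing model (genome G + C as a weighted average of coding and spacer G + C), and the genome-wide G + C value passed in the usage lines is a hypothetical input that is not given in the abstract. The function names are illustrative only.

```python
def gc_by_codon_position(cds):
    """Return percent G + C at codon positions 1, 2 and 3 of a coding sequence.

    Trailing bases that do not complete a codon are ignored.
    """
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3
    if usable == 0:
        raise ValueError("sequence must contain at least one full codon")
    percents = []
    for pos in range(3):
        bases = seq[pos:usable:3]  # every third base, starting at this codon position
        gc = sum(base in "GC" for base in bases)
        percents.append(100.0 * gc / len(bases))
    return tuple(percents)


def coding_fraction(gc_genome, gc_coding=32.5, gc_spacer=21.4):
    """Solve gc_genome = f * gc_coding + (1 - f) * gc_spacer for f.

    Defaults are the coding and spacer G + C values quoted in the abstract.
    The mixing model itself is an assumption of this sketch.
    """
    if gc_coding == gc_spacer:
        raise ValueError("coding and spacer G + C must differ")
    return (gc_genome - gc_spacer) / (gc_coding - gc_spacer)


# Usage: 29.0 is an assumed genome-wide G + C percentage, not a value from the abstract.
fraction = coding_fraction(29.0)
gc1, gc2, gc3 = gc_by_codon_position("ATGGCC")
```

Under this model, an assumed genome-wide G + C content of about 29% gives a coding fraction within the 60–70% range quoted above; other input values shift the estimate accordingly.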