Abstract
The parasite Plasmodium falciparum, responsible for the most deadly form of human malaria, has one of the most AT-rich genomes sequenced so far and is known to possess many atypical characteristics. Using multivariate statistical approaches, the present study analyzes the amino acid usage pattern in 5038 annotated protein-coding sequences in P. falciparum clone 3D7. The amino acid composition of individual proteins, though dominated by the directional mutational pressure, exhibits wide variation across the proteome. The Asn content, expression level, mean molecular weight, hydropathy, and aromaticity are found to be the major sources of variation in amino acid usage. At all stages of development, frequencies of residues encoded by GC-rich codons, such as Gly, Ala, Arg, and Pro, increase significantly in the products of the highly expressed genes. Investigation of nucleotide substitution patterns in P. falciparum and other Plasmodium species reveals that the nonsynonymous sites of highly expressed genes are more conserved than those of the lowly expressed ones, though for synonymous sites the reverse is true. The highly expressed genes are, therefore, expected to be closer to their putative ancestral state in amino acid composition, and a plausible reason for their sequences being GC-rich at nonsynonymous codon positions could be that their ancestral state was less AT-biased. The negative correlation between the expression levels of proteins and their molecular weights supports the notion that P. falciparum, in spite of its intracellular parasitic lifestyle, follows the principle of cost minimization.
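The per-protein quantities named in the abstract (Asn content, hydropathy, aromaticity, and the share of residues encoded by GC-rich codons) can be illustrated with a minimal Python sketch. This is not the authors' pipeline. It assumes hydropathy is the mean Kyte-Doolittle value (GRAVY), aromaticity is the combined frequency of Phe, Tyr, and Trp, and the GC-rich set is limited to the four residues the abstract lists. The function names `composition` and `summarize` are hypothetical.

```python
from collections import Counter

# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
AROMATIC = "FYW"   # assumption: aromaticity = Phe + Tyr + Trp frequency
GC_RICH = "GARP"   # residues the abstract names as encoded by GC-rich codons


def composition(seq):
    """Return the relative frequency of each standard amino acid in seq."""
    residues = [aa for aa in seq.upper() if aa in KD]  # skip X, *, gaps
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    counts = Counter(residues)
    return {aa: counts.get(aa, 0) / len(residues) for aa in KD}


def summarize(seq):
    """Compute illustrative per-protein indices from amino acid frequencies."""
    freq = composition(seq)
    return {
        "asn": freq["N"],
        "aromaticity": sum(freq[aa] for aa in AROMATIC),
        "gravy": sum(freq[aa] * KD[aa] for aa in KD),
        "gc_rich": sum(freq[aa] for aa in GC_RICH),
    }


if __name__ == "__main__":
    # Toy peptide, not a real P. falciparum protein.
    print(summarize("MNNNKNGAFW"))
```

Applied to every protein in a proteome, the frequency vectors from `composition` would form the kind of table that a multivariate method such as correspondence analysis takes as input, and the indices from `summarize` could then be correlated with the resulting axes.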
Acknowledgments
This work was supported by the Council of Scientific and Industrial Research (Project CMM 0017) and Department of Biotechnology, Government of India (Grant BT/BI/04/055-2001). We thank Mr. S. Chatterjee and Mr. S. Paul of Bioinformatics Centre, IICB, for their technical support and Ms. S. Ghosh, Human Genetics & Genomics Group, IICB, for critical reading of the manuscript.
Additional information
[Reviewing Editor: Dr. Richard Kliman]
Cite this article
Chanda, I., Pan, A. & Dutta, C. Proteome Composition in Plasmodium falciparum: Higher Usage of GC-Rich Nonsynonymous Codons in Highly Expressed Genes. J Mol Evol 61, 513–523 (2005). https://doi.org/10.1007/s00239-005-0023-5
DOI: https://doi.org/10.1007/s00239-005-0023-5