Abstract
Separating plant and fungal sequences in expressed sequence tag (EST) pools by bioinformatic methods is difficult because of sequence similarities between plants and fungi, insufficient sequence information, and the short length of the isolated fragments. An algorithm and software that use differences in codon usage bias to discriminate between plant and fungal sequences are described. The software (PF-IND) includes five pairs of fungi and their host plants, which can be used to analyze a large number of related species. Analysis of a sequence yields an arbitrary value that indicates how likely the sequence is to be a fungal or a plant gene. The software can distinguish between homologous fungal and plant genes, and it helps identify the correct reading frame of unknown ESTs for which BLAST analyses do not provide clear information. Short sequences of 100–150 bp can be analyzed with high confidence. PF-IND analysis of 100 sequences derived from fungus-infected plants identified the origin of 94 sequences, whereas BLASTX analysis of the same 100 ESTs identified only 66. Overall, PF-IND is a novel bioinformatic tool aimed at assisting research on fungus–plant interactions.
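The general idea of codon-usage-based discrimination can be illustrated with a minimal Python sketch. This is not the PF-IND algorithm, whose scoring formula is not given in the abstract. It assumes a simple log-odds score over codon frequencies, and it assumes that the reading frame with the largest absolute score is the most informative one. The function names, the pseudocount, and the tiny training sequences are all hypothetical and chosen only so that the example runs.

```python
import math
from collections import Counter
from itertools import product

CODONS = ["".join(c) for c in product("ACGT", repeat=3)]


def codon_frequencies(training_seqs, pseudocount=1.0):
    """Estimate codon frequencies from in-frame coding sequences.

    A pseudocount is added to every codon so that unseen codons
    do not produce zero probabilities.
    """
    counts = Counter()
    for seq in training_seqs:
        seq = seq.upper()
        for i in range(0, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if set(codon) <= set("ACGT"):  # skip ambiguous bases
                counts[codon] += 1
    total = sum(counts[c] for c in CODONS) + pseudocount * len(CODONS)
    return {c: (counts[c] + pseudocount) / total for c in CODONS}


def log_odds_score(seq, fungal_freq, plant_freq, frame=0):
    """Sum log(P_fungal / P_plant) over the codons of one reading frame.

    Positive values lean fungal, negative values lean plant
    (an assumed convention, not taken from the paper).
    """
    seq = seq.upper()
    score = 0.0
    for i in range(frame, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if codon in fungal_freq:  # ignore codons with ambiguous bases
            score += math.log(fungal_freq[codon] / plant_freq[codon])
    return score


def best_frame(seq, fungal_freq, plant_freq):
    """Return (frame, score) for the forward frame with the largest |score|.

    Assumed heuristic: the true coding frame shows the strongest bias.
    """
    scores = [(f, log_odds_score(seq, fungal_freq, plant_freq, f))
              for f in range(3)]
    return max(scores, key=lambda fs: abs(fs[1]))


# Toy training data (invented for illustration, not real genes):
# the "fungal" set prefers C/G-ending codons, the "plant" set A/T-ending ones.
fungal_train = ["GCCAAGGTCGACCTCGCC", "AAGGCCGTCCTCGACAAG"]
plant_train = ["GCTAAAGTTGATCTTGCT", "AAAGCTGTTCTTGATAAA"]

fungal_freq = codon_frequencies(fungal_train)
plant_freq = codon_frequencies(plant_train)

fungal_like = log_odds_score("GCCGTCAAGGAC", fungal_freq, plant_freq)
plant_like = log_odds_score("GCTGTTAAAGAT", fungal_freq, plant_freq)
frame, frame_score = best_frame("GCCGTCAAGGAC", fungal_freq, plant_freq)
```

With these toy tables the first query receives a positive score and the second a negative one. In practice the frequency tables would be estimated from large sets of known coding sequences for a specific fungus and its host plant, and a cut-off would need to be calibrated on sequences of known origin.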
Acknowledgement
This work was supported by The Israeli Ministry of Sciences, Grant 1336.
Additional information
Communicated by S. Hohmann
Cite this article
Maor, R., Kosman, E., Golobinski, R. et al. PF-IND: probability algorithm and software for separation of plant and fungal sequences. Curr Genet 43, 296–302 (2003). https://doi.org/10.1007/s00294-003-0394-3
DOI: https://doi.org/10.1007/s00294-003-0394-3