, Volume 20, Issue 4, pp 446-453

Test Data Sets and Evaluation of Gene Prediction Programs on the Rice Genome

Purchase on Springer.com

$39.95 / €34.95 / £29.95*

Rent the article at a discount

Rent now

* Final gross prices may vary according to local VAT.

Get Access

Abstract

With several rice genome projects approaching completion gene prediction/finding by computer algorithms has become an urgent task. Two test sets were constructed by mapping the newly published 28,469 full-length KOME rice cDNA to the RGP BAC clone sequences of Oryza sativa ssp. japonica: a single-gene set of 550 sequences and a multi-gene set of 62 sequences with 271 genes. These data sets were used to evaluate five ab initio gene prediction programs: RiceHMM, GlimmerR, GeneMark, FGENSH and BGF. The predictions were compared on nucleotide, exon and whole gene structure levels using commonly accepted measures and several new measures. The test results show a progress in performance in chronological order. At the same time complementarity of the programs hints on the possibility of further improvement and on the feasibility of reaching better performance by combining several gene-finders.