Automated interpretation of metabolic capacity from genome and metagenome sequences
The KEGG pathway maps are widely used as a reference data set for inferring high-level functions of an organism or an ecosystem from its genome or metagenome sequence data. The KEGG modules, which are tighter functional units often corresponding to subpathways in the KEGG pathway maps, are designed for better automation of genome interpretation. Each KEGG module is represented by a simple Boolean expression of KEGG Orthology (KO) identifiers (K numbers), which makes it possible to evaluate automatically whether the genes required for the module are all present in a genome. Here we focus on metabolic functions and introduce reaction modules for improving annotation and signature modules for inferring metabolic capacity. We also describe how genome annotation is performed in KEGG using the manually created KO database and the computationally generated SSDB database. The resulting KEGG GENES database with KO (K number) annotation is a reference sequence database against which newly determined genomes can be compared for automated annotation and interpretation.
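To illustrate how a Boolean expression of K numbers can be evaluated against a genome, the following is a minimal sketch in Python. It assumes a simplified notation in which a space means AND (consecutive steps), a comma means OR (alternative genes), and parentheses group sub-expressions, with no whitespace around commas. The function names and the example expression are hypothetical, and the K numbers are arbitrary placeholders rather than an actual module definition; this is not the implementation used by KEGG.

```python
import re


def tokenize(expr):
    """Split an expression into K numbers, parentheses, commas, and
    whitespace runs (whitespace is kept because it acts as AND here)."""
    return re.findall(r"K\d{5}|[(),]|\s+", expr.strip())


def is_complete(expr, genome_kos):
    """Return True if the set of K numbers annotated in a genome
    satisfies the Boolean expression (assumed simplified notation)."""
    tokens = tokenize(expr)
    pos = 0

    def parse_and():
        # Lowest precedence: steps separated by whitespace must all hold.
        nonlocal pos
        value = parse_or()
        while pos < len(tokens) and tokens[pos].isspace():
            pos += 1
            rhs = parse_or()  # always parse, then combine
            value = value and rhs
        return value

    def parse_or():
        # Higher precedence: comma-separated alternatives within a step.
        nonlocal pos
        value = parse_atom()
        while pos < len(tokens) and tokens[pos] == ",":
            pos += 1
            rhs = parse_atom()
            value = value or rhs
        return value

    def parse_atom():
        # A K number, or a parenthesised sub-expression.
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            value = parse_and()
            pos += 1  # skip the closing ")"
            return value
        return tok in genome_kos

    return parse_and()


# Placeholder expression: (step 1: either gene) AND (step 2) AND (step 3)
module_expr = "(K00001,K00002) K00003 (K00004 K00005,K00006)"
genome_a = {"K00002", "K00003", "K00004", "K00006"}
genome_b = genome_a - {"K00003"}  # the second step is missing

print(is_complete(module_expr, genome_a))
print(is_complete(module_expr, genome_b))
```

In this sketch a module counts as complete only when every step is satisfied by at least one annotated K number; a practical tool would likely also report which steps are missing, so that nearly complete modules can be distinguished from absent ones.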
Keywords: metabolic pathway; functional module; genome annotation; genome interpretation; KEGG database
© Higher Education Press and Springer-Verlag GmbH 2013