Abstract
Genome sequencing projects generate vast amounts of data of widely varying types and complexity, and at a growing pace. Traditionally, the annotation of such sequences was difficult to share with other researchers. Although this has improved with the development and application of biological ontologies, annotation efforts remain isolated because only a limited amount of information from other annotation projects can be reused. In addition, they do not benefit from the translational information available for genomic sequences. In this paper, we describe a system that supports genome annotation processes by providing information about orthologous genes and the genetic disorders that can be associated with a gene identified in a sequence. The seamless integration of such data will be facilitated by an ontological infrastructure which, following best practices in ontology engineering, will reuse existing biological ontologies such as the Sequence Ontology and the Ontological Gene Orthology.
Acknowledgements
This project has been made possible by funding from the Spanish Ministry of Science and Innovation through grant TIN2010-21388-C02-02, cofunded by the FEDER Programme. M.C. Legaz-García is supported by the Fundación Séneca through fellowship 15555/FPI/2010. M. Madrid is supported by the Spanish Ministry of Science and Innovation through fellowship JCI-2010-07513.
Conflict of interest
The authors declare that they have no conflict of interest.
Cite this article
Legaz-García, M., Miñarro-Giménez, J.A., Madrid, M. et al. Linking Genome Annotation Projects with Genetic Disorders using Ontologies. J Med Syst 36 (Suppl 1), 11–23 (2012). https://doi.org/10.1007/s10916-012-9890-7
DOI: https://doi.org/10.1007/s10916-012-9890-7