Abstract
The dynamic structure and functions of genomes are being revealed as genomics technologies and research advance. Evidence of regional characteristics of the genome (genome annotations in a broad sense) provides the basis for further analyses. Using such data, candidate targets can be listed and screened effectively in silico. Here, we describe steps to obtain publicly available genome annotations or to construct new annotations from your own analyses, and we give an overview of the types of genome annotations available and their corresponding resources.
Acknowledgments
This work was supported by a Research Grant for the RIKEN Genome Exploration Research Project provided by the Ministry of Education, Culture, Sports, Science and Technology (MEXT), a grant to the Genome Network Project from MEXT, a Research Grant from MEXT to the RIKEN Center for Life Science Technologies, a Research Grant to RIKEN Preventive Medicine and a Diagnosis Innovation Program from MEXT to Yoshihide Hayashizaki.
Copyright information
© 2017 Springer Science+Business Media New York
About this protocol
Cite this protocol
Abugessaisa, I., Kasukawa, T., Kawaji, H. (2017). Genome Annotation. In: Keith, J. (eds) Bioinformatics. Methods in Molecular Biology, vol 1525. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-6622-6_5
DOI: https://doi.org/10.1007/978-1-4939-6622-6_5
Publisher Name: Humana Press, New York, NY
Print ISBN: 978-1-4939-6620-2
Online ISBN: 978-1-4939-6622-6
eBook Packages: Springer Protocols