Annotation of the domestic dog genome sequence: finding the missing genes

Mammalian Genome

Abstract

There are over 350 genetically distinct breeds of domestic dog, which vary considerably in morphology, physiology, and disease susceptibility. The genome sequence of the domestic dog was assembled and released in 2005, providing an estimated 20,000 protein-coding genes, a great asset to the scientific community that uses the dog as a genetic biomedical model and for comparative and evolutionary studies. Although the canine gene set has been predicted using a combination of ab initio methods, homology studies, motif analysis, and similarity-based programs, it still requires deep annotation of noncoding genes, alternative splicing, pseudogenes, regulatory regions, and gain and loss events. Such analyses could benefit from new sequencing technologies (RNA-Seq) to better exploit the advantages of the canine genetic system in tracking disease genes. Here, we review the catalog of canine protein-coding genes and the search for missing genes, and we propose rationales for the accurate identification of noncoding genes through next-generation sequencing.

Acknowledgments

We acknowledge the Centre National de la Recherche Scientifique and the University of Rennes 1 for funding. TD was supported by the Conseil Régional de Bretagne and AV was supported by the European Commission (FP7-LUPA, GA-201370). We thank Jocelyn Plassais for the dog photographs.

Author information

Corresponding author

Correspondence to Christophe Hitte.

About this article

Cite this article

Derrien, T., Vaysse, A., André, C. et al. Annotation of the domestic dog genome sequence: finding the missing genes. Mamm Genome 23, 124–131 (2012). https://doi.org/10.1007/s00335-011-9372-0

  • DOI: https://doi.org/10.1007/s00335-011-9372-0
