Abstract
In the present study, we describe the deep sequencing and structural analysis of the genome of a Holstein bull. Our aim was to obtain a high-quality Holstein bull genome reference sequence and to describe the different types of variation in this genome relative to the Hereford breed reference. We generated four mate-paired libraries and one fragment library from 30 μg of genomic DNA. Colour-space FASTA reads were mapped and paired to the reference cow (Bos taurus) genome assembly from Oct. 2011 (Baylor 4.6.1/bosTau7). Initial sequencing yielded 4,864,054,296 reads of 50 bp. The average mapping efficiency was 71.7 %; altogether, 3,494,534,136 reads and 157,928,163,086 bp were successfully mapped, resulting in 60× coverage. This is the highest coverage for a bovine genome published so far. Tertiary analysis found 6,362,988 SNPs in the bull's genome, of which 4,045,889 were heterozygous and 2,317,099 were homozygous. Annotation showed that 4,330,337 of the discovered SNPs were already listed in the dbSNP database (build 137); the remaining 2,032,651 SNPs were therefore novel. We found 312,879 large indels, which together accounted for 245,947,845 bp of variation across the genome, and 633,310 small indels, which accounted for 2,542,552 nucleotides of variation. Only 106,768 of the small indels were listed in dbSNP. Finally, we identified 2,758 inversions covering a total of 23,099,054 bp of the genome's variation; the largest inversion was 87,440 bp in size. In conclusion, high-coverage sequencing revealed different types of novel variants in the bull's genome. Better knowledge of the functions of these variants is needed.
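The counts quoted in the abstract can be cross-checked with simple arithmetic. The sketch below is not part of the authors' pipeline; it uses only the figures stated above, and the names (`summarize`, `VARIANTS`, and the constants) are hypothetical. The pooled mapped fraction it computes need not equal the quoted 71.7 % exactly, since that figure is described as an average and may have been taken across libraries (an assumption).

```python
# Figures copied from the abstract; nothing else is assumed.
TOTAL_READS = 4_864_054_296
MAPPED_READS = 3_494_534_136
MAPPED_BP = 157_928_163_086
STATED_COVERAGE = 60  # "60x coverage" as quoted

SNPS_TOTAL = 6_362_988
SNPS_HET = 4_045_889
SNPS_HOM = 2_317_099
SNPS_IN_DBSNP = 4_330_337
SNPS_NOVEL = 2_032_651

# Structural variant classes: (number of events, total bp affected).
VARIANTS = {
    "large_indels": (312_879, 245_947_845),
    "small_indels": (633_310, 2_542_552),
    "inversions": (2_758, 23_099_054),
}


def summarize():
    """Return consistency checks and derived ratios for the quoted figures."""
    return {
        # Heterozygous + homozygous SNPs should add up to the total.
        "het_plus_hom_matches_total": SNPS_HET + SNPS_HOM == SNPS_TOTAL,
        # Novel SNPs are those not already listed in dbSNP.
        "novel_matches_difference": SNPS_TOTAL - SNPS_IN_DBSNP == SNPS_NOVEL,
        # Fraction of all reads that mapped, pooled over libraries.
        "pooled_mapped_fraction": MAPPED_READS / TOTAL_READS,
        # Average number of mapped bases contributed per mapped read.
        "mean_bp_per_mapped_read": MAPPED_BP / MAPPED_READS,
        # Genome size implied by coverage = mapped bp / genome size.
        "implied_genome_size_bp": MAPPED_BP / STATED_COVERAGE,
        # Mean size of one event in each structural variant class.
        "mean_variant_size_bp": {
            name: total_bp / count
            for name, (count, total_bp) in VARIANTS.items()
        },
    }


if __name__ == "__main__":
    for key, value in summarize().items():
        print(f"{key}: {value}")
```

Dividing mapped bases by the stated coverage gives the genome size the 60× figure implies, which is a quick way to see whether a coverage claim is in line with the size of the reference assembly.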
Acknowledgments
The support provided by Mr. Lauri Anton and Mr. Martin Loginov from the High-Performance Computing Centre of the University of Tartu is gratefully acknowledged. This study was financially supported by P8001VLVL from the Estonian University of Life Sciences, by EU29023 and EU30200 from Enterprise Estonia, and by a grant from the European Regional Development Fund (Centre of Translational Medicine, University of Tartu).
Disclosures
All authors declare that they do not have any competing interests.
Cite this article
Kõks, S., Reimann, E., Lilleoja, R. et al. Sequencing and annotated analysis of full genome of Holstein breed bull. Mamm Genome 25, 363–373 (2014). https://doi.org/10.1007/s00335-014-9511-5