Epigenetics in Male Reproduction: A Practical Introduction to the Informatics of Next Generation Sequencing

  • Adrian E. Platts
  • Claudia Lalancette
  • Stephen A. Krawetz
Part of the Epigenetics and Human Health book series (EHH)


At fertilization, the male germ cell conveys a richly layered genetic landscape consisting of both DNA and its associated epigenetic information. A systems-level understanding of these forms of information could reveal some of the origins of idiopathic male infertility. Characterizing the genetic and epigenetic contributions to fertilization could also offer insight into the root causes of aberrant development, some of which may reflect the fetal origins of adult disease. As a host of new tools and techniques emerges, we have the opportunity to reassess our models of gametogenesis in the male. The challenge is no longer to construct biological models from sparse data but to assimilate the wealth of data generated by high-throughput technologies. By aggregating data from multiple high-throughput and targeted experiments, bioinformatics offers potential insight into how genetic and epigenetic information is utilized in the sperm-oocyte system. In this chapter, we review online resources that can aid in conducting an epigenetic investigation and describe approaches to managing second and third generation deep sequencing data.


Bioinformatics; Epigenetics; Imprinting; Male gamete; Micro RNA; NGS (next generation sequencing); RNA



Centers for Disease Control and Prevention


CG dinucleotides (linked by a phosphodiester bond, hence CpG)


Differential gene expression


Department of Energy


Genomic DNA


Histone acetyltransferase


Histone deacetylase


Histone methyltransferase


Imprinting control region


Joint genome initiative


Micro RNA


Next generation sequencing, also referred to as second generation sequencing or “now generation” sequencing (Illumina)


Piwi-interacting RNA


Repeat-associated small interfering RNAs


Small non coding RNA


Small nucleolar RNA


Single nucleotide polymorphism


Terabyte, 10^12 bytes (1,000 gigabytes or 1,000,000 megabytes)


Whole genome sequencing



This work is supported in part by the NIH grant HD36512, the Presidential Research Enhancement Program in Computational Biology and the Charlotte B. Failing Professorship to SAK. We gratefully acknowledge the use of the UCSC genome browser and the Ensembl genome browser, which were used in the creation of some of the illustrations.



Copyright information

© Springer Berlin Heidelberg 2011

Authors and Affiliations

  • Adrian E. Platts (1, 2)
  • Claudia Lalancette (1, 2)
  • Stephen A. Krawetz (1, 2, 3; corresponding author)
  1. Center for Molecular Medicine and Genetics, Wayne State University School of Medicine, Detroit, USA
  2. Department of Obstetrics and Gynecology, Wayne State University School of Medicine, Detroit, USA
  3. Institute for Scientific Computing, Wayne State University, Detroit, USA
