Introduction

Hypospadias is a common birth defect in boys characterized by the urethral opening forming on the ventral side of the penis rather than at the meatus. Hypospadias may further be characterized by the presence or absence of chordee, a curvature of the penis resulting from fibrous tissue and skin tethering. The etiology of hypospadias is incompletely understood, but genetics and environmental factors have been implicated in its development1. Emerging evidence suggests that genetic and environmental factors involved in the etiology of hypospadias may be heterogeneous by severity2,3,4,5. Therefore, clinical characteristics, including chordee and severity, should be considered to identify developmental pathways that lead to hypospadias.

The differentiation of a single-cell zygote into a complex, multicellular organism is controlled in part through dynamic epigenetic modifications that regulate gene transcription during embryogenesis6. The genital tubercle forms as an undifferentiated outgrowth that sexually differentiates into an anlage for either the male penis or female clitoris at 8–14 weeks of gestation. During sexual differentiation, androgens in a male fetus signal fusion of the urethral folds that form what will become the urethra. DNA methylation is one epigenetic mechanism that could regulate the transcription of developmental genes. Methyl groups are added to cytosine-guanine (CpG) dinucleotides in regulatory regions of the genome and can affect the transcription of DNA without altering its sequence variation. Genomic methylation is a dynamic state that can be influenced by the underlying sequence variation and altered by environmental exposures. Studies have also shown that altered DNA methylation can be inherited through generations to affect health risk despite the complete epigenetic reprogramming of embryonic germ cells7.

Candidate gene approaches for hypospadias have focused on genes related to sex hormones8 and development of the genital tubercle9. However, genome-wide association studies (GWAS) of hypospadias do not replicate variation in these candidate genes at the genome-wide level (P < 5e−8), rather they have identified other genomic regions that do not fully explain hypospadias risk10. Environmental exposures and maternal metabolic traits have been widely assessed for association with hypospadias2,3,11,12, which may also affect DNA methylation. Thus, DNA methylation may serve as a functional link between the influence of genes and environment on human disease and is of particular interest for its role in regulating human development. One previous epigenome-wide association study (EWAS) by Choudhry et al. identified 14 CpGs associated with hypospadias (P < 1e−4) among a sample of 12 affected boys and 8 boys without hypospadias13. After correction for multiple testing, Choudhry et al. identified methylation within SCARB1 and MYBPH as significantly associated with hypospadias, but these findings have yet to be replicated in larger samples. Additionally, no study to date has used an epigenome-wide approach to identify CpGs differentially methylated in relationship to chordee or hypospadias severity.

We sought to characterize methylation among moderate to severe cases of hypospadias using an integrative analytic approach. We first conducted epigenome-wide association studies for hypospadias (case–control), chordee (case-only), and severity (case-only). Genetic determinants were estimated for CpGs suggestively associated with hypospadias characteristics. We then used genetic instrumental variable analysis for differentially methylated CpGs and genes within regions of differential methylation to characterize causal relationships in the etiology of multiple features of hypospadias development.

Materials and methods

Subjects

Our study population has been previously described5. Boys with second- or third-degree hypospadias and unaffected controls were recruited at Arkansas Children’s Hospital, a region where incidence of hypospadias is increasing14. Cases were identified as males undergoing hypospadias-corrective surgery and unaffected males were infants undergoing circumcision at 10.1 and 10.7 months of age, on average, respectively. All subjects reported non-Hispanic white ancestry. Hypospadias was classified based on degree of severity by one of three pediatric urologists at the institution: (1) second-degree hypospadias was defined as a urethral opening in the mid penile to subcoronal area, and (2) third-degree hypospadias was classified based on a urethral opening in the perineum to proximal penile area. For consistency with other reports of hypospadias, we refer to second-degree cases as moderate and third-degree cases as severe. The study protocol was approved by the University of Arkansas for Medical Sciences Institutional Review Board and carried out in accordance with all relevant guidelines and regulations. Informed consent was obtained from all parents and/or legal guardians.

As previously reported5, preputial tissue was collected to measure both DNA methylation and genetic variation. For all subjects preputial specimens were collected at the time of surgery and preserved in liquid nitrogen within 30 min from the time of harvest in the operating room. Tissue samples were stored at − 80 °C at the Arkansas Center for Birth Defects Research and Prevention DNA Bank. DNA was extracted from a 1–2 mm section of the frozen tissue shredded with a scalpel prior to processing for DNA isolation using the PerkinElmer Prepito Tissue10 kit according to the manufacturer’s protocol.

DNA methylation and quality control

DNA methylation of preputial tissue was measured at the Avera Institute for Human Genetics using the Illumina Infinium MethylationEPIC BeadChip (San Diego, CA). All data processing, analyses, and graphics were performed using R15. Quality control was performed using the minfi package16. Samples were filtered to have detection P > 0.01 and one imputed sex mismatch was excluded from the control group. Probes were filtered out that had < 3 beads in ≥ 5% of total samples, non-CpG probes, and SNP-associated probes. After quality control, methylation beta values were normalized using NOOB17, which performs background correction and dye-bias normalization. Cell type mixture of preputial tissue was estimated using the EpiDISH package18 which accounts for epithelial, fibroblast, and blood cell (B cells, NK cells, CD4 T cells, monocytes, neutrophils, and eosinophils) proportions in samples of mixed tissue types. To control for unmeasured batch effects in EWAS, we estimated surrogate variables (SVs) using the SmartSVA package19. Methylation principal components (mePCs) were estimated using the irlba package.

Genotyping and quality control

DNA concentration was determined by use of the Qubit fluorometer and the Qubit dsDNA BR Assay kit (Life Technologies). Genotyping was performed at the Avera Institute for Human Genetics on the Infinium Global Screening Array (Illumina, San Diego, CA). Quality control methods and imputation for these genotype data are previously reported5.

Epigenome-wide association analyses of hypospadias

We performed three EWAS using a hypospadias case–control study design and two case-only studies: (1) hypospadias cases compared to unaffected controls, (2) hypospadias cases with chordee compared to cases without chordee, and (3) severe hypospadias cases compared to moderate hypospadias cases. All EWAS were analyzed using the R package CpGassoc20. We assessed SVs, mePCs, estimated cell types, tissue sample age, and subject age as technical covariates and adjusted each EWAS model as appropriate to best control the genomic inflation. We corrected for multiple comparisons within each EWAS to identify suggestive associations using the false discovery rate (FDR). Due to the limited sample sizes of this rare outcome in each respective analysis, we selected CpGs from the larger hypospadias case–control EWAS with FDR < 0.2 (N = 91) and CpGs from the chordee and severity case-only EWAS with FDR < 0.15 (N = 45). We elected a lower FDR threshold in the case-only analyses to limit false positives that could be driven by smaller comparison groups. Each of the following functional assessments were additionally performed to validate CpGs suggestively associated with the development of hypospadias.

Assessment of previous CpGs associated with hypospadias

Within our hypospadias case–control EWAS, we assessed the association of 14 CpGs measured on the Illumina 450 k methylation array previously reported by Choudhry et al13. Since CpG sites on the older 450 k array are not entirely captured on the current EPIC array, a complete comparison between the two arrays is not possible. We therefore assessed the Choudhry CpGs for (1) presence on the EPIC array, (2) performance in our quality control pipeline, and (3) association with hypospadias.

Methylation quantitative trait loci analyses

Methylation at any particular CpG may be influenced by surrounding genetic variation. Methylation within regions identified by GWAS may functionally link reported genetic associations with a phenotype. Additionally, genetic variants predictive of methylation can also serve as instrumental variables and facilitate causal testing in epidemiologic studies. Therefore, to further validate our findings we performed chromosome-wide methylation quantitative trait loci (meQTL) analyses for CpGs suggestively associated with hypospadias. For statistical stability, we transformed normalized methylation beta values to methylation M values. For each CpG with suggestive association in an EWAS, we assessed the association of each single nucleotide polymorphism (SNP) on the same chromosome using SNPTEST v2.5.421 and adjusted meQTL models for mePCs. We used P < 1e−3 and r2 < 0.2 to create independent sets of meQTLs for each EWAS CpG.

Methylation-mediated genetic associations

Methylation within genetic regions previously associated with a phenotype may be causally related to that trait since genetic variation is fixed at conception and methylation fluctuates throughout life. To characterize the potential role of methylation in previously known genetic associations of hypospadias, we identified 22 GWAS regions as those reported by Geller et al.10. We then cross-referenced suggestively-associated CpGs to the 500 kb up- and downstream region of each GWAS index variant. For any suggestive CpG mapped to a GWAS region, we then assessed the linkage disequilibrium of our meQTL estimates with the Geller index variant using the 1,000 Genomes CEU reference panel.

Two-stage least squares Mendelian randomization

The influence of genetic variation on methylation can be leveraged for use as instrumental variables in the causal assessment of CpGs that are measured simultaneously with a health outcome, i.e., when methylation could be the cause or consequence of the phenotype. To assess a causal relationship of DNA methylation on hypospadias development, we performed Mendelian randomization using two-stage least squares Mendelian randomization (2SLS MR). This approach uses individual-level genotype data as instrumental variables and is appropriate for assessing causality in a one-sample Mendelian randomization study design22. For each CpG suggestively associated with hypospadias we used independent meQTLs as instrumental variables (IVs) to perform 2SLS MR using the R package sem. In the first step, we fitted regression values of methylation M values on a hypospadias outcome, as appropriate for the EWAS from which the CpG was identified, via dosage genotypes of the IVs. In the second step, the hypospadias outcome is estimated from the predicted methylation values from the first step. We defined a significant 2SLS MR test of causal relationship between a CpG and hypospadias outcome as P < 0.05.

Causal testing for gene expression related to hypospadias

DNA methylation is one type of epigenetic mechanism that regulates gene expression. We sought to characterize the genes within methylation regions identified in our EWAS and further perform causal assessments of those genes with hypospadias in independent datasets. We assessed gene transcripts that may be causally related to hypospadias development using independent publicly available genome-wide association summary data available in the MR-Base web application23 to perform two-sample Mendelian randomization. To identify transcripts that may plausibly be regulated by DNA methylation identified in our EWAS, we mapped all UCSC gene symbols within 500 kb up or downstream from the CpG genomic position (a cis regulatory region around a CpG). We then identified transcripts across all available tissue types in the Genotype-Tissue Expression (GTEx) project24. For each transcript a single expression-associated variant was then used as the IV to assess transcription related to hypospadias. Hypospadias outcome data were extracted from UK Biobank data available from Ben Elsworth at the MRC Integrative Epidemiology Unit in the University of Bristol. The UK Biobank summary data include 26 cases of balanic hypospadias (Q54.0), 10 cases of penile hypospadias (Q54.1), and 39 cases of unspecified hypospadias (Q54.9, i.e., hypospadias that does not conform to the former two classifications) based on ICD10 medical record coding. Genome-wide summary data were available separately for each hypospadias classification and cases were compared to non-cases among a total sample of 463,010 individuals of European ancestry. We defined a significant causal relationship as P < 0.05 for the Wald test comparing the IV effect size between the gene transcript and hypospadias.

Results

Hypospadias cases from Arkansas comprised 12 moderate cases with chordee, 24 moderate cases without chordee, and 9 severe cases with chordee (Table 1). There were 816,995 CpG sites that passed quality control and were normalized to assess for association with hypospadias. Methylation-estimated proportions of fibroblasts, epithelial cells, and neutrophils differed between preputial tissue from hypospadias cases and unaffected controls (P < 0.05, Supplementary Table S1), but did not differ within cases between chordee and severity. We identified inflation in models not adjusted for technical covariates. In the hypospadias EWAS we adjusted for SVs to correct for differences between cases and controls, estimated cell types, mePCs, subject age, and tissue age. In the chordee EWAS we adjusted for SVs, mePCs, and tissue age. In the severity EWAS we adjusted for SVs, mePCs, and subject age. In methylome-wide analyses we identified differential methylation at 10 CpGs between hypospadias cases and controls (FDR < 0.20), 9 CpGs differentially methylated with chordee (FDR < 0.15), and 6 CpGs differentially methylated by severity (FDR < 0.15) (Table 2). Within the case-only analyses, we observed more CpGs associated with chordee and severity at FDR < 0.2 than were associated with hypospadias in the larger case–control study, therefore we chose a more stringent suggestive FDR threshold (FDR < 0.15) to limit false positive findings in the smaller case-only comparison groups. We then related genetic variation to each suggestive CpG and identified 2–23 independent meQTLs on the same chromosome (Table 2). We used meQTLs as genetic instrumental variables in one-sample 2SLS MR and found support for a causal relationship with hypospadias for 21 of the 25 CpGs identified through our three EWAS studies (Table 2). We further assessed causal relationships of genes in cis to these methylation regions and mapped 259 genes measured in multiple tissues in GTEx. Two-sample MR with hypospadias in the UK Biobank indicated that 97 genes neighboring these methylation regions may be causally related to hypospadias (Supplementary Table S2, Figs. S1–S4).

Table 1 Characteristics of the hypospadias cases included in epigenome-wide association analyses, shown as n (%) by hypospadias severity and chordee presence.
Table 2 Results of the hypospadias case–control and case-only epigenome-wide association studies (EWAS) and two-stage least squares Mendelian randomization (2SLS MR) using methylation quantitative trait loci as instrumental variables for causal association of the CpG with hypospadias characteristics.

The strongest association of methylation with hypospadias was identified at VAV3 cg14436889, with 15.7% less methylation measured in cases compared to controls (P = 5.4 × 10–7). Similarly large effects were identified in case–control analyses for cg25196688 (TSPAN9-PRMT8-EFCAB4B), where methylation was 11.4% lower in cases compared to controls (P = 1.7 × 10–7). The largest difference in methylation between cases with and without chordee was identified at cg06484075 (WWC2-CLDN22-CDKN2AIP-ING2), where 4.0% lower methylation was measured in cases with chordee (P = 1.4 × 10–6). Higher methylation among cases with chordee was observed for GMFB cg07509211 than for cases without chordee (delta beta = 3.8%, P = 1.4 × 10–7). When comparing methylation levels by severity, the strongest effect observed was 6.7% higher methylation among severe cases at cg22788109, located between ENAH and SRP9 on chromosome 17 (P = 8.9 × 10–8).

We did not replicate association of any CpGs previously associated with hypospadias. Of the 14 CpGs reported by Choudhry et al., two CpGs were not present on the EPIC array, five CpGs did not pass our quality control procedures, and the remaining seven CpGs had an uncorrected P > 0.05 in our case–control analyses (Supplementary Table S3). To identify methylation as a potential mediator of genetic associations with hypospadias, we cross-referenced CpGs identified in our three EWAS to genetic regions identified in the Geller et al. GWAS10. We identified methylation at cg15242360 as associated with hypospadias severity, which is located in the DAAM2 region. However, the rs417096 index variant is not in linkage with meQTLs we identified for this CpG (1,000 Genomes CEU r2 ≤ 0.021; Supplementary Fig. S2) and we did not identify direct association of this variant with methylation at cg15242360 in our data (P = 0.07).

Discussion

Among a sample of 45 moderate to severe cases and 46 controls, we identified 25 DNA methylation sites with evidence of causal association with hypospadias. Additionally, we provide evidence that 97 genes plausibly regulated by these methylation regions may be causally related to hypospadias. Functional annotation of significant genes reveals these methylation regions may be related to hypospadias through altered signaling throughout the course of urogenital development, beginning with formation of the genital tubercle.

The genital tubercle forms from all three embryonic germ layers and first develops in a sex-independent outgrowth stage, then sexually differentiates dependent on androgen levels25. Most of the genital tubercle is composed of mesoderm that develops into erectile tissue and stroma, while the ectoderm develops into the glans penis and skin and the endoderm develops into the urethral epithelium25,26.Two genes related to germ layer differentiation were identified within regions of two CpGs differentially methylated with chordee (Fig. 1). Within the cis-flanking region of cg07509211, we identified significant causal relationship of hypospadias with expression of WDHD1, a WD repeat and HMG-box DNA binding protein that acts as a transcription factor in the mesodermal commitment pathway. We additionally identified DNM1L in the region of cg19445285. DNM1L is a dynamin GTPase involved in apoptosis during endodermal differentiation and related to beta-catenin signaling27. Notably, we identified evidence of causal association of both CpGs with the development of chordee within hypospadias cases, a curvature of the penis that results from skin tethering or fibrosis of other tissues on the ventral side of the penis. We also identified cg25196688 and expression of TULP3 (TUB like protein 3) as causally related to hypospadias. Sonic hedgehog signaling occurs in the urethral epithelium to control development of the genital tubercle28 and TULP3 is necessary for proper ciliary function at this stage29,30.

Figure 1
figure 1

DNA methylation associated with chordee in genic regions involved in differentiation of the germ layers. The x-axis corresponds to genomic position within a chromosome and the y-axis plots − log10 p-values for three sources of statistical testing: (1) a scatter plot for the epigenome-wide association study (EWAS) where the top-associated CpG is indicated by a central vertical line, (2) a single red diamond at the top-associated CpG for two-stage least squares regression Mendelian randomization (2SLS MR) for causal relationship with hypospadias, and (3) horizontal lines across the length of gene transcripts for their causal association with hypospadias where multiple lines represent each tissue type in GTEx. Gene names above the plotted transcripts indicate Wald MR P < 0.05 and gene names below the plotted transcripts indicate Wald MR P > 0.05. Graph was generated by the authors using R version 3.5.2 (https://www.R-project.org/)15.

Beta-catenin is a protein that contributes to cell development in the Wnt signaling pathway and is required at multiple stages of formation of the genital tubercle and during urethral development31,32. Animal models show that knockout of beta-catenin leads to severe hypospadias31 and that estrogen and androgen levels alter beta-catenin signaling to affect penile development33. We identified evidence that methylation at cg14906547 (hypospadias), cg07509211 (chordee), and cg01749436 (severity) are causally associated with hypospadias development and its characteristics (Fig. 2). Within these regions, we further identified that expression of genes regulating beta-catenin are related to hypospadias, including UBE2R234 (ubiquitin-conjugating enzyme), PKP235 (plakophilin), and TNKS36 (tankyrase).

Figure 2
figure 2

DNA methylation associated with characteristics of hypospadias in genic regions involved in beta-catenin signaling. The x-axis corresponds to genomic position within a chromosome and the y-axis plots − log10 p-values for three sources of statistical testing: (1) a scatter plot for the epigenome-wide association study (EWAS) where the top-associated CpG is indicated by a central vertical line, (2) a single red diamond at the top-associated CpG for two-stage least squares regression Mendelian randomization (2SLS MR) for causal relationship with hypospadias, and (3) horizontal lines across the length of gene transcripts for their causal association with hypospadias where multiple lines represent each tissue type in GTEx. Gene names above the plotted transcripts indicate Wald MR P < 0.05 and gene names below the plotted transcripts indicate Wald MR P > 0.05. Graph was generated by the authors using R version 3.5.2 (https://www.R-project.org/)15.

In addition to their effect on beta-catenin signaling, sex steroids regulate differentiation of the genital tubercle into the male penis. Hormonal disruptions during penile development can cause urethral and epithelial abnormalities that result in hypospadias and chordee37,38. The cytochrome P450 superfamily of enzymes are responsible for catalyzing the synthesis of sex steroids from cholesterol, fatty acids, and other lipids. We identified a significant causal effect of methylation at cg04714159 on hypospadias, as well as potential causal effects of multiple members of the cytochrome P450 family of enzymes in this region (CYP4A11, CYP4A22, CYP4B1, CYP4X1, CYP4Z2P; Fig. 3). We additionally identified a causal association of methylation at cg22788109 and expression of EPHX1 with hypospadias, which encodes an epoxide hydrolase involved in the cytochrome P450 metabolism of epoxide-containing fatty acids39. Additionally, these genes have been implicated in pre-eclampsia, a gestational blood pressure disorder that is consistently associated with hypospadias in epidemiologic studies40. CYP4A11, CYP4A22, and other cytochrome P450 isoforms are differentially expressed in preeclamptic placentas41 and EPHX1 mutations contribute to preeclampsia42.

Figure 3
figure 3

DNA methylation associated with characteristics of hypospadias in genic regions related to androgens. The x-axis corresponds to genomic position within a chromosome and the y-axis plots − log10 p-values for three sources of statistical testing: (1) a scatter plot for the epigenome-wide association study (EWAS) where the top-associated CpGs are each indicated by a vertical line, (2) a single red diamond at each top-associated CpG for two-stage least squares regression Mendelian randomization (2SLS MR) for causal relationship with hypospadias, and 3) horizontal lines across the length of gene transcripts for their causal association with hypospadias where multiple lines represent each tissue type in GTEx. Gene names above the plotted transcripts indicate Wald MR P < 0.05 and gene names below the plotted transcripts indicate Wald MR P > 0.05. Two CpGs in the same region are depicted in the third panel, cg26638975 from the hypospadias case–control EWAS (dark brown scatter points) and cg15014976 from the chordee case-only EWAS (light brown scatter points). Graph was generated by the authors using R version 3.5.2 (https://www.R-project.org/)15.

Further support for the involvement of sex steroids in hypospadias development is evidenced by two likely causal associations of cg26638975 and cg15014976 with hypospadias and chordee, respectively (Fig. 3). This region on chromosome 19 is characterized by clusters of sialic acid-binding Ig-like lectin (SIGLEC) and kallikrein (KLK) genes, among which we identified likely causal relationships with hypospadias (CD33/SIGLEC3, SIGLEC5, SIGLEC7, KLK5, KLK7, KLK10, KLK13, and KLK14). The KLK family is involved in the kinin system, a hormonal pathway that regulates blood pressure, and KLK expression is upregulated by sex steroids43,44. The two methylation signals observed in the SIGLEC-KLK region are not strongly correlated (Supplementary Fig. S1) and have independent meQTLs (1,000 Genomes CEU r2 ≤ 0.022; Supplementary Fig. S2), thus cg26638975 and cg15014976 may represent two independent regulatory mechanisms in this region. An additional gene of interest in this region is SIGLEC6 that binds the obesity-associated leptin molecule. Expression of SIGLEC6 is consistently upregulated in the placentas of women with preeclampsia45,46,47,48,49, as well as with the presence of other placental abnormalities50. However, SIGLEC6 had a single tissue-specific eQTL estimate in GTEx that was not present in the UK Biobank, therefore we were not able to test a causal relationship with hypospadias.

We additionally identified methylation regions associated with hypospadias where gene transcripts are related to reproductive traits (Fig. 4). We identified cg25918138, cg25196688, cg05045951, and cg24241688 as causally related to hypospadias and its characteristics. Genes in these regions related to male fertility include ACAA1 (acetyl-CoA acyltransferase 1)51,52, PLCD1 (phospholipase C delta 1)53, EFCAB4B (CRACR2A; calcium release activated channel regulator 2A)54, GMCL1 (spermatogenesis associated germ cell-less protein expression decreased in male infertility)55, and MKRN2 (makorin ring finger protein 2)56. DNM1L, discussed previously for implications in germ cell differentiation, is also differentially expressed in sperm with poor motility57, and expression of multiple members of the KLK family are associated with measures of sperm quality58. Additionally, TEAD4 and TSPAN9 in the cis region of cg25196688 is associated with fallopian tube disease59. While infertility may be a long-term consequence among males with hypospadias, we assessed causal direction in these regions and both methylation and gene expression are suggested to be causally related to hypospadias. We postulate that methylation at these CpG sites may be inherited from the parents and reflect fertility of the parents of hypospadias cases. In fact, hypospadias is one type of birth defect more prevalent among families seeking fertility treatments60,61.

Figure 4
figure 4

DNA methylation associated with characteristics of hypospadias in genic regions related to reproductive traits. The x-axis corresponds to genomic position within a chromosome and the y-axis plots − log10 p-values for three sources of statistical testing: (1) a scatter plot for the epigenome-wide association study (EWAS) where the top-associated CpG is indicated by a central vertical line, (2) a single red diamond at the top-associated CpG for two-stage least squares regression Mendelian randomization (2SLS MR) for causal relationship with hypospadias, and (3) horizontal lines across the length of gene transcripts for their causal association with hypospadias where multiple lines represent each tissue type in GTEx. Gene names above the plotted transcripts indicate Wald MR P < 0.05 and gene names below the plotted transcripts indicate Wald MR P > 0.05. Graph was generated by the authors using R version 3.5.2 (https://www.R-project.org/)15.

We were unable to replicate association of 14 CpGs previously reported as associated with hypospadias13. Differences in our findings could be due to hypospadias severity (milder cases in Choudhry) or age at preputial tissue collection (an older median 26 months of age in Choudhry), however, the previous EWAS did not report any quality control of the methylation array, normalization, and control for potential batch effects that have become standard procedure for analysis of methylation array data. We additionally assessed whether methylation could mediate previous hypospadias GWAS regions. We did not identify methylation at cg15242360 to mediate the DAAM2 rs417096 index variant reported in the Geller et al. GWAS10. However, we identified causal support for cg15242360 and two gene transcripts in the development of hypospadias. KIF6 is a kinesin motor protein ubiquitously expressed in coronary and vascular tissues. MOCS1 is involved in the synthesis of molybdenum cofactor, which metabolizes water-soluble vitamins and other cofactors. Autosomal recessive MOCS1 mutation can result in an inborn error of metabolism. A case report a female 46,XY patient with testicular dysgenesis identified a large inherited duplication of the DAAM2-MOCS1 region and notably this proband is reported to have a brother with penoscrotal hypospadias62.

A limitation of this study is the dynamic nature of methylation and gene expression levels, particularly as they regulate human development. We measured methylation levels in the preputial tissue of boys at 10 months of age, on average. We also assessed gene expression levels across all tissues in GTEx, for which the samples comprise a variety of ages. Accessing the relevant tissue types at the developmentally-appropriate time period of urethral development in humans is not feasible. Therefore, we are limited to identify methylation or expression levels that remain altered after urethral development and are detectable across multiple or related tissue types. Further, methylation may itself be altered by adverse health conditions, which has been supported by causal testing in multiple cross-sectional EWAS63. However, young boys with hypospadias have limited health consequences from their condition, therefore methylation differences are unlikely to be a result of the hypospadias. Also, within the data available, we are not able to functionally relate DNA methylation to gene transcription. Despite these limitations, our study measured methylation in preputial tissue and achieved the greatest statistical power to date to relate methylation to development of this relatively rare health outcome. Our study had adequate statistical power to detect relatively large effect sizes (1.8–15.7% difference in methylation beta values) at sites that remained differentially methylated after embryo development. Through the analysis of DNA methylation among moderate to severe hypospadias cases and gene expression in independent samples, we have identified genetic mechanisms that may be involved in early penile development. Methylation and gene transcripts in these regions should be considered for future functional assessments of human development and morphology.