Background

The improvement of reproductive efficiency is of major economic importance in beef production systems, and total lifetime productivity, which is a function of both reproduction and total output per cow, is a key metric of efficiency [1, 2]. In South Africa (SA), the majority of beef is produced under extensive production systems with average calving percentages (i.e., the proportion of cows that give birth in comparison to the number of cows that could possibly give birth) of 62% [3]. Bonsmara cattle are a beef breed, developed during the nineteen-sixties, resulting in a composite comprised of approximately 5/8 Afrikaner and 3/8 exotic breeds [4]. It is the largest breed represented in the seed stock and commercial beef industry in SA. As the beef industry primarily relies on reproductive efficiency, a reduction in the unproductive period of a female’s life would reduce production costs, improve profit and have a favourable impact on the carbon footprint.

Genetic improvement in reproductive performance remains challenging, hindered by the generally low associated heritability (h2), coupled with the complexity of recording these traits and/or the late expression of reproductive traits [5,6,7]. The age at onset of puberty can impact overall productivity through the heifer becoming productive at an earlier age [8]. Puberty is experienced earlier in composite and crossbred animals compared to their purebred counterparts with the same trend observed for early maturing versus late maturing beef breeds [9]. Age of puberty is however difficult to record, and therefore age at first calving (AFC) is often used as an indicator of heifer fertility. The heritability estimates reported for AFC differ depending on the breed, with [10] estimating a h2 = 0.10 in Brahman cattle, [8, 11, 12] reporting on Nellore cattle (0.01 to 0.31) and [13] estimating a h2 = 0.08 for AFC in Angus and Hereford cattle. Recent studies on South African populations are limited with [14] reporting a h2 = 0.08 in Bonsmara cross cattle, while [5] reported higher heritability for AFC in Afrikaner (0.27) and Drakensberger (0.30) Sanga cattle breeds. Inter-calving period (ICP) is a relatively easy trait to record and is included in most beef and dairy breed societies recording schemes in South Africa [15]. Estimates of ICP yield low to moderate heritability (0.01–0.10) in Brahman and composite beef cattle breeds [6, 10, 16]. As scrotal circumference (SC) in bulls is relatively easy to measure, with a moderate to high heritability [6, 12, 17,18,19], it has been suggested as an indicator trait for age at puberty, the latter being resource-intensive to measure.

Several studies have reported positive genetic correlations between SC and growth traits like mature body weight (rg = 0.37–0.40) in composite beef cattle [20], weaning and mature weight (rg = 0.60–0.72) in Bos indicus cattle [21] as well as weaning (rg = 0.312) and yearling (rg = 0.519) weight in Nellore cattle [22]. Although weaker genetic correlations between SC and weaning weight (rg = 0.15) have been reported in SA Bonsmara bulls [23], genomic analyses of SC and SC-related traits have identified genes which are also known to associate with growth traits [24].

Genome-wide association studies (GWAS) conducted on European and tropically adapted beef cattle breeds revealed several potential genes for fertility traits including AFC [8, 12, 25, 26], ICP [27], pregnancy status [28], gestation length [29], sexual precocity [30] and SC [12, 31, 32]. Some studies have combined fertility traits with body weight at puberty [33], while multi-trait meta-analyses have also been applied [12]; the latter indicated that similar regions of the genome harbour genetic variation that potentially influences reproductive traits in both sexes.

The SA Bonsmara, classified as a Sanga type, is a unique composite breed of 3/8 exotic (Milk Shorthorn, Hereford) and 5/8 Afrikaner [4]. The breed was established through a well-documented crossbreeding program, with the aim of founding a local composite breed that was well adapted to the challenges of a diverse SA climate. This study was the first attempt to apply GWAS for fertility traits in SA Bonsmara cattle to provide insight on gene regions in the SA Bonsmara. The objective of this study was to perform a genome-wide association study for three reproductive traits (AFC, ICP and SC) in a South African Bonsmara population in order to identify quantitative trait loci (QTL) for these traits.

Results

Variance components estimation

A genetic correlation of 0.37 was estimated between AFC and ICP in the SA Bonsmara using the breeding values derived from the bivariate model. Pedigree heritability estimates were 0.22 for AFC, 0.13 for ICP and 0.38 for SC. The genomic REML analysis yielded genomic heritability and standard errors (SE) of 0.183 (SE = 0.021) for AFC, 0.207 (SE = 0.022) for ICP and 0.209 (SE = 0.019) for SC.

Genomic population quality control

Individual based quality control resulted in the omission of ninety-five SA Bonsmara genotypes across the five genotyping arrays available. Assessment of identical by state (IBS) genetic distances yielded a multidimensional scaled plot (MDS; Fig. 1). Identification of outliers as well as their genotyped progeny led to the further removal of 103 genotypes (Fig. 2) resulting in a final sample population of 7,128 SA Bonsmara animals (4,403 males, 2,725 females). The effect of filtering out animals with reliabilities smaller than 0.01 and an effective record count (ERC) of less than 0.50 culminated in genome-wide population sizes of 4,460 animals for AFC (median ERC = 1.758), 4,276 animals (median ERC = 1.717) for ICP1, and 5,452 animals for SC (median ERC = 1.004).

Fig. 1 Multidimensional Scaling of 7,231 SA Bonsmara Genotypes

Fig. 2 Multidimensional Scaling of 7,128 SA Bonsmara Genotypes

Age at first calving

Eleven single nucleotide polymorphisms (SNPs) at the genome-wide threshold p-value of less than 4 × 10⁻⁷ and a further 16 suggestive SNPs at a p-value of less than 4 × 10⁻⁶ were associated with AFC (Fig. 3). Pair-wise linkage disequilibrium (LD) analysis of significant SNPs resulted in the identification of 11 QTL across eight autosomes (Table 1). A total of 11 different genes were co-located with the associated QTL, with major allele frequencies ranging from 0.527 to 0.982. Bos taurus autosome (BTA) 7 harboured two QTL containing nine genes (ARAP3, CLINT1, FCHSD1, LSM11, PCDHGA, PCDHGB, PCDHGC, RELL2 and THG1L), with each of these QTL having seven SNPs in pair-wise LD. The most significant QTL, located on BTA 17, was a single intergenic SNP with a minor allele frequency (MAF) of 0.13. The gene PLCB1 resides in a QTL on BTA 13 that spans 212.17 kilobase pairs (kbp) and four SNPs in pair-wise LD (r2 > 0.50), with the significant SNP (BovineHD1300000449; MAF = 0.15) associating with AFC.

Fig. 3 Significant SNPs above the red line (≤ 4 × 10⁻⁷) for Age at First Calving (4,460 animals)

Table 1 List of identified QTLs and genes associated with Age at First Calving

Inter-calving period

Eleven SNPs were significantly associated with ICP, with a further 23 SNPs being suggestively associated (Fig. 4). The most significant SNP (BTA-16,045-no-rs; P = 2.83 × 10⁻¹⁶, MAF = 0.13) was on BTA 17 and was an intergenic SNP. A total of ten QTL across eight autosomes were significantly associated with ICP (Table 2). A QTL 337.43 kbp in length containing six genes (ARAP3, FCHSD1, PCDHGA, PCDHGB, PCDHGC and RELL2) and a second QTL of 158.19 kbp in length harbouring three genes (CLINT1, LSM11 and THG1L) were both positioned on BTA 7. A QTL containing the LDAH gene is 131.85 kbp long and contains two SNPs in high LD (r2 = 0.74).

Fig. 4 Significant SNPs above the red line (≤ 4 × 10⁻⁷) for Inter-Calving Period (4,276 animals)

Table 2 List of identified QTLs and genes associated with Inter Calving Period

Scrotal circumference

Forty-four SNPs were significantly (P ≤ 4 × 10⁻⁷) associated with SC, with a further 51 SNPs being suggestive (P ≤ 4 × 10⁻⁶; Fig. 5). Pair-wise LD analysis identified a total of 41 QTL across 14 autosomes (Table 3). The most significant SNP (BovineHD2300009170; P = 5.02 × 10⁻¹¹, MAF = 0.04), which segregates at a low frequency in the SA Bonsmara population, was an intron variant of SLC17A3 located on BTA 23. A QTL on BTA 11 consisting of eleven SNPs in LD (r2 > 0.50) comprised downstream variants of the gene NEURL1B. Gene-rich QTL were located on BTA 2, 11, 19 and 22. A 93.2 kbp QTL on BTA 2 consisting of six SNPs harboured the gene ABCA12, a small nucleolar RNA (RF00156) and an insertion/deletion copy number variant (CNV). Five QTL spanning a 1.65 megabase pair (Mbp) portion of BTA 11 house the genes AAK1, ANTXR1, CNRIP1, GMCL1, MXD1, PPP3R1 and SNRNP27. A homeobox gene-dense QTL on BTA 19 spanning 171.78 kbp consisted of four HOXB gene variants and TTLL6. On BTA 22, a significant SNP (Hapmap33950-BES3_Contig483_1359; P = 4.02 × 10⁻⁹, MAF = 0.425) in moderate LD with two flanking SNPs (downstream r2 = 0.524, upstream r2 = 0.527) defined a QTL that includes the genes ABHD6, PXK and RPP14. The QTL with the most genes, namely ALDH1L1, CHST13, C22H3orf22, SLC41A3, TXNRD3, UROC1 and ZXDC, comprised five SNPs over a 137.92 kbp DNA region on BTA 22.

Fig. 5 Significant SNPs above the red line (≤ 4 × 10⁻⁷) for Scrotal Circumference (5,452 animals)

Overlapping genes across traits

In this study, three QTL, two on BTA 7 and one on BTA 13, were detected to be associated with more than one reproductive trait. The QTL on BTA 13 contains four SNPs in LD, with the SNP (BovineHD1300000449) being significantly associated with both AFC (P = 5.13 × 10⁻¹¹) and ICP (P = 5.13 × 10⁻⁸). Two large aforementioned QTL located on BTA 7 are both significantly associated with AFC and ICP. The first QTL was significantly associated with both AFC (P = 1.15 × 10⁻⁷) and ICP (P = 2.98 × 10⁻⁷), while the SNP (ARS-BFGL-NGS-11368), located in this same QTL, was significantly associated with ICP (P = 2.98 × 10⁻⁷). The second QTL, with seven SNPs in LD with each other, was identified to be significantly linked to both AFC (P = 1.15 × 10⁻⁷) and ICP (P = 5.30 × 10⁻⁸).

Discussion

This study was the first attempt to gain insight into the underlying genetic mechanisms for reproductive traits in a South African beef cattle breed. The pedigree and genomic based heritability estimates for AFC, ICP and SC were similar to those reported in other studies on beef cattle breeds [6, 8, 10,11,12,13,14, 17,18,19]. The low to moderate heritability estimates for most reproductive traits and the limited number of genomic studies on indigenous breeds in Southern Africa [34,35,36] justify the investigation undertaken in this study. Genomic tools hold the potential to unlock invaluable information [37] that may help explain physiological conditions that currently remain unexplained [38].

Age at first calving

In this study, a total of eleven significant SNPs associated with AFC were observed; none of these have been previously reported in Bos indicus and crossbred beef cattle to be associated with AFC. Of the eleven genes and two CNVs located in QTL associated with AFC, ten of these genes have not been previously associated with AFC or sexual precocity in beef cattle populations but have been reported in human [39] and sheep [40] populations. Four of the genes observed in this study (ENSBTAG00000032764, FCHSD1, PLCB1 and RELL2) have been previously associated in beef cattle populations [41,42,43] with two of these genes involved in more than one trait. Physical growth and body composition have a developmental effect on the reproductive organs which would determine the onset of puberty and subsequent AFC [44, 45]. ENSBTAG00000032764 (pseudogene) was previously reported as a candidate gene associated with carcass traits [41]. The resultant translated protein facilitates the storage of iron in a soluble, non-toxic form, which is an essential component for iron homeostasis.

Inter-calving period

The inter-calving period, which is the period between two calvings, has the lowest genetic and genomic heritability of all three traits in this study and, in practice, is a function of the ability to ovulate post-calving [45], express oestrus, establish and maintain pregnancy and gestation length [29, 46]. A total of eleven genes and three CNVs were located in QTL associated with ICP, which included ten genes not previously associated with ICP or reproduction in beef cattle. The 337.43 kbp QTL on BTA 7 contains three genes (PCDHGA, PCDHGB, PCDHGC) from the protocadherin (PCDH) family group. Studies have shown the PCDH genes play an integral role in ovarian follicular [47] and embryonic [48] development. LDAH is a lipid droplet-associated hydrolase that is essential in lipid storage and was previously associated with feet and leg disorders in Danish Holstein cattle [49].

Scrotal circumference

Pair-wise LD analysis identified a total of 41 QTL across 14 autosomes (Table 3). Genes not previously reported in beef cattle populations include AAK1, ACAD9, CHST13, CNRIP1, CRISP1, DERA, KIF2A, PARVB, PNPO, PXK, RPP14, SKAP1, SLC41A3, SNRNP27, SSBP2, TCOM5B, TJP2, TTLL6, UBE2Z, UROC1 and ZXDC. Most of these genes have not been previously reported to be associated with SC or sperm-related traits, but the biological pathways in which they are involved suggest that they may play some role in the expression of fertility-related traits. Biological pathways discussed below include carbohydrate catabolic processes (CHST13, DERA and UROC1), cellular development (HOXB5, HOXB7, HOXB9, HOXB13 and TTL13), lipid metabolism (ABCA12, ABHD6, ACAD9 and PARVB), immune response (SKAP1) and regulation of DNA transcription and RNA translation (CDKAL1, E2F3, MXD1, RPP14, SNRNP27, SSBP2 and ZXDC).

ABCA12 mediates lipid transporter activity, signalling receptor binding, and active trans-membrane transporter activity [50, 51]. ABCA12 was identified as the major gene influencing Harlequin Ichthyosis in humans and was later identified in livestock species. ABCA12 was associated with birth weight in Holstein cattle [52]. Monoacylglycerol lipase (ABHD6) is a lipase and the major enzyme for the catabolism of bis(monoacylglycero)phosphate (BMP), a lipid that plays a key role in the formation of intraluminal vesicles and in lipid sorting [53]. Although not previously associated with SC or fertility traits in cattle breeds, ABHD6 was associated with average daily gain in crossbred beef cattle [54] and was identified as a positional candidate gene for milk cholesterol in dairy cattle [55]. ACAD9, which catalyses the rate-limiting step during the beta-oxidation of fatty acyl-CoA, was identified as a CNV in Italian sheep [56] and was found to be significantly associated with intramuscular fat content in Large White pigs [57]. Parvin beta (PARVB) is essential in establishing and/or maintaining cell polarity as well as actin cytoskeleton reorganization. Although not previously reported in beef cattle, a study on ketosis in German Holstein cattle [58] associated PARVB with non-alcoholic fatty liver disease, indicating that PARVB could be a part of lipid metabolism. An investigation into selection signatures in Indian swamp buffaloes also identified PARVB [59].

Two genes previously reported to be linked to fertility traits in cattle include FXN and GMCL1. Frataxin (FXN) promotes the biosynthesis of heme and plays a role in the protection against iron-catalysed oxidative stress. FXN was the second ranking SNP in a logistic regression analysis of pregnancy status in Santa Gertrudis cows [60]. GMCL1 appears to be located in an extended selection signature that shows high haplotypic homozygosity [61] and was previously associated with fertility traits in beef cattle breeds [62]. Single Stranded Binding Protein (SSBP2), located on BTA 7, is a candidate QTL for conferring resistance to Johne’s disease in cattle [63, 64]. SSBP2 is involved in transcription by RNA polymerase II and in its positive regulation. A study in Holstein cattle associated SSBP2 with body conformation traits [65]. PPP3R1 is a regulatory subunit of calcineurin, which plays a role in neuronal calcium signalling. A study on transcriptome profiling of muscle in Nelore cattle [66] identified that PPP3R1 participates in the mitogen-activated protein kinase (MAPK) signalling pathway, which is responsible for cell proliferation, differentiation, and apoptosis.

A range of genes identified to be associated with SC have not been previously identified in cattle, but have been in water buffaloes and crossbred buffaloes. ALDH1L1 [67], TJP2 [67] and TCOM5B [68] were all linked to milk composition traits, more specifically fat and protein yield. The QTL containing ALDH1L1, CHST13, SLC41A3, TXNRD3, UROC1 and ZXDC on BTA 22 was previously reported to be associated with somatic cell score in Holstein cattle [69]. The TXNRD3 gene is known to affect adipocyte differentiation through the Wnt signalling pathway [70].

On BTA 19, an SC-associated QTL harbours four homeobox genes and TTLL6. TTLL6 is a gene in a strong candidate region, which included HOXB7 and HOXB9, for controlling skeletal tail length in sheep [71]. The four HOXB genes in this QTL have not been previously reported in beef cattle, but other HOX gene clusters are known to affect sperm quality in humans [72]. It is known that poor sperm DNA methylation is associated with decreased male fertility and low embryo quality. The hypomethylation of microRNA and HOX gene clusters plays a significant role in embryonic development and is evidence of the sperm’s epigenetic contribution. KIF2A is known to modulate mitotic events during spermatogenesis [73].

A limited number of studies have considered SC as a direct trait of interest for genomic investigations. Reviews of literature on bull fertility highlight the biological processes associated with sperm quality, motility, and scrotal volume [24, 74, 75]. More recent association studies revolve around sexual precocity, especially in tropical cattle [32, 74, 76,77,78,79] located in Central and Southern America.

Overlapping genes across traits

Of the genes observed to be significantly associated with both AFC and ICP in the present study, PLCB1 was identified to be linked to stay-ability in Nelore cattle [80]. PLCB1 is known to be a target of a microRNA (miR-301b), which has been associated with ovarian follicle development in cattle breeds [81]. Puberty in a heifer occurs upon ovulation of a potentially fertile oocyte [44], while [82, 83] stress the importance of proper nutrition postpartum in order to re-establish ovarian activity for a shortened ICP. PLCB1 is an enzyme that hydrolyses phospholipids into fatty acids as well as other lipophilic molecules and is involved in oxidative stress responses [84]. The regulation of adipose tissue affects the metabolic hormone leptin, known to regulate reproductive function in female animals [85, 86]. Twelve haplotype blocks for PLCB1 were identified through an association analysis of carcass traits in Hanwoo cattle [42], and gene ontology analysis linked this gene to lipid metabolism. ARAP3, a GTPase-activating protein, and the multiple genes that are members of the PCDH family group have been reported as a selection signature in cattle related to immune response [87]. Although immunological studies in livestock species are limited, [88] reviewed the effect immune cells have on ovarian follicle development and the establishment of pregnancy.

FCHSD1 and RELL2 are located in a long run of homozygosity (ROH) detected in multiple Alpine-based dual-purpose breeds. These two genes are involved in the MAPK14/p38 cascade as well as apoptosis. Clathrin interactor 1 (CLINT1) plays a major role in the formation of coated vesicles. This gene was associated with milk yield, fat yield and percentage as well as protein yield and percentage in dairy cattle [84], while it was linked to milk fat content in Simmental cows [43]. LSM11 and THG1L have not been previously reported in cattle breeds but have been associated with milk protein yield and milk protein percentages in Valle del Belice dairy sheep [40]. LSM11 is a small nuclear RNA-associated protein that processes the mRNA 3’-end prior to translation, while THG1L is involved in the regulation of tRNA processing during translation.

The number of overlapping genes co-located in QTL shared by AFC and ICP, alongside the moderate genetic correlation (0.37) in this study, indicates fertility is initiated, regulated and maintained by pleiotropic genetic mechanisms. Multiple genes in this study have no obvious direct link with fertility traits and this further demonstrates the complexity of genetic mechanisms for traits such as AFC and ICP.

Conclusion

In this study, numerous genes (ARAP3, CLINT1, FCHSD1, LSM11, PLCB1, RELL2 and THG1L) were co-located in QTL that had a significant or suggestive association with both AFC and ICP. Numerous QTL were identified across 14 autosomes for SC, the majority of which had never been previously reported to be linked to reproductive traits. The identification of different genes with similar molecular and biological characteristics for these sex-limited traits reaffirms our understanding that these lowly heritable traits are influenced by many genes, each contributing a small amount to the variation in these traits’ expression. Some genes related to carbohydrate catabolic processes, cellular development, iron homeostasis, lipid metabolism and storage, immune response, ovarian follicle development and the regulation of DNA transcription and RNA translation were identified as candidate genes for reproductive traits in SA Bonsmara cattle.

Methods

Genotypic data

Genotypes from 7,326 SA Bonsmara animals, each originating from one of five genotype arrays, were available. A total of 1,950 animals were genotyped on the GeneSeek Genomic Profiler (GGP) 150K (140,113 SNPs), 597 animals on the GGP 80K (76,883 SNPs), 2,625 animals on the Versa 50K (49,855 SNPs), 1,326 animals on the SASB 50K (54,394 SNPs) and the remaining 828 animals on the ICBF IDB v.2 platform (52,445 SNPs). Only autosomal SNPs with a known base pair position, a call rate ≥ 0.90 and a MAF ≥ 0.10 that did not significantly deviate from Hardy-Weinberg equilibrium (p > 0.001) were retained. All SNP locations were based on the UMD 3.1 genome build (GCF_000003055.6; [89]). Only animals with a call rate of > 90% were retained, while individuals with ≥ 0.95 identical genotypes were discarded, as were families with more than 10% Mendelian errors. Quality control of SNP data was carried out using PLINK v.1.9 [90].
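The SNP-level filters described above can be illustrated with a minimal sketch. It is not the PLINK implementation used in this study: the function names are hypothetical, genotypes are assumed to be coded as allele dosages (0, 1, 2, with None for a missing call), and the Hardy-Weinberg check uses a simple chi-square statistic with the 1 degree-of-freedom critical value for p = 0.001 (about 10.83), which may differ from the test applied by PLINK.

```python
def snp_qc_stats(genotypes):
    """Return (call_rate, maf, hwe_chi2) for one SNP.

    genotypes: allele dosages 0, 1 or 2, with None for a missing call.
    """
    called = [g for g in genotypes if g is not None]
    n = len(called)
    if n == 0:
        return 0.0, 0.0, 0.0
    call_rate = n / len(genotypes)
    n_aa, n_ab, n_bb = called.count(0), called.count(1), called.count(2)
    p = (2 * n_bb + n_ab) / (2 * n)  # frequency of the counted allele
    q = 1 - p
    maf = min(p, q)
    # Expected genotype counts under Hardy-Weinberg equilibrium
    expected = (n * q * q, 2 * n * p * q, n * p * p)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    return call_rate, maf, chi2


def passes_qc(genotypes, min_call_rate=0.90, min_maf=0.10, max_chi2=10.83):
    """Apply the call rate, MAF and (approximate) HWE thresholds to one SNP."""
    call_rate, maf, chi2 = snp_qc_stats(genotypes)
    return call_rate >= min_call_rate and maf >= min_maf and chi2 <= max_chi2


balanced = [0] * 25 + [1] * 50 + [2] * 25   # in exact HWE, MAF = 0.5
rare = [0] * 98 + [2] * 2                   # MAF = 0.02, fails the MAF filter
print(passes_qc(balanced), passes_qc(rare))
```

In practice these thresholds would be applied by the genotype software itself; the sketch only makes the arithmetic behind each filter explicit.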

Population stratification

Identical by state genetic distances between animals were computed through a MDS analysis with PLINK v.1.9 [90]. The analysis involved a total of 7,231 SA Bonsmara genotypes at a density of 24,216 SNPs that are truly genotyped across all five arrays. Visualisation of the data, reduced into two dimensions, allowed for the detection of possible population stratification as well as outliers. The remaining SA Bonsmara genotypes (4,403 males, 2,725 females) were imputed to 120,692 SNPs using FImpute v.3 [91].
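As an illustration of the distance measure underlying the MDS analysis, the sketch below computes pairwise identical-by-state (IBS) distances from allele dosages. It assumes genotypes coded 0, 1 or 2 (None for missing) and defines the distance as one minus the mean proportion of alleles shared; the function names are hypothetical and the scaling step itself, which PLINK performed in this study, is not reproduced here.

```python
def ibs_distance(g1, g2):
    """One minus the mean proportion of alleles shared identical by state."""
    pairs = [(a, b) for a, b in zip(g1, g2) if a is not None and b is not None]
    if not pairs:
        raise ValueError("no SNPs genotyped in both animals")
    # Each SNP contributes 0, 1 or 2 shared alleles
    shared = sum(2 - abs(a - b) for a, b in pairs)
    return 1 - shared / (2 * len(pairs))


def ibs_distance_matrix(genotypes):
    """Symmetric matrix of pairwise IBS distances (one row per animal)."""
    n = len(genotypes)
    return [[ibs_distance(genotypes[i], genotypes[j]) for j in range(n)]
            for i in range(n)]


animals = [
    [0, 1, 2, 2],
    [1, 1, 2, 0],
    [0, 1, 2, 2],
]
for row in ibs_distance_matrix(animals):
    print(row)
```

The resulting matrix is the kind of input that a multidimensional scaling routine reduces to two dimensions for plotting.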

Phenotypic data

The SA Bonsmara minimum breed standards [92] indicate that a heifer must calve before 39 months of age and the first ICP cannot exceed 790 days. SA Bonsmara animals occur throughout all nine of South Africa’s provinces and are mainly raised in extensive natural pasture systems. The recording of weaning weight (205-day weight) is compulsory and facilitates the selection of bulls for post-weaning growth tests. Scrotal circumference is measured on bulls participating in central and farm-based growth tests at around 12 to 18 months of age. Standardised phenotypes for AFC (days), first ICP (days), and SC (millimetres) were available on 347,749 records for AFC, 206,505 records for ICP and 238,454 records for SC in individual SA Bonsmara animals from the LOGIX Genetic Evaluation System [93]. This was accompanied by pedigree information on 2,135,235 animals dating back to 01 June 1949, as well as data on the contributing systematic environmental effects associated with these traits.

Deregression of breeding values

In order to predict estimated breeding values, a bivariate animal linear model for AFC and ICP and a univariate animal linear model for SC were defined as follows;

$$\mathbf{y} = \mathbf{Xb} + \mathbf{Zu} + \mathbf{e}$$

where,

y is the vector of phenotypes for AFC, ICP and SC;

b is a vector of fixed effects which include sex, herd, birth month and year, age in days at measurement of the phenotype covariate (linear regression);

u is a vector representing the direct additive-genetic effects, with \(\mathbf{u} \sim N(\mathbf{0}, \mathbf{A}\sigma_{u}^{2})\), where A is the pedigree-based matrix and \(\sigma_{u}^{2}\) is the direct genetic variance;

e represents the residual, where \(\mathbf{e} \sim N(\mathbf{0}, \mathbf{I}\sigma_{e}^{2})\), with \(\sigma_{e}^{2}\) representing the residual variance and I the identity matrix;

X and Z are incidence matrices for b and u respectively.
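To make the structure of this animal model concrete, the following sketch solves Henderson's mixed model equations for a toy univariate data set. It is only an illustration under stated assumptions: the three animals are treated as unrelated (so the inverse of A is an identity matrix), the only fixed effect is an overall mean, the variance ratio λ = σe²/σu² is set to 1, and all function names are hypothetical. The analyses in this study were run in VCE and MiX99, not with this code.

```python
def transpose(m):
    return [list(r) for r in zip(*m)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def solve(a, rhs):
    """Solve a linear system by Gaussian elimination with partial pivoting."""
    n = len(a)
    aug = [list(a[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (aug[i][n] - sum(aug[i][j] * x[j] for j in range(i + 1, n))) / aug[i][i]
    return x


def solve_mme(y, X, Z, A_inv, lam):
    """Return (b_hat, u_hat) from the mixed model equations.

    lam is the variance ratio sigma_e^2 / sigma_u^2.
    """
    Xt, Zt = transpose(X), transpose(Z)
    XtX, XtZ = matmul(Xt, X), matmul(Xt, Z)
    ZtX, ZtZ = matmul(Zt, X), matmul(Zt, Z)
    p, q = len(XtX), len(ZtZ)
    ZtZ_A = [[ZtZ[i][j] + lam * A_inv[i][j] for j in range(q)] for i in range(q)]
    lhs = [XtX[i] + XtZ[i] for i in range(p)] + [ZtX[i] + ZtZ_A[i] for i in range(q)]
    ycol = [[v] for v in y]
    rhs = [r[0] for r in matmul(Xt, ycol)] + [r[0] for r in matmul(Zt, ycol)]
    sol = solve(lhs, rhs)
    return sol[:p], sol[p:]


identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
b_hat, u_hat = solve_mme([10, 12, 14], [[1], [1], [1]], identity, identity, 1.0)
print(b_hat, u_hat)
```

With real data, A would be built from the pedigree and λ would follow from the estimated variance components.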

Variance components for the animal model stated above were estimated by restricted maximum likelihood (REML), optimised with a quasi-Newton procedure using analytical gradients, in the Variance Component Estimation (VCE) software [94]. MiX99 [95] was used to predict breeding values for AFC, ICP and SC using the same model as in the estimation of variance components. Effective record contributions (ERCs) for each animal and trait were generated as described in [96] using the reversed reliability approximation method in APaX99 [97]. The EBVs of the genotyped animals for each trait were then deregressed using the Secant method [98] in MiX99 [95] alongside the generated ERC. Deregressed EBVs (DEBVs) were weighted using the formula set out by [99]:

$$w_{i} = \frac{1 - h^{2}}{\left[c + \frac{1 - r_{i}^{2}}{r_{i}^{2}}\right] h^{2}}$$

where,

\(w_{i}\) is the weighting factor of the ith animal with a DEBV;

\(h^{2}\) is the heritability estimate for the respective trait;

\(r_{i}^{2}\) is the reliability of the DEBV for the ith animal for a specific trait; and

c is the proportion of genetic variance not accounted for by the SNPs, with a value of 0.90 being used for all weighting factors across all the traits under analysis.

Only animals with an ERC ≥ 0.50 and a reliability ≥ 0.01 were retained for each trait analysis.
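The weighting formula and the retention rule can be written out as a short sketch. The function names are hypothetical, and the heritability (0.22, the pedigree estimate reported for AFC) and reliability used in the example call are illustrative inputs only; the source does not state which heritability estimate entered the weights.

```python
def debv_weight(h2, reliability, c=0.90):
    """Weight of a deregressed EBV: (1 - h2) / ((c + (1 - r2) / r2) * h2)."""
    if not 0 < h2 < 1:
        raise ValueError("h2 must lie strictly between 0 and 1")
    if not 0 < reliability <= 1:
        raise ValueError("reliability must lie in (0, 1]")
    return (1 - h2) / ((c + (1 - reliability) / reliability) * h2)


def keep_for_gwas(erc, reliability, min_erc=0.50, min_reliability=0.01):
    """Retention rule: ERC >= 0.50 and reliability >= 0.01."""
    return erc >= min_erc and reliability >= min_reliability


print(debv_weight(h2=0.22, reliability=0.5))
print(keep_for_gwas(erc=1.758, reliability=0.3))
```

For a fixed heritability the weight rises with reliability, so animals with more informative records contribute more to the association analysis.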

Association analyses

A genomic relationship matrix (GRM) was constructed for each trait using the VanRaden method 1 [100]. Additive and residual genetic variances for each trait were computed via genomic REML (GREML) using GCTA v1.94 [101]. Weighted DEBVs were regressed on each SNP individually using a linear mixed model in WOMBAT [102].
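A minimal sketch of a GRM built with VanRaden's first method is given below, in which genotypes are centred by twice the allele frequency and the cross-product is scaled by 2Σp(1 − p). It assumes complete genotypes coded 0, 1 or 2 and estimates allele frequencies from the sample itself; the function name is hypothetical and this is not the software used in the study.

```python
def vanraden_grm(genotypes):
    """G = ZZ' / (2 * sum(p_j * (1 - p_j))), with Z = M - 2p.

    genotypes: list of animals, each a list of allele dosages (0, 1, 2).
    """
    n, m = len(genotypes), len(genotypes[0])
    # Allele frequency of the counted allele at each SNP
    freqs = [sum(row[j] for row in genotypes) / (2 * n) for j in range(m)]
    scale = 2 * sum(p * (1 - p) for p in freqs)
    if scale == 0:
        raise ValueError("all SNPs are monomorphic")
    centred = [[row[j] - 2 * freqs[j] for j in range(m)] for row in genotypes]
    return [[sum(a * b for a, b in zip(centred[i], centred[k])) / scale
             for k in range(n)] for i in range(n)]


toy = [[0, 2], [2, 0], [1, 1]]
for row in vanraden_grm(toy):
    print(row)
```

In a real analysis base-population allele frequencies, missing genotypes and imputed dosages would all need explicit handling.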

$$\mathbf{y} = \boldsymbol{\mu} + \mathbf{SNP} + \mathbf{a} + \mathbf{e}$$

where,

y is the vector of phenotypes, the weighted DEBV;

µ is the fixed effect of the population mean;

SNP is the fixed effect of allele dosage for each SNP (coded as 0, 1 or 2);

a is the random effect of the animal, where \(\mathbf{a} \sim N(\mathbf{0}, \mathbf{G}\sigma_{a}^{2})\), with \(\sigma_{a}^{2}\) representing the additive genetic variance of the animal;

G is the genomic relationship matrix among animals;

e represents the residual, where \(\mathbf{e} \sim N(\mathbf{0}, \mathbf{I}\sigma_{e}^{2})\), with \(\sigma_{e}^{2}\) representing the residual variance and I the identity matrix.
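As a rough illustration of the per-SNP test, the sketch below regresses weighted DEBVs on allele dosage by weighted least squares. It deliberately omits the polygenic term a and the GRM, so it is a simplification and not the WOMBAT mixed-model analysis used in this study. The function name, the toy data and the two-sided normal approximation used to convert the test statistic into a p-value are assumptions made for the example.

```python
import math


def weighted_snp_test(dosage, debv, weights):
    """Weighted least-squares regression of DEBV on allele dosage.

    Returns (slope, t_statistic, approximate_two_sided_p).
    Simplification: no polygenic (animal) effect is fitted.
    """
    sw = sum(weights)
    x_bar = sum(w * x for w, x in zip(weights, dosage)) / sw
    y_bar = sum(w * y for w, y in zip(weights, debv)) / sw
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(weights, dosage))
    if sxx == 0:
        raise ValueError("SNP is monomorphic")
    sxy = sum(w * (x - x_bar) * (y - y_bar)
              for w, x, y in zip(weights, dosage, debv))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    rss = sum(w * (y - intercept - slope * x) ** 2
              for w, x, y in zip(weights, dosage, debv))
    se = math.sqrt(rss / (len(debv) - 2) / sxx)
    if se == 0:
        raise ValueError("perfect fit: standard error is zero")
    t = slope / se
    # Normal approximation to the t distribution (reasonable for large samples)
    p = math.erfc(abs(t) / math.sqrt(2))
    return slope, t, p


dosage = [0, 1, 2, 0, 1, 2]
debv = [1.0, 2.1, 2.9, 1.1, 1.9, 3.0]
print(weighted_snp_test(dosage, debv, [1.0] * 6))
```

Fitting the animal effect with the GRM, as in the model above, accounts for relatedness among animals and would generally change both the effect estimate and its standard error.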

The t-test statistics for all SNPs were obtained and subsequently transformed into lower tail p-values. To minimise false positives, the Benjamini-Hochberg False Discovery Rate (B-H FDR) method was applied to each SNP. SNPs with a P ≤ 4 × 10⁻⁷ were considered to be genome-wide significant as per Bonferroni correction, with SNPs with a P ≤ 4 × 10⁻⁶ being deemed suggestive. Manhattan plots, Figs. 3, 4 and 5, were generated in R using the qqman [103] package.
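The two multiple-testing corrections mentioned above can be sketched as follows. The Bonferroni example assumes a nominal α of 0.05 divided by the 120,692 imputed SNPs, which reproduces a threshold of roughly 4 × 10⁻⁷ but is an inference rather than something the text states explicitly; the function names and the example p-values are hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


print(bonferroni_threshold(0.05, 120692))
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

The suggestive threshold of 4 × 10⁻⁶ is ten times the genome-wide threshold.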

Defining QTLs and candidate genes

The extent of LD among significant SNPs (P ≤ 4 × 10⁻⁷) was estimated, as was the pairwise LD among all SNPs within 5 Mb up and downstream of the significant SNP [104]. The start and end of each QTL were defined by the SNPs furthest up and downstream of the significant SNP that had an r2 > 0.50 with other significant SNPs. If any QTL were deemed to be overlapping, these were consolidated into one large QTL. If no SNPs were in LD with the significant SNP, that SNP was deemed a quantitative trait nucleotide. Identified QTL were then explored using ENSEMBL (https://www.ensembl.org/) according to the UMD 3.1 genome build in order to detect candidate genes residing within them, and Panther [105] was used to list the biological and metabolic functions and/or processes of possible genes.
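The QTL boundary rule can be sketched as below. This is a simplified reading of the procedure: LD is measured as the squared correlation between allele dosages, each neighbouring SNP is compared only with the lead SNP, and the merging of overlapping QTL is not shown. The function names, positions and genotypes are hypothetical.

```python
def r_squared(g1, g2):
    """Squared Pearson correlation between two allele-dosage vectors."""
    n = len(g1)
    m1, m2 = sum(g1) / n, sum(g2) / n
    cov = sum((a - m1) * (b - m2) for a, b in zip(g1, g2))
    v1 = sum((a - m1) ** 2 for a in g1)
    v2 = sum((b - m2) ** 2 for b in g2)
    if v1 == 0 or v2 == 0:
        return 0.0
    return cov * cov / (v1 * v2)


def qtl_bounds(lead_pos, lead_geno, neighbours, window=5_000_000, min_r2=0.50):
    """Return (start, end) of the QTL around a lead SNP, or None.

    neighbours: list of (position, genotypes) tuples on the same chromosome.
    None means no SNP is in LD with the lead SNP, which is then treated
    as a quantitative trait nucleotide.
    """
    in_ld = [pos for pos, geno in neighbours
             if abs(pos - lead_pos) <= window
             and r_squared(lead_geno, geno) > min_r2]
    if not in_ld:
        return None
    return min(in_ld + [lead_pos]), max(in_ld + [lead_pos])


lead = [0, 1, 2, 0, 1, 2]
neighbours = [
    (900, [0, 1, 2, 0, 1, 2]),    # r2 = 1 with the lead SNP
    (1500, [2, 1, 0, 2, 1, 0]),   # r2 = 1 (opposite allele coding)
    (2000, [1, 1, 1, 0, 2, 1]),   # r2 = 0.125, not in LD
]
print(qtl_bounds(1000, lead, neighbours))
```

The returned interval is what would then be queried against a genome annotation to list the genes residing within the QTL.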

Table 3 List of identified QTL and genes significantly associated with Scrotal Circumference