Background

Canine hip dysplasia (CHD) is a common multifactorial hereditary disorder that has perplexed dog owners, breeders as well as veterinarians and researchers for decades. A standardized system for CHD grading has been developed in the countries that belong to the Fédération Cynologique Internationale (FCI). The FCI score is divided into five categories alphabetically: A to E, where A is normal and E is severe CHD. In Finland, the FCI score is defined separately for both hip joints, hence the format is given as: left hip score / right hip score. The FCI score is determined from different ‘sub-traits’ of the hip: congruency of the joint, Norberg angle (NoA), subluxation degree of the joint, shape and depth of the acetabulum, and whether there are any visible signs of osteoarthritis (OA) in the joint or not. FCI has derived the grading rules, from which the Finnish Kennel Club (FKC) has defined their guidelines for radiographing and scoring hip dysplasia [1]. The above-mentioned sub-traits are not recorded for later use, only the hip score is stored in the FKC database.

As the FCI or any other combinatory score does not accurately correlate with the various CHD sub-traits, these have to be studied separately. NoA and femoral head center position in relation to dorsal acetabular edge (FHCDAE) reflect the incongruity of the hip joint, which impacts the development of CHD [2]. Hip joint laxity is a major contributor to the development of OA. However, OA is suggested to develop due to many simultaneous pathologies, which influence the central structures of the joint [3]. OA may have a distinct genetic background in relation to the other hip sub-traits [4,5,6].

The current consensus is that CHD is polygenic, and the genetic contribution to the phenotype can vary from small to moderate [7,8,9,10,11,12,13,14]. Variation between breeds is evident from several studies [5, 7, 9, 10, 14,15,16]. Some breeds are more susceptible to the disorder than others. Labrador Retrievers [7, 10, 17], Bernese Mountain dogs [9], Golden Retrievers [18], and German Shepherds [4, 14, 16] have been of special interest in studies of CHD, and several genetic associations with different hip phenotypes have been reported in these breeds. Different breeding strategies have been proposed to improve hip health; estimated breeding values are generally considered the most efficient approach [4, 19,20,21,22]. Newer methods such as genomic selection might also bring a long-awaited solution in the fight against this disorder [17, 23, 24].

To better understand the genetic etiology of CHD related phenotypes, we have carried out here a successful genome-wide association study (GWAS) in a cohort of over 750 well-phenotyped German Shepherds to map loci for CHD and related sub-traits. We report three loci with genome-wide significance and two suggestive loci for different traits with physiologically relevant candidate genes.

Results

Joint incongruity, measured as FHCDAE and NoA, maps to chromosomes 9, 25 and 28

Incongruity of the hip joint contributes to CHD. Therefore, we carried out two different association analyses on incongruity related traits, FHCDAE and NoA, which were assessed by two different veterinarians in our group. Both traits were measured for right and left hip, but we used only the worst measure in the analysis. NoA showed significant inter-observer variation in a linear regression model (P = 0.028, Additional file 1), which is consistent with earlier findings [25, 26]. Therefore, the evaluator was included as a covariate in the association analysis of NoA. For FHCDAE the inter-observer variation was non-significant. The association results for FHCDAE and NoA indicated overlapping loci, which is not surprising as these measurements were highly negatively correlated in the study cohort (Pearson’s r = − 0.94, Fig. 1). However, all the observed associations throughout the loci were stronger for FHCDAE than for NoA (Table 1).
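The selection of the worst measure per dog and the correlation between the two traits can be illustrated with a minimal sketch. It assumes, as described in the Methods, that a smaller NoA and a larger FHCDAE indicate worse incongruity; the four dogs and the function names are hypothetical and are not taken from the study data.

```python
from math import sqrt

def worst_noa(left, right):
    # Smaller Norberg angle = worse incongruity (assumption from the Methods)
    return min(left, right)

def worst_fhcdae(left, right):
    # Larger FHCDAE = femoral head sits less deeply (assumption from the Methods)
    return max(left, right)

def pearson_r(x, y):
    # Plain Pearson correlation coefficient of two equally long sequences
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical dogs: (NoA left, NoA right, FHCDAE left, FHCDAE right)
dogs = [(105, 103, 0, 1), (98, 100, 3, 2), (90, 88, 6, 7), (80, 84, 11, 9)]
noa = [worst_noa(d[0], d[1]) for d in dogs]
fhcdae = [worst_fhcdae(d[2], d[3]) for d in dogs]
r = pearson_r(noa, fhcdae)  # strongly negative for these illustrative values
```

With real data, the same calculation would be applied to the worst-side measurements of all dogs that have both traits recorded.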

Fig. 1

Correlation plot of NoA and FHCDAE. NoA is on the Y-axis and FHCDAE on the X-axis. Above the correlation plot is the distribution of FHCDAE measurements in the cohort. The corresponding distribution of the NoA measurements is on the right side of the correlation plot. Pearson’s r = − 0.94 and P-value = 1.8 × 10^−297

Table 1 Top SNPs from the GWAS on FHCDAE and NoA

On chromosome 9, two SNPs demonstrated association with FHCDAE (Fig. 2). One of these SNPs passed the threshold for significance with independent tests (BICF2G630834826 with a P-value of 1.57 × 10^−6, Table 1). BICF2G630834826 and BICF2P742007 are located ~ 22 kb downstream and ~ 67 kb upstream of NOG encoding noggin (Additional file 2), and they are in high linkage disequilibrium (LD) measured as the squared value (r2) of Pearson’s correlation coefficient between pairs of SNPs (r2 = 0.84, Additional file 3). These two SNPs also associated with NoA but the association was stronger for FHCDAE. The third SNP on chromosome 9, which was observed only for NoA (BICF2G630837307) and was not genome-wide significant, lies ~ 64 kb upstream of LIM homeobox 1 (LHX1) (Additional file 2).

Fig. 2

Manhattan plots for the analysis of hip joint incongruity traits FHCDAE and NoA. The upper Manhattan plot represents the results from the analysis of FHCDAE (N = 643). The blue line indicates the threshold for significance based on the number of independent tests. The lower plot represents the GWAS results of NoA (N = 642) with the blue line indicating the threshold for significance as in the upper plot

Other loci with at least a suggestive association with the incongruity traits were on chromosomes 25 and 28 (Table 1, Fig. 2). On chromosome 25, BICF2G630468961, which showed suggestive association with NoA, was intronic to solute carrier family 7 member 1 (SLC7A1) (Additional file 2). On chromosome 28, SNP BICF2P1046032 (in high LD with BICF2P895332; r2 = 0.96, Additional file 3) demonstrated significant association with FHCDAE (Table 1). These two SNPs are located between CDK2 associated cullin domain 1 (CACUL1) (~ 18 and 30 kb upstream, respectively) and nanos C2HC-type zinc finger 1 (NANOS1) (~ 163 and 174 kb upstream, respectively) (Additional file 2).

OA maps to chromosome 1

We studied OA as a separate disorder. Two veterinarians in our group evaluated the radiographs of individual dogs for evidence of OA (see Methods). The dogs exhibited either no radiographic evidence of OA (controls) or had mild, moderate or severe signs of OA (cases). A case-control association analysis, where all controls (N = 492) were compared with all cases regardless of the severity of OA (N = 163), revealed a genome-wide significant locus on chromosome 1 (Fig. 3). The SNP with the strongest association (BICF2P468585) had a P-value of 2.86 × 10^−7 (Table 2). The second-best SNP (BICF2P357728) reached a P-value of 8.93 × 10^−7 (Table 2). Both SNPs passed the threshold for genome-wide significance based on the estimated number of independent tests determined with simpleM (1.82 × 10^−6).

Fig. 3

Manhattan plots for the binary trait: OA status. The Manhattan plot represents the lambda-corrected (lambda = 1.007) P-values from the FASTA analysis of osteoarthritis (N = 655), where the blue line shows the threshold for significance with independent tests

Table 2 Top SNPs from the GWAS on OA

The two genome-wide significant SNPs, as well as four out of the six SNPs showing suggestive association with OA on this chromosome, are located between NADPH oxidase 3 (NOX3) (except BICF2S23248027, which lies within the ninth intron of NOX3) and AT-rich interaction domain 1B (ARID1B) (Table 2, Additional file 2). The top SNPs BICF2P468585 and BICF2P357728 were observed to be in high LD (r2 = 0.85, Additional file 3). Otherwise, moderate to perfect LD (r2 = 0.63–1.00) was observed between these six SNPs, even though the region they covered was over 1.1 Mb long (Additional file 3). Thus, we concluded that these SNPs probably represent just one locus that associates with the disorder. SNPs BICF2S23216908 and BICF2S2305568 (Table 2) are in perfect LD (r2 = 1.00, Additional file 3). Although they are ~ 1.7 Mb away from the other SNPs that associated with OA on this chromosome, we observed some LD between these two loci (r2 = 0.50–0.61, Additional file 3). BICF2S23216908 is located within the first intron of Transmembrane protein 181 (TMEM181) and BICF2S2305568 within the first intron of Dynein light chain Tctex-type 1 (DYNLT1).

We also observed suggestive associations with OA on chromosomes 9 and 25. On chromosome 9, BICF2G630837240 is located ~ 101 kb downstream from MRM1, which encodes mitochondrial rRNA methyltransferase 1, and ~ 178 kb upstream from LHX1 (Table 2, Additional file 2). BICF2G630468961 on chromosome 25 is located within the second intron of SLC7A1 (Table 2, Additional file 2).

Different genetic etiology of mild and moderate-to-severe CHD

To identify loci for CHD according to the FCI hip scores, we carried out three sets of case-control association analyses. In the first case-control analysis, the controls had a bilateral FCI hip score A and cases B/C, C/B or bilateral FCI score C or worse (Ncases = 339, Ncontrols = 354). In the second analysis, the same controls were used but cases had a bilateral FCI score of D or worse (Ncases = 166). In the third analysis we compared mild CHD dogs (B/C, C/B or bilateral FCI score C) with dogs that had moderate-to-severe (at least an FCI score D or worse for either hip) CHD (Nmild = 124, Nmoderate-to-severe = 216). A summary of the results of these three comparisons is shown in Table 3.

Table 3 Top SNPs from the GWAS on different case-control analyses of the FCI hip score

A genome-wide significant association was found on chromosome 1 for the first comparison with close to 700 dogs (Fig. 4 and Table 3). The SNPs with the strongest association (BICF2P468585 and BICF2S23248027) passed the threshold for significance with independent tests (Table 3). The identified locus between NOX3 and ARID1B is the same one that we found for OA (Additional file 2). For the latter two case-control analyses, which included smaller numbers of dogs, none of the associations reached genome-wide significance. BICF2G630837405 on chromosome 9 lies within the eighth intron of apoptosis antagonizing transcription factor (AATF) and TIGRP2P126345 is located ~ 8 kb downstream from the same gene. These two SNPs are in high LD (r2 = 0.97, Additional file 3).

Fig. 4

Manhattan plots for the case-control analyses of controls and mild to severe cases. The uppermost Manhattan plot represents the case-control analysis, where controls were dogs with an FCI score A/A and cases were dogs with an FCI score B/C, C/B, or C or worse on both hips (N = 693). The second Manhattan plot represents the case-control analysis, where cases were dogs with an FCI score D or worse on both hips (N = 520), and the lowest Manhattan plot is the comparison of mild cases (B/C, C/B, C/C) with moderate-to-severe cases (D or worse on both hips) (N = 340). In each plot, the blue line shows the threshold for significance with independent tests

A summary of the genome-wide significant loci across the CHD-related traits described above is given in Table 4. The frequencies of the effect and alternative alleles of the significantly associated SNPs in cases and controls (binary analyses) are in Additional file 4. Some SNPs were associated with more than one trait, as expected when the phenotypes are not independent of each other. The heritability (h2) estimates from the polygenic mixed model for the different traits varied from 36 to 64% (Additional file 5).

Table 4 Summary of the genome-wide significant SNPs for different CHD-related traits

Discussion

CHD is a complex skeletal disorder and one of the leading clinical concerns in veterinary medicine. CHD is categorically scored into five classes in screening programs of the FCI member countries, but the phenotype manifests many sub-traits, which may eventually result in painful OA. The development of OA itself is a complex process, which involves alterations in many different tissues, including bone, cartilage, synovial membrane and ligaments [27]. Given the complexity of the disorder, it is not surprising that genetic discoveries have remained scarce and that breakthroughs require large and well-phenotyped study cohorts in each breed. We report here remarkable progress in mapping three new loci on different chromosomes across key CHD traits in German Shepherds. The locus on chromosome 1 associated with OA and the FCI hip score, and the loci on chromosomes 9 and 28 associated with the trait FHCDAE, which measures hip joint incongruity (Table 4). In addition to the three loci with genome-wide significance, two suggestive loci on chromosomes 9 and 25 were uncovered for OA, NoA and different FCI hip score comparisons. Besides revealing novel loci, the study indicates that the locus on chromosome 1 associates with two binary traits: OA and the FCI hip score with relaxed case definition (B/C, C/B, or C or worse in both hips). Our study partially utilizes the cohort from Mikkola et al. (2019) [28] and as such cannot be regarded as an independent replication study.

The locus on chromosome 1 lies in a long intergenic region between NOX3 and ARID1B (Table 5). Neither of the genes nor the intergenic region is known for functions that could explain their role in the development of CHD or OA. However, the likely significance of this locus for CHD is highlighted by the fact that our previously observed suggestive association [28] was strengthened more than tenfold with a larger sample size. The association of the NOX3-ARID1B locus with OA was 2.5 times as strong as with the FCI hip score (as assessed by the ratio of the P-values). The latter is an aggregate phenotype and visible signs of OA (or the lack of them) are part of its evaluation. Therefore, it is not surprising to observe overlapping results.

Table 5 Candidate genes near SNPs showing genome-wide significant association with CHD-related phenotypes

NOX3 is a member of NADPH oxidases and an interesting candidate for articular cartilage degradation. NADPH oxidase participates in the generation of hydrogen peroxide, which is used by myeloperoxidase as a substrate to produce a highly reactive hypochlorous acid, and in some circumstances chlorine gas [29, 30]. These two reactive molecules oxidize the pyridinoline cross-links of articular cartilage and initiate its degradation [29, 30]. The SNP BICF2P468585 with the strongest association is ~ 196 kb upstream from NOX3, but BICF2S23248027 (also known as rs21911799) is located in the intron between NOX3 exons 9 and 10 (Tables 4 and 5). Moreover, NOX3 is mainly expressed in the inner ear and fetal tissues [31], thus, the role of NOX3 in synovial tissue inflammation remains uncertain. Yet, among other protein-protein interactions, a STRING [32] database search (Additional file 6) suggested possible interplay between NOX3 and matrix metalloproteinases 2 and 9 – two matrix degrading enzymes implicated in CHD and OA [33,34,35]. We have previously discussed [28] that there is some evidence of the possible interplay between NOX3 and TRIO (trio Rho guanine nucleotide exchange factor), another candidate gene for CHD [16]. The product of T-cell lymphoma invasion and metastasis 2 (TIAM2) further upstream (Table 5) modulates the activity of Rho-like proteins [36]. ARID1B, on the other hand, participates in transcriptional activation and repression through chromatin remodeling [37]. Interestingly, ARID1B is associated with joint laxity via a multisystemic Coffin-Siris syndrome (CSS); CSS is caused by ARID1B variants and 66% of the CSS patients exhibit joint laxity [38, 39].

Previous studies have suggested seven different loci for OA, none of them overlapping our loci. A multi-breed study by Zhou et al. (2010) [5] suggested two loci on canine chromosomes 17 and 37 for OA. Another quantitative trait locus (QTL) study in a crossbreed experiment reported putative QTLs on chromosomes 5, 18, 23 and 31 [6]. Chromosome 3 has also been suggested to harbor a QTL that regulates cranial and caudal acetabular osteophyte formation in Portuguese Water Dogs [40]. The discrepancy with our results may stem from genetic heterogeneity between study populations, or from differences in analysis methods or in the phenotyping approaches used to evaluate OA.

A locus on chromosome 9 near NOG associated with the incongruity trait FHCDAE (Tables 4 and 5). The association of this locus with NoA was weaker than with FHCDAE. This is not surprising as NoA suffers from high inter-observer variability [25, 26], which was also noted in our study. Similar bias was not seen for FHCDAE (Additional file 1). We previously found protective regulatory variants upstream of NOG, and demonstrated the inverse correlation of their in vitro enhancer activity with healthy hips in German Shepherds [28]. The association of this locus with FHCDAE (as assessed by the ratio of the P-values) was ~ 24 times as strong as what we observed for the FCI hip score [28]. The putative contribution of NOG to FHCDAE remains elusive but may offer some leads to reduced joint congruity. Decreased noggin activity could possibly strengthen the acetabular bone via bone morphogenetic protein (BMP) signaling and help the repair of microfractures and other damage caused by mechanical wear in growing dogs. Interestingly, delayed ossification of the femoral head has been associated with CHD in later life [41, 42]. NOG is a crucial gene for many developmental processes, such as neural tube fusion, joint formation and skeletal development [43, 44]. In humans, dominant NOG mutations cause some congenital disorders with abnormal joints [45], and knocking out murine Nog leads to a state where the mice lack most of the joints in the limbs [46]. On the other hand, overexpression of murine Nog results in osteopenia, bone fractures and decreased bone formation, when the function of osteoblasts becomes defective [47]. A recent study by Ghadakzadeh et al. (2018) [48] showed that knocking down Nog in rats with small interfering RNA leads to down-regulation of Nog and increases both BMP-mediated differentiation of osteoblasts and the mineralization process of extracellular matrix.

The third locus with genome-wide significance also involved FHCDAE and resided on chromosome 28 (Tables 4 and 5). This region contains CACUL1, a cell-cycle associated gene [49], and NANOS1, which upregulates MMP14 (also known as membrane type 1-matrix metalloproteinase, MT1-MMP) and thus promotes epithelial tumor cell invasion [50]. MT1-MMP is a powerful collagenolytic element [51, 52] and Miller et al. (2009) have demonstrated the role of MT1-MMP in human rheumatoid arthritis with synovial invasion via collagenolysis [53]. The possible role of the NANOS1 – MMP14 interplay needs to be investigated in tissues relevant to CHD.

Intriguingly, chromosome 28 has been previously associated with NoA in two studies, one of which also included German Shepherds [13, 54]. Although chromosome 28 did not associate with NoA in our study, the reported NoA locus is ~ 5.2 Mb upstream from our FHCDAE locus (Table 1). Because FHCDAE and NoA are strongly related traits (Pearson’s r = − 0.94, Fig. 1), additional studies across breeds are warranted to find out whether the two loci on chromosome 28 are related or independent, and if they possess variants contributing to CHD.

We also observed some loci showing weaker associations with NoA and OA on chromosomes 9 and 25 (Tables 1 and 2), and with FCI hip score on chromosome 9 (Table 3). These loci included relevant candidate genes LHX1, AATF (both on chromosome 9) and SLC7A1 (chromosome 25) (Additional file 2). LHX1 could be a candidate for OA as it has been shown to be differentially methylated in OA [55] and is one of the most significantly up-regulated genes in this disorder [56]. SNPs near LHX1 also demonstrated a suggestive association with CHD (quantified as the FCI hip score) in our previous study [28]. AATF is located close to LHX1 but its role in CHD remains uncertain. Both LHX1 and AATF have been associated with the levels of macrophage inflammatory protein 1b (MIP-1b) [57, 58]. MIP-1b is a cytokine increased in the synovial fluid in OA and may play a role in the ingression of monocytes into osteoarthritic joints [59]. The canine gene encoding MIP-1b (C-C motif chemokine ligand 4, CCL4) is located on chromosome 9, ~ 795 kb away from TIGRP2P126345 and ~ 803 kb from AATF (Tables 1 and 3). SLC7A1 is a high affinity cationic amino acid transporter that belongs to the solute carrier family 7 [60]. It participates in the transportation of cationic amino acids arginine, lysine and ornithine across the plasma membrane [60]. L-arginine and its methylated forms could impact OA via the nitric oxide pathway [61].

Considering the clinical complexity of CHD, it is not surprising that we have mapped several loci that contain candidate genes involved in different biological pathways. Identification of these pathways is an important step in understanding the pathophysiology of CHD. Some of the genes in these networks may have no direct effect on the disorder but an indirect effect through other genes [62]. As demonstrated here and previously by Sánchez-Molano et al. (2014) [7], the complexity and polygenicity of traits such as CHD require large sample sizes for significant associations. Sánchez-Molano et al. (2014) [7] had a cohort of 1500 Labrador Retrievers, and observed two genome-wide and multiple chromosome-wide significant QTLs explaining at most 23% of the genetic variance in the analyzed traits. It is possible that larger cohorts might reveal additional loci with smaller effects.

Besides sample size, accurate and reliable phenotyping is another essential factor when studying complex traits. This is particularly important when the trait comprises many interconnected sub-traits that explain only small parts of the total variation. As long as the assessment of CHD relies on the FCI scoring, it is crucial to have standardized high-quality radiographs and a minimal number of people assessing them to reduce inter-observer bias [26]. More reliable indices of joint laxity, such as the distraction or laxity index [25], could facilitate genetic discoveries by removing some confounding factors affecting NoA and FHCDAE, as some laxity remains undetected in the extended view radiographs.

Conclusions

In conclusion, we have performed a successful association study with a large cohort of accurately and robustly phenotyped German Shepherds and describe three loci with genome-wide significance and two suggestive loci for CHD-related traits. The candidate genes include NOX3 and ARID1B on chromosome 1, NOG on chromosome 9, and NANOS1 on chromosome 28. Future studies will focus on ascertaining their role in CHD by resequencing the candidate region for putative risk variants.

Methods

Dogs

We acquired the data for our study from the Finnish Kennel Club. Before quality control we had a total of 775 samples of German Shepherds, of which 356 were controls, 322 were cases with both hip joints scored C or worse and 97 were of intermediate phenotypes with at least one hip joint scored as B. The majority of the dogs had either the same FCI score bilaterally or at most one grade of difference between the right and the left hip; three dogs had more than one grade of difference (they had been scored A/C, C/A and B/D). The average age at radiographing was 1.55 years, ranging from 1.01 to 5.83 years with a standard deviation of 0.63 years. Of the dogs, 435 were female and 340 were male. We collected at least one blood sample from all the dogs with ethylenediaminetetraacetic acid (EDTA) as an anticoagulant.

Phenotypes

The FCI-standardized ventrodorsal extended hip radiographs were taken by different veterinarians, but the hip scoring was done by two specialized veterinarians at the FKC. Therefore, inter-observer bias was reduced in this data set [26]. All of the hip scores for these dogs are available in the FKC database [63]. We had at least a CHD score for all the dogs. We used the official FCI hip scores to divide the dogs into two different case-control groups: the first group with a relaxed case definition, where the cases had an FCI score B/C (left/right hip), C/B, or C/C or worse, and the second group with a stringent case definition, where the cases had an FCI score D or worse on both hips.
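The two case definitions can be expressed as a simple rule on the left/right FCI scores. The following is a minimal sketch of a literal reading of the definitions above; the function name is hypothetical, and returning None for dogs that match neither controls (A/A) nor cases is our assumption about how intermediate scores would be handled.

```python
FCI_ORDER = "ABCDE"  # A = normal ... E = severe CHD

def classify(left, right, stringent=False):
    """Return 'control', 'case', or None for a left/right FCI score pair."""
    l, r = FCI_ORDER.index(left), FCI_ORDER.index(right)
    if l == 0 and r == 0:
        return "control"          # bilateral A
    # Both hips must reach C (relaxed) or D (stringent) or worse
    min_grade = FCI_ORDER.index("D" if stringent else "C")
    if l >= min_grade and r >= min_grade:
        return "case"
    # The relaxed definition additionally accepts B/C and C/B
    if not stringent and {left, right} == {"B", "C"}:
        return "case"
    return None                   # intermediate score, left unclassified

examples = [classify("A", "A"), classify("B", "C"), classify("C", "D", stringent=True)]
```

Under this reading, a dog scored C/D is a case in the relaxed analysis but is left out of the stringent one.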

Two veterinarians in our group carefully evaluated the acquired radiographs for more specific hip phenotypes. These phenotypes were: findings suggestive of osteoarthritis (in four categories from 0 = no signs to 3 = severe signs), NoA (in degrees), and FHCDAE (in millimeters). The phenotyping process was carried out as follows: One veterinarian (evaluator 1 in the phenotype file, doi: https://doi.org/10.6084/m9.figshare.10096595) assessed all the radiographs for the study cohort that was used in our previous study [28]. Another veterinarian in our group (evaluator 2 in the same phenotype file) evaluated the radiographs of the dogs that were genotyped during the current study. A small subset of randomly chosen radiographs that evaluator 1 had previously assessed was re-assessed by evaluator 2 to check consistency. Where there were inconsistencies, the re-assessed phenotype was used in the analysis.

NoA varied between 70 and 108 degrees in our cohort (Table 6); the smaller the value is, the worse is the incongruity of the joint. Generally dogs with an FCI hip score A have NoA of 105 degrees or higher [64]. Significant inter-observer variation for NoA was seen in our data (P = 0.028, Additional file 1). We handled this in our GWAS by using the evaluator as a covariate. FHCDAE was measured as millimeters (mm) and in our data this trait ranged between − 4 and 15 mm (Table 6). The smaller the value is, the deeper the femoral head sits into the acetabulum in relation to the dorsal acetabular edge. OA was divided into four categories (the quantities for each category here are before quality control): no signs of arthritis (0, N = 498), some mild changes affiliated with OA (1, N = 57, minor osteophytes on the femoral neck and/or at the craniolateral acetabular edge), moderate changes (2, N = 74, larger osteophytes, also at the dorsal acetabular edge), or severe osteoarthritis (3, N = 33, massive osteophytes of the femoral neck and surrounding the acetabular edge). However, radiographs are relatively insensitive in detecting early osteoarthritic changes [65]. Therefore, the current study is unlikely to detect any associations with loci affecting exclusively the early stages of OA.

Table 6 Median, interquartile range and minimum and maximum values for the analyzed traits

DNA preparation and genotyping

The original EDTA-preserved blood samples for this study are stored at the Dog DNA bank at the University of Helsinki. The DNA was extracted from these samples with a Chemagic Magnetic Separation Module I with a standard protocol by Chemagen (Chemagen Biopolymer-Technologie AG, Baesweiler, Germany). Thereafter the DNA samples were genotyped at Geneseek (Lincoln, NE, US) with a high-density 173 K canine SNP array from Illumina (San Diego, CA, US). Genotyping of the samples was done in multiple batches.

Population structure

We used information from a genomic relationship matrix built from the SNP data to divide our highly stratified German Shepherd population into three subpopulations (Additional file 7). For the clustering we used the R [66] package “mclust” [67], which utilizes covariance parametrization. The appropriate number of clusters was selected with the Bayesian information criterion. We then created a covariate vector from the clustering data in which every individual belonged to one of the clusters. This way we could use the clustering effect in our model to account for any differences in disease association between the genetic clusters.

Quality control (QC)

We used PLINK [68] to merge the original three genotype sets from different genotyping batches. A preliminary QC was done on all of the genotyping batches before merging, with the following thresholds: maximum missing call rate per sample 0.10, maximum missing call rate per SNP 0.05, minor allele frequency 0.05, P-value cut-off for deviation from Hardy-Weinberg equilibrium (HWE) 0.00001 (from controls only). After these quality controls and data merging a total of 100,435 SNPs and 775 samples were transferred from PLINK to R. The final QC was done in R with GenABEL [69], and the thresholds were: minor allele frequency = 0.05, per sample call rate = 0.85 and per SNP call rate = 0.95, and again a P-value cut-off level < 0.00001 to test for deviations from HWE. After the final QC we had 89,251 autosomal SNPs and 769 samples to use in our association analysis. However, the final number of dogs per analysis varied between 338 and 693 as FASTA dropped individual dogs from analyses if they lacked a phenotype or a covariate. CanFam3.1 was used as the position map for our SNPs [70]. After the GWAS the genotype call quality of the top SNPs was checked to exclude associations due to calling errors.
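To make the per-SNP filters concrete, the following minimal sketch applies the minor allele frequency and per-SNP call rate thresholds of the final QC to a single marker. It is not the PLINK or GenABEL implementation: genotypes are assumed to be coded as 0, 1 or 2 copies of one allele with None for a missing call, the function names and example genotypes are hypothetical, and the HWE test is omitted.

```python
def call_rate(genotypes):
    # Fraction of non-missing genotype calls
    return sum(g is not None for g in genotypes) / len(genotypes)

def minor_allele_frequency(genotypes):
    # Genotypes count copies (0, 1, 2) of one allele; missing calls are skipped
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    freq = sum(called) / (2 * len(called))
    return min(freq, 1.0 - freq)

def passes_snp_qc(genotypes, min_maf=0.05, min_call_rate=0.95):
    # Thresholds follow the final QC described above; HWE is not tested here
    return (call_rate(genotypes) >= min_call_rate
            and minor_allele_frequency(genotypes) >= min_maf)

common_snp = [0] * 18 + [1, 2]      # MAF = 3/40 = 0.075, no missing calls
rare_snp = [0] * 19 + [1]           # MAF = 1/40 = 0.025
sparse_snp = [0, 1, 2, None] * 5    # call rate = 0.75
results = [passes_snp_qc(s) for s in (common_snp, rare_snp, sparse_snp)]
```

The same call-rate function could be applied per dog across all SNPs with a threshold of 0.85 to mirror the per-sample filter.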

Genome-wide association analysis (GWAS)

We performed a GWAS by using polygenic mixed models in GenABEL, with functions “polygenic” and “mmscore” (FASTA: Score test for association in related people) [71]. The appropriate covariates were identified by fitting linear regression models with the R function “lm” from the stats package [72] for all non-binary traits. For the binary traits, generalized linear models were fitted with the R function “glm” [73]. The following covariates were tested: sex, age at radiographing, genetic cluster of the dog, genotyping batch, birth month, and evaluator, that is, the veterinarian who evaluated the radiographs (tested for traits NoA, FHCDAE and OA). The covariates that had a significant effect (P-value < 0.05) on each dependent trait are in Table 7 (see also Additional file 1). The inflation factors (lambda) for the various models are indicated in Tables 1–3. The corresponding Q-Q plots are in Additional file 8.
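For readers unfamiliar with the inflation factor, the sketch below shows one common way to estimate lambda from a set of P-values and to apply it as a correction. It uses the median-based estimator on 1-degree-of-freedom chi-square statistics; this is an assumption made for illustration, and GenABEL's own estimator may differ. All function names are hypothetical.

```python
from statistics import NormalDist, median

_STD_NORMAL = NormalDist()
# Median of a chi-square distribution with 1 degree of freedom (about 0.455)
_CHI2_1DF_MEDIAN = _STD_NORMAL.inv_cdf(0.75) ** 2

def p_to_chi2(p):
    # Two-sided P-value -> 1-df chi-square statistic (square of the z-score)
    z = _STD_NORMAL.inv_cdf(p / 2.0)
    return z * z

def inflation_factor(p_values):
    # Median observed statistic divided by the median expected under the null
    return median(p_to_chi2(p) for p in p_values) / _CHI2_1DF_MEDIAN

def lambda_correct(p, lam):
    # Divide the test statistic by lambda and convert back to a P-value
    z = (p_to_chi2(p) / lam) ** 0.5
    return 2.0 * _STD_NORMAL.cdf(-z)

# Evenly spread P-values mimic a null distribution, so lambda is close to 1
null_like = [(i + 0.5) / 1000 for i in range(1000)]
lam = inflation_factor(null_like)
corrected = lambda_correct(0.05, 1.007)  # 1.007 is the lambda reported for OA
```

A lambda close to 1, such as the 1.007 reported for the OA analysis, changes the P-values only marginally.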

Table 7 Covariates for different traits

The r2 values for the top SNPs were estimated in R with the “r2fast” function [74] from the GenABEL package.
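As an illustration of the LD measure, r2 between two SNPs can be computed as the squared Pearson correlation of their genotype vectors. The sketch below is not a reimplementation of “r2fast”; it assumes genotypes coded as 0, 1 or 2 allele copies with None for missing calls, and the function name and genotype vectors are hypothetical.

```python
from math import sqrt

def ld_r2(snp_a, snp_b):
    """Squared Pearson correlation between two genotype vectors (0/1/2)."""
    # Keep only the dogs genotyped at both SNPs
    pairs = [(a, b) for a, b in zip(snp_a, snp_b)
             if a is not None and b is not None]
    n = len(pairs)
    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    sab = sum((a - mean_a) * (b - mean_b) for a, b in pairs)
    saa = sum((a - mean_a) ** 2 for a, _ in pairs)
    sbb = sum((b - mean_b) ** 2 for _, b in pairs)
    r = sab / sqrt(saa * sbb)
    return r * r

# Hypothetical genotypes for eight dogs at three SNPs
snp1 = [0, 1, 2, 1, 0, 2, 1, 0]
snp2 = [0, 1, 2, 1, 0, 2, 1, 0]      # identical to snp1: perfect LD
snp3 = [0, 1, 2, 1, None, 2, 0, 1]   # partly discordant, one missing call
perfect = ld_r2(snp1, snp2)
partial = ld_r2(snp1, snp3)
```

Identical genotype vectors give an r2 of 1, corresponding to the perfect LD reported for some SNP pairs, while discordant genotypes lower the value.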

Bonferroni correction can be too stringent a method to correct for multiple testing, as it assumes independence between the tests, an assumption that is violated in many association studies because of LD between markers [75]. This is especially important to note in canine studies, as the structure of the canine genome is unique with strong LD due to the history of intensive selection [13]. Therefore, we used the number of independent tests to determine the threshold for significance. We estimated the effective number of independent tests to be 27,456 using simpleM, which uses dimension reduction models for filtering the correlations between the analyzed SNPs [76]. Based on this, a threshold for significance of 1.82 × 10^−6 (0.05/27456) is applied to P-values in this study.
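The arithmetic behind the threshold, and its contrast with a plain Bonferroni correction over all 89,251 SNPs that passed QC, can be checked with a short sketch. The function name is hypothetical; the P-values are those reported for the two top OA SNPs.

```python
def significance_threshold(alpha, n_tests):
    # Family-wise alpha divided by the number of tests
    return alpha / n_tests

simplem_threshold = significance_threshold(0.05, 27456)      # about 1.82e-06
bonferroni_threshold = significance_threshold(0.05, 89251)   # about 5.60e-07

# P-values reported for the two top OA SNPs on chromosome 1
top_oa_p_values = {"BICF2P468585": 2.86e-7, "BICF2P357728": 8.93e-7}
passes_simplem = {snp: p < simplem_threshold for snp, p in top_oa_p_values.items()}
passes_bonferroni = {snp: p < bonferroni_threshold for snp, p in top_oa_p_values.items()}
```

Both SNPs fall below the simpleM-based threshold, whereas only BICF2P468585 would also fall below the stricter Bonferroni threshold.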