Mammalian Genome, Volume 23, Issue 3, pp 294–303

A genome-wide association study of osteochondritis dissecans in the Thoroughbred

Authors

  • L. J. Corbin
    • The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh
  • Sarah C. Blott
    • Animal Health Trust
  • June E. Swinburne
    • Animal Health Trust
  • Charlene Sibbons
    • Animal Health Trust
  • Laura Y. Fox-Clipsham
    • Animal Health Trust
  • Maud Helwegen
    • Animal Health Trust
  • Tim D. H. Parkin
    • Boyd Orr Centre for Population and Ecosystem Health, Institute of Comparative Medicine, Faculty of Veterinary Medicine, University of Glasgow
  • J. Richard Newton
    • Animal Health Trust
  • Lawrence R. Bramlage
    • Rood and Riddle Equine Hospital
  • C. Wayne McIlwraith
    • College of Veterinary Medicine and Biomedical Sciences, Colorado State University
  • Stephen C. Bishop
    • The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh
  • John A. Woolliams
    • The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh
  • Mark Vaudin
    • Animal Health Trust
Article

DOI: 10.1007/s00335-011-9363-1

Cite this article as:
Corbin, L.J., Blott, S.C., Swinburne, J.E. et al. Mamm Genome (2012) 23: 294. doi:10.1007/s00335-011-9363-1

Abstract

Osteochondrosis is a developmental orthopaedic disease that occurs in horses, other livestock species, companion animal species, and humans. The principal aim of this study was to identify quantitative trait loci (QTL) associated with osteochondritis dissecans (OCD) in the Thoroughbred using a genome-wide association study. A secondary objective was to test the effect of previously identified QTL in the current population. Over 300 horses, classified as cases or controls according to clinical findings, were genotyped using the Illumina Equine SNP50 BeadChip. An animal model was first implemented to adjust each horse’s phenotypic status for average relatedness among horses and for other potentially confounding factors present in the data. The genome-wide association test was then conducted on the residuals from the animal model. A single SNP on chromosome 3 (ECA3) was found to be associated with OCD at a genome-wide level of significance, as determined by permutation. According to the current sequence annotation, the SNP is located in an intergenic region of the genome. The effects of 24 SNPs, representing QTL previously identified in a sample of Hanoverian Warmblood horses, were tested directly in the animal model. When fitted alongside the significant SNP on ECA3, two of these SNPs were found to be associated with OCD. The putative QTL identified on ECA3 requires validation in an independent sample. The results of this study suggest that a significant challenge faced by equine researchers is the generation of sufficiently large data sets to effectively study complex diseases such as osteochondrosis.

Introduction

Osteochondrosis (OC) is a disease of the locomotory system that affects the joints of many animals, most frequently being observed in pigs, horses, and dogs. Osteochondrosis can be described as a focal disturbance of endochondral ossification (Ytrehus et al. 2007) that occurs in young, growing individuals and as such has been classified as a developmental orthopaedic disease. Primary lesions, thought to be initiated by a failure of blood supply to the cartilage (Ytrehus et al. 2007), progress to form retained cores of cartilage that eventually cause dissecting lesions on the joint surface (McIlwraith 2011). In its early stages, the condition has been referred to as dyschondroplasia, or more recently osteochondrosis latens (Ytrehus et al. 2007), and is likely to be subclinical in nature. In the most serious cases, where cartilage or subchondral bone fragments become separated from the articular surface, introducing an inflammatory component, the disease may be referred to as osteochondritis dissecans (OCD). In such cases, typical clinical signs of the disease are synovitis and pain accompanied by varying degrees of lameness (McIlwraith 2011). In the horse, joints most commonly affected are the fetlock, hock, and stifle; within these joints specific predilection sites have been identified (McIlwraith 1993).

Prevalence estimates for OC vary widely, ranging from 3% [stifle OC in Thoroughbreds (Oliver et al. 2008)] to 70% [estimates for all joints in Dutch Warmbloods (van Grevenhof et al. 2009a)]. A large proportion of this variation is attributable to differences in the type and number of anatomical locations examined, differences in the specific manifestation of the disease considered, and breed differences (Philipsson et al. 1993; Pieramati et al. 2003; van Grevenhof et al. 2009a; Wittwer et al. 2006). A recent prevalence estimate of 25% for the Thoroughbred (Lepeule et al. 2009) appears typical. This relatively high disease prevalence, along with the likely contribution of OC to the predominance of lameness as a cause of wastage in young horses (Olivier et al. 1997; Rossdale et al. 1985), makes OC a high priority for study.

Whilst there exists both experimental and anecdotal evidence of a genetic component to OC, the aetiopathogenesis of the disease is not fully understood (Ytrehus et al. 2007). The disease is considered multifactorial in origin, with at least some evidence of both environmental factors, e.g., nutrition, and physiological factors, e.g., growth and body size, endocrine factors, and conformation, which may themselves be mediated through genetics, playing a role in the condition (Lepeule et al. 2009; McIlwraith 2004; van Weeren et al. 1999). Low to moderate estimates of heritability for OC across a range of breeds and disease manifestations (Philipsson et al. 1993; Pieramati et al. 2003; Schougaard et al. 1990; van Grevenhof et al. 2009b; Wittwer et al. 2007a) together with between-breed differences in prevalence (Lepeule et al. 2009) indicate that genetic variability exists in disease susceptibility. Typical values for OC scored as a single binary trait (all joints combined) are 0.10–0.20 (Pieramati et al. 2003; Wittwer et al. 2007a), but heritability estimates of up to 0.5 have been reported for individual joints (Grøndahl and Dolvik 1993).

The search for markers to explain the proposed genetic variance in susceptibility to OC began several years ago, with the intention of both enhancing our understanding of the condition and enabling marker-assisted selection. Early studies using primarily linkage-based analyses (dependent on family data) to detect regions of the genome associated with OC in the horse have identified several putative quantitative trait loci (QTL) (Dierks et al. 2007; Wittwer et al. 2007b). As is typical for QTL discovered using this approach, their effects are generally large but their locations are imprecise. Whilst several of these QTL have undergone further refinement, very few have been validated in independent data sets. Similar studies in pigs have revealed few (Andersson-Eklund et al. 2000) or no (Lee et al. 2003) QTL for OC. These results illustrate the difficulty in identifying truly associated regions for complex traits using linkage analysis.

The opportunity for QTL studies in horses has recently been advanced by the publication of the equine genome sequence (Wade et al. 2009) together with the release of the Illumina Equine SNP50 BeadChip, which has allowed the implementation of genome-wide association studies (GWAS). In contrast to linkage analysis, GWAS rely on samples of individuals, which may be unrelated, genotyped at medium to high density. It is expected that this approach will allow the identification of common variants that could not be found using the traditional linkage-based approach (Iles 2008). We are aware of four GWAS for OC that have been carried out in three different horse breeds to date: those of Lampe (2009) and Komm (2010) (using the same data), Teyssèdre et al. (2010), and Lykkjen et al. (2010). The number of QTL identified per study ranges from 4 (Lykkjen et al. 2010) to 18 (Lampe 2009), with the range likely at least partly attributable to differences in significance thresholds used and to differing phenotype definitions. A single putative correspondence between QTL has been described (Lykkjen et al. 2010).

Our study demonstrates the use of clinical observations as a source of data for use in genomic studies and is the first QTL mapping study for OC to be conducted in the Thoroughbred. A GWAS was performed on 348 samples using the Illumina Equine SNP50 BeadChip to identify loci associated with OCD in the Thoroughbred. In addition, QTL for OC previously identified in a Hanoverian Warmblood (HWB) population were tested for their effect in the current data set.

Materials and methods

Sample collection

Blood samples were collected over 2 years (2007 and 2008) from 348 Thoroughbreds (159 males, 189 females) classified either as cases (169) or controls (179) for OC. Horses were admitted for surgery to the Rood and Riddle Equine Hospital, Lexington, Kentucky, at age 9–12 months. Horses originated from one of 19 surrounding horse farms. The number of horses per farm ranged from 2 to 89, with approximately equal numbers of cases and controls sourced from each farm (Fig. 1). Management of the horses, including feeding, housing, and exercise levels, was expected to vary by farm. Due to the anonymity of samples, pedigree details for the horses were not available, but the sample was expected to comprise a mixture of half-sibs (by sire and dam since data were collected across 2 years) and more distantly related horses.
https://static-content.springer.com/image/art%3A10.1007%2Fs00335-011-9363-1/MediaObjects/335_2011_9363_Fig1_HTML.gif
Fig. 1 Distribution of cases and controls across farms

Osteochondrosis case samples (n = 169; 90 males, 79 females) consisted of horses that were diagnosed as having OC requiring surgery in at least one joint from radiographic surveys performed by referring veterinarians (see Supplementary Table 1 for further details). The diagnosis was then confirmed through repeat radiography of suspected OC-affected regions on the admission of the horses to the equine hospital. In order to be considered for surgery, cartilage and/or bone fragments separated from the articular surface would have to be present, so our cases should be considered as suffering specifically from osteochondritis dissecans (OCD). Subsequent arthroscopic surgeries were performed by L. R. Bramlage. Typical arthroscopic surgery involved the removal of all fragments and the debridement of any separated articular cartilage and defective bone (McIlwraith 2002). Horses were affected in at least one of the following joints: fetlock (24.9%), hock (56.2%), stifle (29.6%), and shoulder (0.6%). The total number of joints affected per horse ranged from 1 to 5.

Control samples (n = 179; 69 males, 110 females) consisted of horses that were admitted to the hospital for surgical procedures other than OC, most commonly the insertion of a transphyseal bridge to address angular limb deformities (ALD), the arthroscopic removal of osteochondral fractures of the proximal (first) phalanx in the fetlock joint (fetlock chips), and the treatment of sesamoid fractures (see Supplementary Material, Document 1, for further details). Many case horses also underwent these procedures (Table 1). All control horses were clear from signs of OC, as determined by a full radiographic survey (as in the case horses) prior to surgery.
Table 1 Conditions other than OC for which horses were treated

| Condition | No. affected: cases | No. affected: controls | No. affected: total |
|---|---|---|---|
| Angular limb deformity (ALD) | 38 | 90 | 128 |
| Fetlock chip(s) | 36 | 71 | 107 |
| Other chip(s) | 3 | 3 | 6 |
| Sesamoid fracture(s) | 8 | 23 | 31 |
| Other (bone-related) | 4 | 1 | 5 |
| Other (not bone-related) | 7 | 3 | 10 |

For further information see Supplementary Material, Document 1

Genotyping

Blood samples were collected in ethylenediaminetetraacetic acid and DNA was extracted either by Tepnel (http://www.tepnel.com/dna-extraction-service.asp) or at the Animal Health Trust using Nucleon BACC DNA extraction kits (http://www.tepnel.com/dna-extraction-kits-blood-and-cell-culture.asp). A small dilution of each sample was prepared at 70 ng/μl and submitted for genotyping to Cambridge Genomic Services (http://www.cgs.path.cam.ac.uk/services/snp-genotyping/services.html). The Illumina Equine SNP50 Genotyping BeadChip (www.illumina.com/documents/products/datasheets/datasheet_equine_snp50.pdf), which comprises 54,602 single nucleotide polymorphisms (SNPs) located across all autosomes and the X chromosome, was used. These were selected from the database of over one million SNPs (http://www.broadinstitute.org/ftp/distribution/horse_snp_release/v2/) generated during the sequencing of the horse genome (http://www.broadinstitute.org/mammals/horse). Samples for this study were genotyped alongside samples for several other studies. The full genotyped data set was inspected using the Illumina Genome Studio genotyping module, and a series of quality control metrics was used to identify poorly performing SNPs. Quality control (QC) at this stage led to the removal of 7.1% of the SNPs (n = 3,895) due to poor genotyping quality (see Supplementary Table 2 in Supporting Material). These SNPs were set to missing prior to the commencement of QC for this study.

QC for data analyses

Firstly, samples were checked for sex discrepancies (marker-based prediction of sex versus sample label) and intermediate X-chromosomal inbreeding (0.2 < F < 0.8), with exclusions being made on the basis of suspected sampling or genotyping errors. This process resulted in two exclusions due to sex discrepancy and 16 exclusions based on indeterminate sex as demonstrated by intermediate inbreeding, leaving 168 controls and 162 cases for further analysis.

For the GWAS (see below), the following thresholds were used for excluding data: minor allele frequency (MAF) (<0.05), missing genotypes per SNP (>5%), missing SNPs per sample (>5%), and differential proportions of missing SNPs between cases and controls (P < 0.05). No exclusions were made on the basis of Hardy–Weinberg equilibrium (HWE).
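The GWAS exclusion thresholds above can be sketched as a filter over a genotype matrix. This is a minimal illustration in Python with NumPy, not the software used in the study. The function name `qc_filter`, the 0/1/2 allele-count coding with NaN for missing calls, and the choice to evaluate sample and SNP missingness on the same unfiltered matrix are all assumptions. The test for differential missingness between cases and controls is not included.

```python
import numpy as np

def qc_filter(genotypes, maf_min=0.05, snp_missing_max=0.05, sample_missing_max=0.05):
    """Return boolean masks (keep_samples, keep_snps) for a genotype matrix.

    genotypes: array of shape (n_samples, n_snps) holding allele counts
    0, 1, 2, with np.nan marking a missing call (assumed coding).
    """
    g = np.asarray(genotypes, dtype=float)
    missing = np.isnan(g)

    # Fraction of missing calls per sample (row) and per SNP (column).
    sample_missing = missing.mean(axis=1)
    snp_missing = missing.mean(axis=0)

    # Allele frequency from called genotypes only, folded to the minor allele.
    # A SNP with no calls gets frequency 0 and is dropped by the MAF filter.
    called = np.maximum((~missing).sum(axis=0), 1)
    freq = np.nansum(g, axis=0) / (2.0 * called)
    maf = np.minimum(freq, 1.0 - freq)

    keep_samples = sample_missing <= sample_missing_max
    keep_snps = (snp_missing <= snp_missing_max) & (maf >= maf_min)
    return keep_samples, keep_snps
```

The defaults mirror the thresholds stated for the GWAS (MAF < 0.05, > 5% missing per SNP, > 5% missing per sample); the stricter set used for the relationship matrix could be passed as arguments instead.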

For construction of a marker-based relationship matrix (see below), a subset of markers meeting more stringent QC was chosen as recommended by Yang et al. (2011), with exclusions made as follows: MAF (<0.10), missing genotypes per SNP (>0.5%), missing genotypes per sample (>1%), and HWE (P < 0.05).

Mixed-model analysis

Binary case/control phenotypes were adjusted for fixed and random effects using the following linear mixed model in ASReml (Gilmour et al. 2009). A single categorical fixed effect was fitted, representing the division of samples into contemporary groups defined by sex and by the three most common reasons for surgery, other than OCD, listed in Table 1 [ALD, fetlock chip(s), and sesamoid fracture(s)]. This resulted in \(2^{3} \times 2 = 16\) classes in total, 11 of which contained observations in the final analysis (see Supplementary Table 3 in Supplementary Material). A single random effect, animal, was fitted, generating an individual animal model (Henderson 1975) in which the pedigree relationship matrix was replaced with a marker-based relationship matrix (G-matrix) in order to adjust for average allele sharing among sampled horses. Autosomal markers remaining after QC were used to generate the G-matrix as follows: \( f_{i,j} = \frac{1}{N}\sum_{k=1}^{N} \frac{(x_{i,k} - p_{k})(x_{j,k} - p_{k})}{p_{k}(1 - p_{k})}, \) where summation is across SNPs (\(k = 1, \ldots, N\)), \(x_{i,k}\) is the genotype of the ith horse at the kth SNP coded as 0, ½, and 1, and \(p_{k}\) is the frequency of the allele for which homozygotes are coded as 1 (Aulchenko et al. 2007). On the diagonal, \(f_{i,i} = 0.5(1 + f_{i})\), where \(f_{i}\) is the loss (or gain) of heterozygosity relative to the expectation. The relationship matrix describes the average relatedness between individuals and therefore controls for genetic stratification likely to be present in the sample. The transformation of the G-matrix into a distance matrix followed by a multidimensional scaling (MDS) analysis (Cailliez 1983; Cox and Cox 1994; R Development Core Team 2009) also allowed data to be inspected for the presence of outliers and substructure. MDS plots based on the first two principal components were considered with respect to farm of origin, sex, and contemporary group. Following the implementation of the mixed model, a vector of approximately normally distributed residual errors replaced our binary (0, 1) observation as the phenotype for testing in the GWAS.
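The marker-based relationship matrix described above (products of allele-frequency-centred genotypes, scaled by p(1 − p) and averaged over SNPs) can be sketched in Python with NumPy as follows. This is an illustration under stated assumptions rather than the implementation used in the study: the function name `genomic_relationship` is hypothetical, allele frequencies are estimated from the sample itself, there are no missing genotypes, every SNP is polymorphic, and the diagonal term takes the heterozygosity deficit as 1 minus observed over expected heterozygosity.

```python
import numpy as np

def genomic_relationship(x):
    """Marker-based relationship matrix for genotypes coded 0, 0.5, 1.

    x: array of shape (n_horses, n_snps), no missing values, and every SNP
    polymorphic (otherwise p * (1 - p) is zero).
    """
    x = np.asarray(x, dtype=float)
    n_snps = x.shape[1]

    # With 0 / 0.5 / 1 coding, the column mean is the frequency of the
    # allele whose homozygote is coded as 1 (estimated from the sample).
    p = x.mean(axis=0)

    # f_ij = (1/N) * sum_k (x_ik - p_k)(x_jk - p_k) / (p_k (1 - p_k))
    z = (x - p) / np.sqrt(p * (1.0 - p))
    f = z @ z.T / n_snps

    # Diagonal: 0.5 * (1 + f_i), with f_i taken as the relative loss (or gain)
    # of heterozygosity. This estimator of f_i is an assumption.
    het_obs = (x == 0.5).sum(axis=1)
    het_exp = (2.0 * p * (1.0 - p)).sum()
    f_i = 1.0 - het_obs / het_exp
    np.fill_diagonal(f, 0.5 * (1.0 + f_i))
    return f
```

Under this coding an unrelated, non-inbred pair has an expected off-diagonal value of zero and a non-inbred horse has a diagonal value of 0.5, which is why the diagonal is written as 0.5(1 + f).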

Genome-wide association study

GWAS was performed in GenABEL (Aulchenko et al. 2007) using a score test for a Gaussian distributed trait and no covariates (Schaid et al. 2002). A genome-wide significance level was calculated by performing 10,000 permutations of the residual phenotypes against genotypes. Permutations were carried out within sex, and the 5% significance level empirically determined. Confirmation of the effects of SNPs found to be significant by this approach was carried out by fitting all such SNP genotypes (coded as 0, 1, 2) simultaneously as fixed effects in the original mixed model.
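The permutation procedure can be illustrated with the sketch below, written in Python with NumPy rather than GenABEL. It assumes the Gaussian score statistic without covariates takes the form n·r² (one degree of freedom) and that the genome-wide threshold is the upper 5% quantile of the per-permutation maximum statistic. The function names, the fixed random seed, and the conversion to a P value through the chi-square tail are illustrative choices, not details taken from the study.

```python
import math
import numpy as np

def score_stats(y, g):
    """1-df score statistic n * r^2 for each SNP column of g against phenotype y."""
    yc = y - y.mean()
    gc = g - g.mean(axis=0)
    num = (gc.T @ yc) ** 2
    den = (gc ** 2).sum(axis=0) * (yc ** 2).sum()
    return len(y) * num / den

def permutation_threshold(y, g, sex, n_perm=10000, alpha=0.05, seed=0):
    """Empirical genome-wide threshold from permutations carried out within sex.

    y: residual phenotypes (float array), g: (n_samples, n_snps) genotypes
    with no monomorphic columns, sex: array of sex labels.
    """
    rng = np.random.default_rng(seed)
    groups = [np.flatnonzero(sex == s) for s in np.unique(sex)]
    max_stats = np.empty(n_perm)
    for b in range(n_perm):
        y_perm = y.copy()
        for idx in groups:
            # Shuffle residuals among horses of the same sex only.
            y_perm[idx] = y[rng.permutation(idx)]
        max_stats[b] = score_stats(y_perm, g).max()
    chi2_cut = np.quantile(max_stats, 1.0 - alpha)
    # Tail probability of a chi-square variable with 1 degree of freedom.
    p_cut = math.erfc(math.sqrt(chi2_cut / 2.0))
    return chi2_cut, p_cut
```

Taking the maximum over all SNPs in each permutation is what makes the resulting threshold genome-wide: it controls the chance of any SNP exceeding it under the null, while preserving the correlation structure among markers.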

Testing previously published QTL

SNPs selected to represent OC QTL detected in other studies were also tested by fitting them simultaneously as fixed effects in the mixed model. The QTL regions tested were based primarily on GWAS results published in Lampe (2009) and Komm (2010). These studies were performed on samples from HWB horses, and it has been shown in a reference sample of more than 150,000 horses that the Thoroughbred contributes nearly 35% of this breed’s genes (Hamann and Distl 2008). Whilst these studies examined a range of OC phenotypes, we tested only QTL relevant to OC or OCD with fetlock and hock cases combined (see Supplementary Table 4 in Supplementary Material for a list of QTL). Where SNP names or precise SNP locations were provided, the exact SNP was fitted in the mixed model with the exception of one case where the SNP was not typed in our sample; in that case the closest SNP was fitted in the mixed model (type A in Supplementary Table 4). In cases where only an approximate location was given, i.e., to the nearest 0.1 Mb, current GWAS results for the region 1 Mb upstream and 1 Mb downstream were examined and the SNP with the smallest P value was fitted in the mixed model (type B in Supplementary Table 4). Finally, in cases where several SNPs within a region were listed as being significant, the same range was searched in the current GWAS analysis and the SNP with the smallest P value was fitted in the mixed model (types C and D in Supplementary Table 4). In order to assess their ability to enhance our model, all SNPs representing QTL were fitted simultaneously alongside the contemporary group, SNPs found to be significant in the current GWAS, and the G-matrix in the mixed model.
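The selection rule for QTL reported only to an approximate position (type B above) amounts to a window search over the current GWAS results. A minimal sketch in Python follows. The function name `best_snp_in_window` and the tuple layout of the results are hypothetical, and nothing here reflects how the lookup was actually carried out.

```python
def best_snp_in_window(results, chrom, position_bp, flank_bp=1_000_000):
    """Pick the SNP with the smallest P value within +/- flank_bp of a QTL position.

    results: iterable of (snp_name, chromosome, position_bp, p_value) tuples
    (an assumed layout). Returns the matching tuple, or None if no typed SNP
    falls inside the window.
    """
    lo, hi = position_bp - flank_bp, position_bp + flank_bp
    window = [r for r in results if r[1] == chrom and lo <= r[2] <= hi]
    return min(window, key=lambda r: r[3]) if window else None
```

The same function covers the multi-SNP regions (types C and D) if the window is instead defined by the reported start and end of the region.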

Results

Mixed-model analysis

The genomic relationship matrix was calculated based on 30,554 autosomal SNPs that passed the stringent QC thresholds. The distribution of genomic relationships between individuals in the sample is shown in Fig. 2. MDS plots revealed no obvious outliers or any genetic substructure relating to factors such as farm or contemporary group (data not shown). The fitting of the mixed model resulted in an extremely small estimated genetic variance component (\(<10^{-7}\)), making it impossible to estimate trait heritability with any precision; estimates of random animal effects (estimated additive breeding values) were correspondingly small (\(-5.8 \times 10^{-8}\) to \(6.5 \times 10^{-8}\)). Therefore, the residuals generated for testing in the association study were determined by contemporary group. The distribution of residuals can be seen in Fig. 3.
https://static-content.springer.com/image/art%3A10.1007%2Fs00335-011-9363-1/MediaObjects/335_2011_9363_Fig2_HTML.gif
Fig. 2 Distribution of genomic relationships between pairs of individuals

https://static-content.springer.com/image/art%3A10.1007%2Fs00335-011-9363-1/MediaObjects/335_2011_9363_Fig3_HTML.gif
Fig. 3 Distribution of residuals from mixed-model analysis

Genome-wide association study

Following QC, 40,180 SNPs were tested for association; the mean MAF of the remaining SNPs was 0.28 and the distribution of MAF was approximately uniform. Based on empirical genome-wide significance (\(P < 2.91 \times 10^{-6}\)), a single SNP was found to be significantly associated with OCD as tested using residuals from the mixed model. This was SNP BIEC2-799865, located at 88,493,417 bp on ECA3. This SNP has alleles C and T, with a MAF (T) of 0.4, and conforms to a HWE genotype distribution (see Table 2 for genotype frequencies). Figure 4 shows a Manhattan plot of SNPs on ECA3. A haplotype block analysis of the region containing BIEC2-799865 revealed somewhat erratic linkage disequilibrium (LD) structure surrounding the SNP, making the definition of an associated QTL region problematic (Fig. 5). The apparent deviation from the expectation of decreasing LD with increasing distance between markers exhibited by BIEC2-799865 and its neighbours goes some way to explaining why this SNP stands apart from surrounding SNPs in Fig. 4. With SNPs exhibiting \(r^{2}\) (Purcell 2009; Purcell et al. 2007) with BIEC2-799865 of greater than 0.10 at distances up to 10 Mb, we extended our search for other potentially associated SNPs within this range. An additional four SNPs within 10 Mb of BIEC2-799865 had P < 0.001; two of these SNPs had \(r^{2}\) of 0.45–0.55 and were within three SNPs of BIEC2-799865 (Fig. 5), with the remainder being more than 4 Mb away and having \(r^{2} < 0.10\). All four SNPs were located to the right of BIEC2-799865.
Table 2 Genotype frequencies of BIEC2-799865 and results of \(\chi^{2}\) tests for association with OCD

| Group | Genotype frequency C/C | Genotype frequency C/T | Genotype frequency T/T | Total no. samples | P value from \(\chi^{2}\) test (a) |
|---|---|---|---|---|---|
| Controls | 0.26 | 0.55 | 0.19 | 168 | |
| Cases (b) | 0.44 | 0.46 | 0.10 | 162 | 0.002 |
| Hock cases | 0.42 | 0.46 | 0.12 | 89 | 0.034 |
| Stifle cases | 0.48 | 0.44 | 0.08 | 50 | 0.008 |
| Fetlock cases | 0.44 | 0.46 | 0.10 | 41 | 0.062 |

(a) The \(\chi^{2}\) tests compare each case category with the controls

(b) The number of cases is not equal to the sum of the cases in each joint location because some horses were affected in multiple joint locations
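As a consistency check on Table 2, the T-allele frequency implied by the reported genotype frequencies can be recomputed and compared with the MAF of 0.4 quoted in the text. The short Python sketch below uses only the rounded frequencies and sample sizes from the table; the function name `allele_freq_t` is illustrative, and because the inputs are rounded the result is approximate.

```python
def allele_freq_t(cc, ct, tt):
    """Frequency of the T allele from genotype frequencies (C/C, C/T, T/T)."""
    return (tt + 0.5 * ct) / (cc + ct + tt)

# Rounded genotype frequencies taken from Table 2.
controls = allele_freq_t(0.26, 0.55, 0.19)
cases = allele_freq_t(0.44, 0.46, 0.10)

# Pool the two groups, weighting by the number of samples in each.
pooled = (168 * controls + 162 * cases) / (168 + 162)
print(round(controls, 3), round(cases, 3), round(pooled, 3))
```

The T allele comes out at roughly 0.47 in controls and 0.33 in cases, with a pooled value close to 0.40, which agrees with the reported MAF and with the T allele being less common among cases.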

https://static-content.springer.com/image/art%3A10.1007%2Fs00335-011-9363-1/MediaObjects/335_2011_9363_Fig4_HTML.gif
Fig. 4 A Manhattan plot showing association results for ECA3. The solid horizontal line represents the genome-wide significance level and the dashed line represents the significance level used to identify surrounding SNPs with possible relevance

https://static-content.springer.com/image/art%3A10.1007%2Fs00335-011-9363-1/MediaObjects/335_2011_9363_Fig5_HTML.gif
Fig. 5 LD plot (Barrett et al. 2005) of ECA3 region 1 Mb either side of BIEC2-799865 (solid line, black circle). Haplotype blocks were derived using the default algorithm in Haploview (Gabriel et al. 2002). SNPs within the UGDH gene are indicated by a white circle. SNPs with P < 0.001 in the GWAS are indicated by a dashed line

Fitting BIEC2-799865 as an additional covariate in the mixed model resulted in an estimated additive effect of −0.16 (±0.03), i.e., for every T allele an individual carries at the locus, that individual’s probability of OCD is decreased by 0.16. This allows us to make a crude estimate of the contribution of this SNP to the overall phenotypic variance. Under the assumption of no dominance or interaction effects and using \(V_{A} = 2p(1 - p)\alpha^{2}\) (Falconer and Mackay 1996), where p is the allele frequency at the locus and α is the estimated SNP effect, BIEC2-799865 explains ~5% of the variance of OCD. The effect of BIEC2-799865 remained significant, even when contemporary group was removed from the mixed model. Fitting the additional four SNPs with P < 0.001 alongside contemporary group and BIEC2-799865 resulted in both BIEC2-799865 and one of the more distant SNPs (BIEC2-802230) having regression coefficients significantly different from zero.
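The ~5% figure can be reproduced approximately from the numbers given in the text, using the standard single-locus expression for additive variance, 2p(1 − p)α². The Python sketch below is a rough check only. It assumes the phenotypic variance of the binary trait is q(1 − q), with q the proportion of cases among the 330 horses retained after QC; the denominator actually used is not stated in the text.

```python
def additive_variance(p, alpha):
    """V_A = 2 p (1 - p) alpha^2 for a biallelic locus with no dominance."""
    return 2.0 * p * (1.0 - p) * alpha ** 2

p_t = 0.40      # frequency of the T allele (MAF reported in the text)
alpha = -0.16   # estimated additive effect per T allele
v_a = additive_variance(p_t, alpha)

# Assumption: phenotypic variance of the 0/1 trait approximated by q (1 - q),
# where q is the proportion of cases among horses passing QC (162 of 330).
q = 162 / 330
v_p = q * (1.0 - q)
share = v_a / v_p
print(round(v_a, 4), round(share, 3))
```

With these inputs the additive variance is about 0.012 and the share of phenotypic variance is about 0.049, in line with the ~5% reported.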

Testing previously published QTL

For each of the 24 QTL regions listed in Supplementary Table 4, a representative SNP was added to the mixed model containing contemporary group, BIEC2-799865, and the random effect of animal so that all SNPs were analysed simultaneously. This analysis resulted in only 2 of the 24 SNPs having a significant association with OCD. These SNPs were BIEC2-859811 on ECA4 (39,852,072), representing a QTL at 39.26 Mb (Supplementary Table 4, QTL No. 8) (Komm 2010) and BIEC2-410967 on ECA18 (36,772,271), representing a QTL between 36,408,881 and 38,738,316 (Lampe 2009) (Supplementary Table 4, QTL No. 16). BIEC2-799865 remained significant when fitted alongside the 24 QTL SNPs, albeit with a slightly reduced size of effect (−0.11).

Discussion

This GWAS in the Thoroughbred revealed a single SNP, BIEC2-799865 on ECA3, to be associated with OCD at a genome-wide level of significance when tested using the residuals from a mixed-model analysis. Population genetics theory allows us to predict that assuming the heritability for OCD is 0.15, this QTL accounts for ~34% of the genetic variation of the trait. However, effect estimates based on primary GWAS data have been shown to be upwardly biased, often to a large degree (Göring et al. 2001), and so a majority of the genetic variance underlying OCD remains to be captured. Two neighbouring SNPs showed an association with OCD that approached significance (P < 0.001); the relatively lower MAF of these SNPs (0.27 and 0.25) compared to that of BIEC2-799865 (0.4) may explain their failure to reach genome-wide significance. The lack of haplotype block structure around BIEC2-799865 means that the much sought after and characteristic GWAS peak is not observed in this case. Whilst the implication of this on the validity of the association is not clear, it does impact on our ability to precisely define a corresponding QTL region for further evaluation. Although it would have been desirable to fit haplotypes representing the associated region in our model, the low LD in the region hindered our ability to accurately infer phase. For the purposes of candidate gene discovery, we chose to examine the region 1 Mb on each side of the SNP.
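The proportion of genetic variation quoted at the start of this discussion follows from dividing the share of phenotypic variance attributed to the SNP by the assumed heritability. The worked step below uses the rounded ~5% estimate and the assumed heritability of 0.15; the symbols \(V_{\mathrm{SNP}}\), \(V_{A}\), \(V_{P}\), and \(h^{2}\) are notation introduced here for illustration.

```latex
\[
\frac{V_{\mathrm{SNP}}}{V_{A}}
  = \frac{V_{\mathrm{SNP}} / V_{P}}{V_{A} / V_{P}}
  = \frac{V_{\mathrm{SNP}} / V_{P}}{h^{2}}
  \approx \frac{0.05}{0.15}
  \approx 0.33
\]
```

With rounded inputs this gives roughly one third; the small difference from the ~34% stated above presumably reflects the use of unrounded values.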

The 2-Mb window surrounding BIEC2-799865 contained 22 labelled genes, 21 of which are described as protein-coding and one of which is labelled as a pseudogene. Whilst, according to the current annotation, BIEC2-799865 lies between genes, LOC100064680, located at 88,494,084–88,511,295 bp, contains (within an intron) BIEC2-799867, the SNP that is both adjacent to and most highly correlated with BIEC2-799865. This gene is described as being similar to the basic Kruppel-like factor, and studies in mice and C. elegans show orthologues to this gene, kruppel-like factor 3 (basic) (KLF3), to be involved in adipogenesis (Sue et al. 2008; Zhang et al. 2009). More generally, KLFs have been described as DNA-binding transcriptional regulators that play diverse roles during differentiation and development (Bieker 2001). Whilst the likely function of KLF3 does not preclude its relevance, there is no evidence of a direct role for this gene in OC. This was true of most of the genes located within the QTL region defined, with the exception of UDP-glucose dehydrogenase (UGDH). The UGDH gene (located at 87,818,121–87,843,937 bp) appears to function in the regulation of glycosaminoglycan (GAG) synthesis in cells lining the articular cartilage surface (Clarkin et al. 2011). These GAGs are involved in extracellular matrix integrity, playing a crucial role in chondrogenesis, homeostasis, and compressive resilience (Clarkin et al. 2011). A potential link between GAG and osteochondrosis has been demonstrated by the observation of differential levels of GAG in osteochondritic lesions versus healthy cartilage (Kuroki et al. 2002; Lillich et al. 1997). However, the direction of causality is not clear and several other studies have observed no significant difference (Bertone et al. 2005; de Grauw et al. 2006). Two SNPs located within introns of UGDH were not significantly associated with OCD (0.05 < P < 0.10). One of these SNPs did, however, show moderate LD (\(r^{2}\) = 0.1–0.2) with BIEC2-799865 and the two neighbouring SNPs mentioned above (Fig. 5); as before, the relatively lower MAF of this SNP (0.34) may have prevented it from appearing above the background in terms of significance. The second SNP in UGDH had a MAF of 0.06 and therefore provided little information about either association or LD. Whilst the distance of this gene from BIEC2-799865 and its relatively low LD with the SNP make one question its relevance, there are likely to be many untyped variants in this region, some of which could plausibly have stronger LD with BIEC2-799865.

Three previous GWAS for OC in the horse have also identified QTL on ECA3 (Komm 2010; Lampe 2009; Teyssèdre et al. 2010). The closest to BIEC2-799865 was presented recently in a preprint version of a study carried out in French Trotters and is located at 100–110 Mb (Teyssèdre et al. 2011). The relatively close proximity of the two QTL represents some correspondence between studies. However, with average LD at this distance (~12 Mb) being \(r^{2} < 0.02\) (Corbin et al. 2010), it is also possible that these QTL represent two different underlying genetic variants.

Adding SNPs to represent previously identified QTL to our model (which included BIEC2-799865) resulted in 2 of 24 SNPs tested having regression coefficients significantly different from zero (P < 0.05) and therefore showing the potential to enhance the fit of the model. On ECA4, BIEC2-859811 (39,852,072) had a regression coefficient of −0.102 (±0.049). Komm (2010) identified six candidate genes located between 37.1 and 44.7 Mb. On ECA18, BIEC2-410967 (36,772,271) had an estimated effect size of −0.085 (±0.042). Lampe (2009) identified three candidate genes in the vicinity of the QTL corresponding to this SNP. These apparent validations should, however, be viewed with caution since adjustments to the mixed model, e.g., the removal of BIEC2-799865, led to different QTL being significant and we were therefore unable to unambiguously confirm any of the previous QTL in the current data set.

There are several reasons for the poor correspondence between QTL studies of OC in the horse. Firstly, the QTL that have been identified to date may be false positives (McCarthy et al. 2008). Alternatively, subsequent studies may have been underpowered to detect them. In this case, such results may be due to, for example, differences in phenotypic definition or population ancestry. Ideally, replication studies should involve precisely the same allele or haplotype, the same phenotype, and the same genetic model as the original signal (McCarthy et al. 2008). In this study, by testing only the QTL regions associated with OC under the combined phenotype definition (hock and fetlock) used by Lampe (2009) and Komm (2010), the difference in phenotypic definition between the three studies was minimised.

Another reason for the lack of correspondence may be breed differences. Hamann and Distl (2008) estimated that 35% of the HWB genes came from Thoroughbred lines, but it is not known what the proportion was in the Komm (2010) and Lampe (2009) sample of 154 foals. Assuming the same QTL are controlling the genetic predisposition to OC in both breeds, differences in allele and haplotype frequencies between breeds will have an impact on the proportion of variance the QTL explain and therefore on our ability to detect them. Furthermore, with no standardised method for either reporting QTL or carrying out validation studies, the approach taken here to select SNPs for testing in the mixed model was largely subjective and we may have missed more appropriate SNPs.

Despite being one of the largest GWAS of OC in horses performed to date, the principal limitation of this study remains lack of power. This lack of power is evidenced by both the low number of genome-wide significant SNPs and the very small estimated genetic component. Whilst disappointing, our inability to estimate heritability in this sample is perhaps not surprising given the relatively large standard errors that accompany some of the heritability estimates for OC to date (Pieramati et al. 2003; Wittwer et al. 2007a). Furthermore, our findings do not necessarily rule out a nonzero heritability; rather, more data are needed to produce a reliable estimate.

The explanation for the apparent low power of this study is likely to be multifaceted. Firstly, since power is directly related to sample size, the relatively small number of horses genotyped for this study will have limited the number of identifiable QTL, as shown by power calculations of, for example, Wang et al. (2005). Secondly, phenotypic definition can play an important role in determining the power of GWAS of complex diseases. Optimal phenotypic definitions are those with strict inclusion criteria, since minimising phenotypic heterogeneity amongst cases is a useful way of increasing study power (McCarthy et al. 2008). Unfortunately, OC represents a clinically complex phenotype that affects multiple joints and predilection sites within joints and appears in a variety of different forms. Just as prevalence and heritability estimates for OC have been affected by this problem, so we can expect QTL mapping studies to be. In this study, by considering exclusively those cases with fragments present (OCD), the phenotypic heterogeneity of the cases has been reduced. We are also following the recommendation of van Grevenhof et al. (2009b) that flattened bone contours and fragments should be evaluated as statistically different disorders.
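The relationship between sample size and power can be sketched with a simple approximation for a single-SNP test on a quantitative trait, in which the non-centrality parameter is N·q²/(1 − q²) and q² is the proportion of phenotypic variance explained by the QTL. This is a deliberately simplified illustration and not the calculation of Wang et al. (2005) or of this study: the variance explained (2%) and the significance threshold (10⁻⁶) are hypothetical, the trait is treated as continuous, and the marker is assumed to be in complete linkage disequilibrium with the QTL.

```python
from statistics import NormalDist

def approx_power(n, var_explained, alpha):
    """Approximate power of a two-sided 1-df test for a QTL explaining
    `var_explained` of the phenotypic variance in a sample of size n."""
    std_normal = NormalDist()
    # Non-centrality parameter of the 1-df chi-square test statistic.
    ncp = n * var_explained / (1.0 - var_explained)
    shift = ncp ** 0.5
    z_crit = std_normal.inv_cdf(1.0 - alpha / 2.0)
    # Probability that a normal variable with mean `shift` exceeds z_crit in magnitude.
    return (1.0 - std_normal.cdf(z_crit - shift)) + std_normal.cdf(-z_crit - shift)

# 330 matches the sample size reported for this study; the larger sizes are hypothetical.
for n in (330, 1000, 5000):
    print(f"N = {n}: approximate power = {approx_power(n, 0.02, 1e-6):.3f}")
```

Under these assumptions, power is very low at a sample size of a few hundred and approaches one only with several thousand animals, which illustrates why sample size is likely to be a dominant constraint.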

Several studies to date have considered further subdivision of OC cases by joint affected, resulting in different QTL being identified for each subgroup (Dierks et al. 2007; Wittwer et al. 2007b). This is appealing given the apparent low correlation among the occurrence of lesions of OC in different body locations (Jorgensen and Andersen 2000; Jorgensen et al. 1995; van Grevenhof et al. 2009b) and the corresponding idea that OC is in fact a localised disease (Ytrehus et al. 2007). However, subdividing cases in this way represents a significant loss of power. Furthermore, testing several manifestations of the disease serves to exacerbate the already serious problem of multiple testing. For this reason and from a practical selection perspective, expressing OC as a single trait is more appealing and should enable the identification of QTL controlling more generalised factors.

In this study, model complexity due to the presence of horses suffering from conditions other than OC in our cohort may have reduced the power of our association test. The uneven representation of cases and controls across the contemporary groups describing the presence or absence of ALD, fetlock chips, and sesamoid fractures in our samples was a potential cause of bias in the sample and therefore had to be fitted in the model. In the event that none of these conditions are related to OC or have a hereditary component, our adjustment for contemporary group represents a loss of power through the reduction in the number of degrees of freedom of the model. However, in the case where one or more of these diseases has a hereditary component [of which there is some evidence (Philipsson et al. 1993; Wittwer et al. 2007a)], the exclusion of contemporary group from the model would result in severe confounding. Since the latter is by far the more serious case, we chose to fit contemporary group in the mixed model.

However, there is seemingly a trade-off to be made. Whilst the use of clinical data in this case added complexity and potentially noise to the data, it also gave us increased confidence in our phenotypic classifications of OCD. In this study, all of our cases underwent arthroscopy, the so-called gold standard of diagnosis of cartilage defects (McIlwraith 2010), and so we can be confident of high specificity. All of the controls had OC ruled out through a comprehensive radiographic survey of predilection sites, and the evaluation of radiographs by a specialist in the field (LRB) significantly reduced the chance of OC going undiagnosed.

In this GWAS we identified a SNP associated with OCD in a sample of 330 Thoroughbreds. This association requires validation in an independent data set in order to rule out the possibility that it represents a false-positive association. In the event that the SNP is validated, further fine-mapping and resequencing of the region will be needed to elucidate the causal mutation behind this association. The likely issue of poor power to detect QTL in this study illustrates the challenge faced by members of the equine genetics community in collecting and genotyping sufficiently large samples for effective GWAS to be carried out. Here we have demonstrated the potential for clinical data to be utilised as a source of samples for the future.

Acknowledgments

LJC thanks A. Tenesa and R. Pong-Wong for helpful discussions and S. Miller for help with data preparation. LJC also thanks two anonymous referees for their helpful comments. Samples from OCD cases and controls were provided by LRB. LJC, JAW, and S. C. Bishop are financially supported by the British Equestrian Federation, the Biosciences Knowledge Transfer Network, and the Biotechnology and Biological Sciences Research Council (BBSRC). S. C. Blott, JES, CS, LYF-C, MH, TDHP and the genotyping were funded by the Horserace Betting Levy Board and the Thoroughbred Breeders’ Association.

Supplementary material

Supplementary material 1: 335_2011_9363_MOESM1_ESM.doc (DOC, 34 kb)
Supplementary material 2: 335_2011_9363_MOESM2_ESM.doc (DOC, 40 kb)
Supplementary material 3: 335_2011_9363_MOESM3_ESM.doc (DOC, 29 kb)
Supplementary material 4: 335_2011_9363_MOESM4_ESM.doc (DOC, 56 kb)
Supplementary material 5: 335_2011_9363_MOESM5_ESM.doc (DOC, 52 kb)

Copyright information

© Springer Science+Business Media, LLC 2011