Introduction

Adult height is a clearly defined quantitative phenotype that is stable once growth is complete and is easily measured. It is largely controlled by genetic factors, with heritability ranging from 75 to 90% in various populations (Carmichael and McGue 1995; Silventoinen et al. 2003).

Adult height usually follows a normal distribution within a given population and sex, making it a typical example of a polygenic human quantitative trait influenced by multiple genes, each with a small effect. Numerous linkage studies have attempted to identify loci underlying adult height variation. Thompson et al. first reported a locus for adult height on chromosome 20 in Pima Indians (Thompson et al. 1995). Several other groups reported evidence of linkage with adult height in Europeans (Beck et al. 2003; Dempfle et al. 2006; Deng et al. 2002; Ellis et al. 2007; Geller et al. 2003; Hirschhorn et al. 2001; Liu et al. 2006; Mukhopadhyay et al. 2003; Mukhopadhyay and Weeks 2003; Perola et al. 2001; Perola et al. 2007; Sammalisto et al. 2005; Willemsen et al. 2004; Wiltshire et al. 2002; Xu et al. 2002). Wu et al. (2003) found evidence of linkage in four ethnic groups: White, Black, Mexican American, and Asian. Most recently, Visscher et al. reported a large-scale linkage study of 11,214 sibling pairs showing that additive genetic variance is spread across multiple chromosomes, with no evidence of large between-chromosome epistatic effects (Visscher et al. 2007). Although these studies have found evidence of linkage for adult height in several populations, no loci common to them are evident.

Because linkage studies have limited power to detect genes of modest effect, especially in the presence of genetic heterogeneity, we applied an association study with a sufficient number of subjects to identify genes with a small impact on the phenotype (Risch and Merikangas 1996). Recently, several groups have reported genes associated with adult height variation using data from genome-wide association studies (Gudbjartsson et al. 2008; Lettre et al. 2008; Sanna et al. 2008; Weedon et al. 2007, 2008). In the present study, we report the results of a genome-wide association study of adult height in 1,555 individuals from the Khalkh population of Mongolia using 23,465 microsatellite markers. The Khalkh population has a relatively close genetic affinity to populations of the northern part of East Asia and a relatively homogeneous genetic background, which is an advantage for studying complex phenotypes (Katoh et al. 2002, 2005; Nakajima et al. 2004). To improve the power to identify genetic variation underlying a quantitative trait, we applied a selective genotyping strategy in which individuals with trait values deviating from the population mean were preferentially recruited (Arking et al. 2006; Lander and Botstein 1989).

Material and methods

Study subject selection

According to epidemiological and anthropometric surveys of adult height among Khalkh-Mongolians, adult height is normally distributed in both sexes, with a mean ± standard deviation of 164.76 ± 5.74 cm for males and 153.76 ± 5.04 cm for females (Otgon et al. 2002; personal communication L. Namsrainaidan). A total of 1,555 unrelated individuals of Khalkh-Mongolian origin from the region of Ulaanbaatar, Mongolia participated in the current study. Individuals were selected from the general population: the tall group comprised those above the 95th percentile (>173.9 cm for males, >161.8 cm for females) and the short group those below the 5th percentile (<155.6 cm for males, <145.7 cm for females). The subjects in the short group were over 18 years of age and those in the tall group were over 15 years of age at the time of examination. Individuals with medical conditions affecting adult height, such as dwarfism, gigantism, and acromegaly, were excluded. The study was approved by the Institutional Review Board of Tokai University, the Medical Research Ethics Committee of the National Institute of Medicine, and the Ethics Committee of the Ministry of Health, Mongolia. The participants gave written, informed consent.
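As a rough check on the selection thresholds, the 5th and 95th percentiles implied by a normal model can be computed from the reported means and standard deviations. The sketch below uses Python's standard-library `statistics.NormalDist`; the function name `tail_cutoffs` is illustrative and not part of the study. The resulting cut-offs differ by roughly 0.3 cm from the thresholds quoted above, which presumably came from the empirical survey distribution rather than from a fitted normal curve.

```python
from statistics import NormalDist


def tail_cutoffs(mean_cm, sd_cm, tail=0.05):
    """Return (short_cutoff, tall_cutoff) in cm under a normal model.

    Assumption: height is exactly normal with the given mean and SD.
    """
    dist = NormalDist(mu=mean_cm, sigma=sd_cm)
    return dist.inv_cdf(tail), dist.inv_cdf(1.0 - tail)


# Means and standard deviations as reported for Khalkh-Mongolian adults.
male_short, male_tall = tail_cutoffs(164.76, 5.74)
female_short, female_tall = tail_cutoffs(153.76, 5.04)

print(f"male:   short < {male_short:.1f} cm, tall > {male_tall:.1f} cm")
print(f"female: short < {female_short:.1f} cm, tall > {female_tall:.1f} cm")
```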

DNA pool construction and microsatellite genotyping

The pooled DNA method for microsatellite typing was performed according to the protocol of Collins et al. (2000) with a slight modification (Oka et al. 2003). DNA was extracted using the QIAamp DNA blood kit (QIAGEN) under a standardized protocol to prevent variation in DNA quality. The DNA concentration was precisely measured using the PicoGreen fluorescence assay (Molecular Probes) as previously described (Tamiya et al. 2005; Kawashima et al. 2006). For the first-round screening, four DNA pools were prepared. The first set for the association study consisted of DNA pools of 125 male-tall, 125 male-short, 125 female-tall, and 125 female-short samples. A second set was prepared in the same way from another 125 male-tall, 125 male-short, 125 female-tall, and 125 female-short samples. In the first-round screening, 23,465 microsatellite markers were used. Markers showing statistical significance (P < 0.05) were subjected to the second-round screening.

All microsatellite markers and methods for microsatellite genotyping used in this study are described by Tamiya et al. (2005). PCR on pooled DNAs was performed in a 20-μl reaction mixture containing 48 ng of pooled DNA, 0.5 U of AmpliTaq DNA polymerase, 1× reaction buffer with 1.5 mM MgCl2 provided by the manufacturer (Applied Biosystems), 5 μM of each primer, and 0.25 mM of each deoxyribonucleotide triphosphate (dNTP) in 96-well plates. The PCR amplification was performed on the GeneAmp PCR System 9700 (Applied Biosystems) with the following conditions: 96°C for 5 min (hot start), 57°C for 1 min, and 72°C for 1 min, followed by 40 cycles of 96°C for 45 s, 57°C for 45 s, and 72°C for 1 min. For the microsatellite genotyping of individual samples, PCR was performed in a 20-μl reaction containing 1 ng of genomic DNA. The amplification conditions were the same as described above. The pooled and individual microsatellite genotyping procedures after PCR amplification were carried out according to standard protocols using an ABI3730 DNA analyzer (Applied Biosystems). Peak positions and heights were automatically extracted by the PickPeak and MultiPeaks programs.

SNP genotyping

The SNPs in candidate regions were selected from the SNP database of Applied Biosystems (http://www2.appliedbiosystems.com/) using SNPbrowser software 3.5 (Applied Biosystems). The SNPs were genotyped by TaqMan assays. The TaqMan assays were carried out using the standard protocols for the ABI PRISM 7900HT Sequence Detection System using a 384-well block module and automation accessory (Applied Biosystems).

Statistical analysis

In pooled DNA typing, associations between adult height and microsatellites were assessed by Fisher’s exact test using a 2 × 2 contingency table for each allele. Allele frequencies in pooled DNA typing were estimated from peak heights: the frequency of each allele was determined by dividing its peak height by the summed height of all alleles. In individual genotyping, significance was likewise evaluated by Fisher’s exact test using a 2 × 2 contingency table for each allele.
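The per-allele test described above can be illustrated with a minimal sketch in plain Python. It estimates allele frequencies from peak heights and then applies a two-sided Fisher's exact test to the 2 × 2 table (allele versus all other alleles, tall versus short pool). Two points are assumptions rather than details from the protocol: the conversion of pooled frequencies into counts by multiplying by the number of chromosomes in the pool (2 × 125 = 250), and all peak heights and allele names, which are invented for illustration.

```python
from math import comb


def allele_frequencies(peak_heights):
    """Frequency of each allele = its peak height / summed height of all peaks."""
    total = sum(peak_heights.values())
    return {allele: h / total for allele, h in peak_heights.items()}


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2 x 2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables no more probable than the observed one.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, p)


def allele_table(freq_tall, freq_short, n_chrom_tall=250, n_chrom_short=250):
    """Approximate 2 x 2 counts for one allele (assumed: frequency x chromosomes)."""
    a = round(freq_tall * n_chrom_tall)
    c = round(freq_short * n_chrom_short)
    return a, n_chrom_tall - a, c, n_chrom_short - c


# Hypothetical peak heights for one marker in a tall and a short pool.
tall_pool = {"226": 1200.0, "228": 3100.0, "230": 700.0}
short_pool = {"226": 1150.0, "228": 2400.0, "230": 1450.0}

f_tall = allele_frequencies(tall_pool)
f_short = allele_frequencies(short_pool)
p_values = {}
for allele in f_tall:
    table = allele_table(f_tall[allele], f_short[allele])
    p_values[allele] = fisher_exact_two_sided(*table)
    print(allele, table, f"P = {p_values[allele]:.3g}")
```

Because each marker yields one test per allele, a marker with many alleles produces many P values, which is why a per-marker correction is needed downstream.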

For SNP genotyping, adult height associations were assessed using the chi-square test (Haploview 4.0 software [http://www.broad.mit.edu/mpg/haploview/]). Since a multi-step analysis was used, the nominal P values were corrected with 1,000,000 iterated permutations for all 82 SNPs. The significance level was set at 0.05 throughout the study.
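The idea behind permutation-based correction can be sketched as follows. This is a minimal illustration, not Haploview's implementation: it assumes individual genotypes coded as minor-allele counts (0, 1, 2), shuffles the tall/short labels, and compares each SNP's observed allelic chi-square statistic with the largest statistic obtained across all SNPs in each permutation. The function names, the simulated genotypes, and the `(count + 1) / (n_perm + 1)` estimator are choices made for this example.

```python
import random
from math import erfc, sqrt


def allelic_chi2(genotypes, labels):
    """1-df allelic chi-square; genotypes are minor-allele counts (0, 1, 2)."""
    # 2 x 2 allele-count table: rows = tall (1) / short (0), cols = minor / major.
    a = sum(g for g, y in zip(genotypes, labels) if y == 1)
    c = sum(g for g, y in zip(genotypes, labels) if y == 0)
    n1 = 2 * sum(labels)
    n0 = 2 * (len(labels) - sum(labels))
    b, d = n1 - a, n0 - c
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else (n1 + n0) * (a * d - b * c) ** 2 / denom


def chi2_pvalue_1df(stat):
    """Upper-tail P value of a chi-square statistic with 1 degree of freedom."""
    return erfc(sqrt(stat / 2.0))


def permutation_corrected(snps, labels, n_perm=1000, seed=0):
    """Return (nominal P values, permutation-corrected P values) per SNP."""
    rng = random.Random(seed)
    observed = [allelic_chi2(g, labels) for g in snps]
    exceed = [0] * len(snps)
    shuffled = list(labels)
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any genotype-phenotype relationship
        best = max(allelic_chi2(g, shuffled) for g in snps)
        for i, stat in enumerate(observed):
            if best >= stat:
                exceed[i] += 1
    nominal = [chi2_pvalue_1df(s) for s in observed]
    corrected = [(e + 1) / (n_perm + 1) for e in exceed]
    return nominal, corrected


# Simulated data: 100 tall (1) and 100 short (0) individuals, 5 SNPs.
rng = random.Random(1)
labels = [1] * 100 + [0] * 100
snps = [[rng.choice((0, 1, 2)) for _ in labels] for _ in range(4)]
# The last SNP is constructed to be associated with the tall group.
snps.append([rng.choice((1, 2, 2)) if y else rng.choice((0, 0, 1)) for y in labels])

nominal, corrected = permutation_corrected(snps, labels, n_perm=200)
for p, pc in zip(nominal, corrected):
    print(f"P = {p:.3g}, Pc = {pc:.3g}")
```

Taking the maximum statistic over all SNPs in each permutation accounts for the correlation between nearby markers, which a simple Bonferroni factor would ignore.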

To assess the extent of pair-wise linkage disequilibrium (LD) between SNPs, D′ and r² were calculated according to their standard definitions using Haploview software. D′ and r² were calculated only for polymorphisms with a minor-allele frequency (MAF) > 5%. LD blocks were then defined by pair-wise LD with D′ > 0.9.
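For reference, the standard definitions of the two LD measures can be written as a short function. This sketch assumes that the haplotype frequency is already known; in practice, software such as Haploview must first estimate haplotype frequencies from unphased genotypes, a step that is not shown here. The function name and the example frequencies are illustrative.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D', r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A at locus 1 and B at locus 2
    p_a:  frequency of allele A at locus 1
    p_b:  frequency of allele B at locus 2
    """
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient D
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    var = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / var if var > 0 else 0.0
    return d_prime, r2


# Hypothetical frequencies for a pair of SNPs with MAF > 5%.
d_prime, r2 = ld_stats(p_ab=0.29, p_a=0.30, p_b=0.35)
print(f"D' = {d_prime:.2f}, r^2 = {r2:.2f}")
```

D′ reaches 1 whenever at least one of the four possible haplotypes is absent, whereas r² reaches 1 only when the two SNPs also have matching allele frequencies, so a pair can show high D′ with a noticeably lower r².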

Results and discussion

Genome-wide association study

We performed a genome-wide association study with 23,465 microsatellite markers to detect loci controlling adult height using the selective genotyping method. To reduce the cost and technical burden of genome-wide genotyping, the pooled DNA method was applied, as previously described (Collins et al. 2000; Tamiya et al. 2005). After screening with the pooled DNA method and subsequent re-genotyping of individual DNAs from the same set of 1,000 screened individuals, 23 markers showed significant differences by Fisher’s exact test (Table 1). These markers were then corrected for multiple testing according to the number of alleles, and nine microsatellites remained significant.
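One common way to correct a microsatellite P value for the number of alleles is a Bonferroni-style adjustment, in which the nominal P value is multiplied by the number of alleles tested at that marker. Whether exactly this form was used here is an assumption; the sketch below, with invented marker names and values, only illustrates the arithmetic.

```python
def correct_for_alleles(p_value, n_alleles):
    """Bonferroni-style correction: Pc = P x number of alleles, capped at 1."""
    return min(1.0, p_value * n_alleles)


# Hypothetical markers: (nominal P value, number of observed alleles).
markers = {
    "marker_A": (0.004, 8),
    "marker_B": (0.010, 12),
    "marker_C": (0.0005, 6),
}

corrected = {}
for name, (p, k) in markers.items():
    corrected[name] = correct_for_alleles(p, k)
    verdict = "significant" if corrected[name] < 0.05 else "not significant"
    print(f"{name}: P = {p}, alleles = {k}, Pc = {corrected[name]:.4f} ({verdict})")
```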

Fig. 1

SNP allelic association within 15q22.33-q23. SNP association analysis. The blue line shows P values calculated by the chi-square test. The red line shows P values generated after 1,000,000 iterated permutations. The yellow background indicates the 188 kb LD block (LDB). In the Mongolian population, we investigated the 188 kb LD block constructed by these significant markers, spanning from intron 6 of SMAD3 (rs2289791) to intron 10 of IQCH (rs12164949). The LD block contained the MH2 domain and the 3′ UTR of SMAD3, the entire coding sequence of FLJ11506, and the IQ domain of IQCH. SNP rs8038652, located in intron 1 of IQCH, was most strongly associated with adult height (P = 0.0003, Pc = 0.015). SNP rs227860, located in the 3′ UTR of SMAD3, was also associated (P = 0.0006, Pc = 0.028). SNP rs7166081 (P = 0.0004, Pc = 0.018) was in an intergenic region between SMAD3 and FLJ11506. The three remaining SNPs, rs4776908 (P = 0.0004, Pc = 0.017), rs877177 (P = 0.0004, Pc = 0.020), and rs4776906 (P = 0.0007, Pc = 0.030), located in introns 1, 5, and 5 of FLJ11506, respectively, were also associated.

Table 1 Twenty-three positive microsatellite markers from individual genotyping

Visscher et al. reported that at least six chromosomes (3, 4, 8, 15, 17, and 18) were responsible for height variation in the European population (Visscher et al. 2007). We also detected significant association on those chromosomes, except chromosome 18. In addition, five regions overlapped at least partially with loci previously reported by linkage analysis: 5q31 (Wu et al. 2003), 6q25 (Hirschhorn et al. 2001; Xu et al. 2002), 8q21.3 (Perola et al. 2007), 8q21.13 (Willemsen et al. 2004), and 21q21.1 (Hirschhorn et al. 2001). We also detected six strongly associated regions, 4q13.2, 4q31.3, 5q21.3, 7p21.1, 7q11.22, and 19q13.2, which have not been reported before. The inconsistencies among these studies may be due to population specificities and/or differences in technique.

Fine mapping by SNP

Among the nine most strongly associated markers, we selected two: D8S0285i and D15S988. D8S0285i was the most strongly associated microsatellite, located at 8q21.13, and D15S988 was adjacent to a candidate gene, SMAD3, located at 15q22.33. A total of 82 SNPs were surveyed and genotyped in 1,555 samples (the 1,000 screened samples and an additional 555 samples).

Ten SNPs at 8q21.13 showed nominal significance, among which SNP rs2220456 was the most strongly associated with height, showing empirical significance (P = 0.000016, Pc = 0.0008). These SNP associations might reflect the reported evidence of linkage (Perola et al. 2007; Willemsen et al. 2004). Since an approximately 300 kb region in the vicinity of SNP rs2220456 and D8S0285i at 8q21.13 had no coding sequence according to NCBI build 36.2, we shifted our target to the locus at 15q22.33-q23. To cover a gene-containing region, we selected two additional microsatellites, D15S0240i and D15S0028i, and 64 SNPs at 15q22.33-q23 (Fig. 1). Among these, allele 230 of D15S0240i and six SNPs retained empirical significance (Pc < 0.05), as depicted in Fig. 1. SNP rs8038652, the most strongly associated SNP, is located in intron 1 of IQCH. The six SNPs were in strong LD with each other (D′ > 0.9 and r² = 0.8). Additionally, SNP rs8038652 and allele 230 of D15S0240i were in strong LD (D′ = 0.99 and r² = 0.77).

Based on the SNP association results, SNP rs8038652 was further analyzed under different genetic models. Association analysis under a recessive model showed the lowest P value (P = 0.000046) with the AA genotype, indicating that the AA genotype of rs8038652 has an adverse effect on adult height in Mongolians (odds ratio = 0.59, confidence interval, 0.46–0.76). Additionally, a deviation from Hardy-Weinberg equilibrium (HWE; P = 0.04) was observed for SNP rs8038652 in the tall group.
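The two quantities reported in this paragraph, an odds ratio with its confidence interval under a recessive model and a test for deviation from HWE, can be computed as in the following sketch. The genotype counts are hypothetical and are not the study data; the confidence interval uses the log-based (Woolf) approximation at an assumed 95% level, and the HWE test is the usual 1-df chi-square goodness-of-fit test, neither of which is confirmed as the method used in the analysis.

```python
from math import erfc, exp, log, sqrt


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and log-based (Woolf) CI for the table [[a, b], [c, d]].

    Recessive model: a, b = AA and non-AA counts in the tall group;
                     c, d = AA and non-AA counts in the short group.
    All four cells are assumed to be non-zero.
    """
    odds_ratio = (a * d) / (b * c)
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return odds_ratio, exp(log(odds_ratio) - z * se), exp(log(odds_ratio) + z * se)


def hwe_chi2(n_aa, n_ab, n_bb):
    """Chi-square test (1 df) for deviation from Hardy-Weinberg proportions.

    Assumes both alleles are present, so no expected count is zero.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    stat = sum((o - e) ** 2 / e for o, e in zip((n_aa, n_ab, n_bb), expected))
    return stat, erfc(sqrt(stat / 2.0))


# Hypothetical genotype counts (AA, AG, GG) in each group.
tall = (120, 330, 300)
short = (190, 360, 250)

or_, lower, upper = odds_ratio_ci(tall[0], tall[1] + tall[2],
                                  short[0], short[1] + short[2])
print(f"recessive model: OR = {or_:.2f} (CI {lower:.2f} to {upper:.2f})")

stat, p_hwe = hwe_chi2(*tall)
print(f"HWE in tall group: chi-square = {stat:.2f}, P = {p_hwe:.3f}")
```

An odds ratio below 1 with a confidence interval that excludes 1 indicates that the genotype is less frequent in the tall group than in the short group, which is the sense in which a genotype has an adverse effect on height.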

In conclusion, we have identified two candidate loci for adult height at 8q21.13 and 15q22.33-q23 in Mongolians. Although the causative polymorphisms were not determined in this study, we were able to localize the genetic association with adult height to two regions. The associated region at 15q22.33-q23 contains only three genes, so functional analyses should help to elucidate the causative polymorphisms. Analysis of the remaining seven highly associated microsatellite markers should lead to the identification of new causative genes underlying adult height variation.