International Journal of Legal Medicine, Volume 124, Issue 6, pp 653–657

Y chromosome homogeneity in the Korean population

Short Communication

DOI: 10.1007/s00414-010-0501-1

Cite this article as:
Kim, S.H., Han, M.S., Kim, W. et al. Int J Legal Med (2010) 124: 653. doi:10.1007/s00414-010-0501-1

Abstract

The distribution of Y-chromosomal variation at 12 Y-SNP and 17 Y-STR markers was determined in six major provinces of Korea (Seoul-Gyeonggi, Gangwon, Chungcheong, Jeolla, Gyeongsang, and Jeju) to evaluate possible genetic structure and differentiation among these populations. As part of the present study, a 10-plex SNaPshot assay and two singleplex SNaPshot assays were developed. Based on the results for the 12 Y-SNP markers (M9, M45, M89, M119, M122, M174, M175, M214, RPS4Y, P31, SRY465, and 47z), 78.9% of the tested samples belonged to haplogroup O-M175 (including its subhaplogroups O3-M122: 44.3%, O2b*-SRY465: 22.5%, O2b1-47z: 8.9%), and 12.6% of the tested samples belonged to haplogroup C-RPS4Y. A total of 475 haplotypes were identified using the 17 Y-STR markers included in the Yfiler kit, among which 452 (95.2%) were individual-specific. The overall haplotype diversity for the 17 Y-STR loci was 0.9997 and the discrimination capacity was 0.9387. Pairwise genetic distances and AMOVA of the studied Korean provinces reflected no patrilineal substructure in Korea, except for Jeju Island. Thus, the present data on Korean individuals could be helpful in establishing a comprehensive forensic reference database for frequency estimation.

Keywords

Y chromosome, SNP, STR, Forensic genetics, Population structure, Koreans

Introduction

Y chromosome short tandem repeats (STRs) are powerful markers for forensic purposes (particularly in male–female DNA mixtures) and paternity testing. To estimate the match probability of a Y-STR haplotype between compared DNA samples, a Y-STR population database that takes subpopulation structure into account must first be constructed. The Y chromosome is more sensitive to genetic drift than an autosome because its effective population size is fourfold smaller. Therefore, it is necessary to verify that no population substructure exists before pooling Y-STR data from different regions of a country [1, 2]. Although studies have been undertaken to establish Korean Y-STR databases [3, 4, 5], the population substructure of Y-STR haplotypes within Korea has received little attention.

Recently, as the Y chromosome single nucleotide polymorphism (SNP) haplogroup has become a valuable tool for predicting the geographical or ethnic origin of unknown forensic evidence, there has been increasing interest in combined data on Y-SNP haplogroups and Y-STR haplotypes [6, 7, 8, 9]. Here, the combined Y-SNP haplogroups and Y-STR haplotypes of Korean males from six major provinces were analyzed to improve our understanding of the population structure of Korea.

Materials and methods

Subjects

Buccal swabs and blood samples were collected from 506 unrelated males residing in six major provinces in Korea: Seoul-Gyeonggi (YA003583, n = 110), Gangwon (YA003583, n = 63), Chungcheong (YA003588, n = 72), Jeolla (YA003584, n = 90), Gyeongsang (YA003586, n = 84), and Jeju (YA003585, n = 87) [10]. Appropriate informed consent was obtained from all subjects. Genomic DNA was extracted using the DNA IQ System (Promega, Madison, WI, USA). During the course of this study, we took part in the proficiency testing of the Collaborative Testing Service and in the trials of the Y chromosome haplotype reference database (YHRD; http://www.yhrd.org).

Y-SNP genotyping

As part of the present study, a 10-plex SNaPshot assay (M45, M89, M119, M122, M174, M175, M214, P31, SRY465, and 47z) and two singleplex SNaPshot assays (RPS4Y and M9) were developed. A phylogenetic tree defined by the 12 Y-SNP markers and the genotyping scheme are shown in Fig. S1. Electropherograms of the Y-SNP genotyping results are shown in Fig. S2. The Y-SNP haplogroups were named using the Y Chromosome Consortium nomenclature [11]. Primers for the polymerase chain reaction (PCR) and the single base extension (SBE) reaction were designed with the Primer 3.0 program. For M45, M89, and M175, previously described PCR primers were used [12].

The 10-plex PCR was performed on a GeneAmp PCR System 9700 in a 25 μl reaction volume containing 1 ng DNA, 1 × PCR buffer II (Applied Biosystems, Foster City, CA, USA), 1.5 mM MgCl2, 200 μM of each dNTP, 0.08–0.4 μM of each primer (Table S1), bovine serum albumin (0.16 mg/ml), and 1 U AmpliTaq Gold DNA polymerase. The cycling conditions were 95°C for 5 min followed by 35 cycles of amplification at 95°C for 30 s, 62°C for 30 s, and 72°C for 30 s, and a final extension at 72°C for 7 min. PCR products were purified with the QIAquick gel extraction kit (Qiagen, Valencia, CA, USA) after confirmation of the PCR amplicons by 2% agarose gel electrophoresis.
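As a rough planning aid, the cycling program stated above can be written down as data and its minimum run time added up. This is only a sketch: the variable and function names are illustrative, and ramping between temperatures is ignored, so the real instrument time will be longer.

```python
# 10-plex PCR cycling program as stated in the text: (temperature in °C, hold in s).
# Assumption: ramp times between steps are ignored.
INITIAL_DENATURATION = (95, 5 * 60)
CYCLE_STEPS = [(95, 30), (62, 30), (72, 30)]
N_CYCLES = 35
FINAL_EXTENSION = (72, 7 * 60)


def total_hold_seconds(initial, cycle_steps, n_cycles, final):
    """Sum of all programmed hold times, in seconds."""
    per_cycle = sum(seconds for _temp, seconds in cycle_steps)
    return initial[1] + n_cycles * per_cycle + final[1]


hold_s = total_hold_seconds(INITIAL_DENATURATION, CYCLE_STEPS, N_CYCLES, FINAL_EXTENSION)
print(f"Programmed hold time: {hold_s} s ({hold_s / 60:.1f} min)")
```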

Ten-plex SBE reaction was performed on a GeneAmp PCR System 9700 in 10 μl with 4 μl of purified PCR product, 5 μl of SNaPshot reaction mix (Applied Biosystems), and 1 μl of SBE primer mix (0.01–1.2 μM of each primer, Table S2). The SBE primer mix was diluted in 200 mM ammonium sulfate to minimize primer-dimer artifacts. The cycling parameters consisted of 25 cycles of 96°C for 10 s, 50°C for 5 s, and 60°C for 30 s. The unreacted fluorescent ddNTPs were removed by addition of 1 U shrimp alkaline phosphatase to the SBE product and incubation at 37°C for 45 min followed by 80°C for 15 min. Samples were analyzed by capillary electrophoresis on an ABI Prism 3100 Genetic Analyzer in which 1 μl of SBE product was mixed with 20 μl of Hi-Di formamide and 0.3 μl of LIZ120 internal size standard. Automated allele calls were made using GeneMapper ID v3.2.

To define haplogroups C-RPS4Y and K-M9, samples initially typed as haplogroup Y* (n = 64) and F* (n = 3) with the 10-plex SNaPshot assay were further analyzed with singleplex SNaPshot assays using the RPS4Y and M9 primer pairs, respectively (Fig. S1). Cycling parameters were initial denaturation at 95°C for 5 min followed by 35 cycles of amplification at 95°C for 30 s, 60°C for 30 s, and 72°C for 30 s, and a final extension at 72°C for 7 min. The conditions of the SBE reaction and the detection of RPS4Y and M9 by capillary electrophoresis were the same as in the 10-plex SNaPshot genotyping assay described above.
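The two-stage typing logic can be illustrated with a small sketch that assigns a haplogroup from the set of markers showing the derived allele. The marker tree below is an assumption reconstructed from the haplogroup names used in this paper and general Y Chromosome Consortium-style nomenclature, not a copy of Fig. S1; the function names and the `O*` label are hypothetical.

```python
# Assumed marker tree: marker -> (haplogroup label, parent marker or None).
# C and D are treated as separate roots because no marker typed here joins them.
TREE = {
    "RPS4Y": ("C", None),
    "M174": ("D", None),
    "M89": ("F*", None),
    "M9": ("K*", "M89"),
    "M45": ("P", "M9"),
    "M214": ("NO*", "M9"),
    "M175": ("O*", "M214"),
    "M119": ("O1a", "M175"),
    "P31": ("O2*", "M175"),
    "SRY465": ("O2b*", "P31"),
    "47z": ("O2b1", "SRY465"),
    "M122": ("O3", "M175"),
}


def lineage(marker):
    """Markers on the path from `marker` back to the root of its branch."""
    path = []
    while marker is not None:
        path.append(marker)
        marker = TREE[marker][1]
    return path


def assign_haplogroup(derived):
    """Label of the most derived marker; 'Y*' when no typed marker is derived."""
    derived = set(derived)
    unknown = derived - set(TREE)
    if unknown:
        raise ValueError(f"unknown markers: {sorted(unknown)}")
    if not derived:
        return "Y*"
    tip = max(derived, key=lambda m: len(lineage(m)))
    # Untyped ancestors (e.g. M9 in the 10-plex) are allowed, but every derived
    # marker must lie on the lineage of the deepest one.
    if not derived <= set(lineage(tip)):
        raise ValueError("derived markers fall on more than one lineage")
    return TREE[tip][0]


# 10-plex only (M9 and RPS4Y not typed): M89 derived alone gives F*.
print(assign_haplogroup({"M89"}))            # F*
# Adding the M9 singleplex result refines F* to K*.
print(assign_haplogroup({"M89", "M9"}))      # K*
# A sample with no derived allele in the 10-plex stays Y* until RPS4Y is typed.
print(assign_haplogroup(set()))              # Y*
print(assign_haplogroup({"RPS4Y"}))          # C
```

Under this assumed tree, a sample typed as Y* or F* by the 10-plex assay is resolved only once the corresponding singleplex result is added, which mirrors the follow-up typing described above.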

Y-STR genotyping

Y-STRs (DYS456, DYS389I, DYS390, DYS389II, DYS458, DYS19, DYS385a/b, DYS393, DYS391, DYS439, DYS635, DYS392, YGATA H4, DYS437, DYS438, and DYS448) were amplified with the AmpFlSTR Yfiler kit (Applied Biosystems) and analyzed by capillary electrophoresis using an ABI Prism 3730 Genetic Analyzer and GeneMapper ID v3.2. Alleles were named according to the recommendations of the International Society for Forensic Genetics [1].

Data analysis

Haplogroup, haplotype, and allele frequencies were estimated by direct counting. Haplogroup, haplotype, and gene diversities were calculated as previously described [13]. Discrimination capacity was determined by dividing the number of different haplogroups or haplotypes by the number of sampled individuals. Pairwise genetic distances (RST) and analysis of molecular variance (AMOVA) based on 15 Y-STR markers (excluding DYS385a/b) were calculated using Arlequin 3.11 [14].
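For orientation, the two summary statistics can be written out explicitly. The diversity formula below is the commonly used unbiased estimator and is an assumption here, since the text only refers to reference [13]; the symbols n (sampled individuals), k (distinct haplogroups or haplotypes), and p_i (relative frequency of the i-th type) are introduced for illustration.

```latex
% Assumed form of the diversity estimator (unbiased, Nei-type)
h = \frac{n}{n-1}\left(1 - \sum_{i=1}^{k} p_i^{2}\right)

% Discrimination capacity as defined in the text
DC = \frac{k}{n}
```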

Results and discussion

Genotyping of the 12 Y-SNP markers in the 506 Korean male samples revealed a total of 10 Y-SNP haplogroups. The frequency distribution of the 10 Y-SNP haplogroups in the six major provinces of Korea is listed in Table 1. The overall haplogroup diversity was 0.7276, ranging from 0.6781 in Seoul-Gyeonggi to 0.7940 in Gyeongsang.
Table 1 Distribution of Y-SNP haplogroup frequencies and forensic parameters of Y-SNP haplogroups and Y-STR haplotypes in six provinces of Korea

| Haplogroup | Seoul-Gyeonggi (n = 110) | Gangwon (n = 63) | Chungcheong (n = 72) | Jeolla (n = 90) | Gyeongsang (n = 84) | Jeju (n = 87) | Total (%) (n = 506) |
|---|---|---|---|---|---|---|---|
| C | 15 | 8 | 8 | 12 | 14 | 7 | 64 (12.6) |
| D | 1 | | 1 | 3 | 2 | 1 | 8 (1.6) |
| K* | 1 | | | 1 | 1 | | 3 (0.6) |
| NO* | 2 | 4 | 3 | 4 | 4 | 6 | 23 (4.5) |
| O1a | 1 | 1 | 1 | 1 | 2 | 5 | 11 (2.2) |
| O2* | | | | 2 | 3 | | 5 (1.0) |
| O2b* | 23 | 20 | 15 | 21 | 15 | 20 | 114 (22.5) |
| O2b1 | 8 | 5 | 7 | 7 | 10 | 8 | 45 (8.9) |
| O3 | 56 | 24 | 36 | 39 | 31 | 38 | 224 (44.3) |
| P | 3 | 1 | 1 | | 2 | 2 | 9 (1.8) |
| 12 Y-SNP haplogroup | | | | | | | |
| Haplogroup diversity | 0.6781 | 0.7389 | 0.6921 | 0.7383 | 0.7940 | 0.7412 | 0.7276 |
| Discrimination capacity (%) | 8.18 | 11.11 | 11.11 | 10.00 | 11.90 | 9.20 | 1.98 |
| 17 Y-STR haplotype | | | | | | | |
| No. of haplotypes | 108 | 62 | 71 | 88 | 84 | 79 | 475 |
| No. of unique haplotypes | 106 | 61 | 70 | 86 | 84 | 73 | 452 |
| Haplotype diversity | 0.9997 | 0.9995 | 0.9996 | 0.9995 | 1.0000 | 0.9971 | 0.9997 |
| Discrimination capacity (%) | 98.18 | 98.41 | 98.61 | 97.78 | 100.00 | 90.80 | 93.87 |
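The haplogroup-level parameters in Table 1 can be recomputed from the haplogroup counts alone. The sketch below assumes the unbiased diversity estimator h = n/(n − 1) · (1 − Σp²), which is not spelled out in the text, and treats blank cells as zero; all names are illustrative.

```python
# Haplogroup counts per province, transcribed from Table 1 (blank cells = 0).
PROVINCES = ["Seoul-Gyeonggi", "Gangwon", "Chungcheong", "Jeolla", "Gyeongsang", "Jeju"]
COUNTS = {
    "C":    [15, 8, 8, 12, 14, 7],
    "D":    [1, 0, 1, 3, 2, 1],
    "K*":   [1, 0, 0, 1, 1, 0],
    "NO*":  [2, 4, 3, 4, 4, 6],
    "O1a":  [1, 1, 1, 1, 2, 5],
    "O2*":  [0, 0, 0, 2, 3, 0],
    "O2b*": [23, 20, 15, 21, 15, 20],
    "O2b1": [8, 5, 7, 7, 10, 8],
    "O3":   [56, 24, 36, 39, 31, 38],
    "P":    [3, 1, 1, 0, 2, 2],
}


def diversity(counts):
    """Assumed unbiased estimator: n/(n-1) * (1 - sum of squared frequencies)."""
    n = sum(counts)
    return n / (n - 1) * (1 - sum((c / n) ** 2 for c in counts))


def discrimination_capacity(counts):
    """Number of observed types divided by number of individuals."""
    return sum(1 for c in counts if c > 0) / sum(counts)


results = {}
for j, province in enumerate(PROVINCES):
    column = [row[j] for row in COUNTS.values()]
    results[province] = (diversity(column), discrimination_capacity(column))

totals = [sum(row) for row in COUNTS.values()]
results["Total"] = (diversity(totals), discrimination_capacity(totals))

for name, (h, dc) in results.items():
    print(f"{name:15s} h = {h:.4f}  DC = {100 * dc:.2f}%")
```

Under these assumptions the recomputed values agree with the haplogroup diversity and discrimination capacity rows of Table 1 to the reported precision.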

Most Korean males belonged to one of the two major Asian Y-SNP haplogroups, O-M175 (78.9%) and C-RPS4Y (12.6%). Among the five subhaplogroups of O-M175 surveyed here, O3-M122, O2b*-SRY465, and O2b1-47z constituted 44.3%, 22.5%, and 8.9%, respectively. The distribution of the four most frequent Y-SNP haplogroups in the Korean provinces is shown in Fig. S3 and is compared with previously published Korean data in Table 2.
Table 2 Haplogroup frequencies (%) of the Korean population in this study compared with previously published Korean data

| Haplogroup | This study (n = 506) | Hong et al. [14] (n = 154) | Xue et al. [16] (n = 43) |
|---|---|---|---|
| O3 | 44.3 | 42.2 | 39.5 |
| O2b* | 22.5 | 14.3 | 14.0 |
| C | 12.6 | 14.9 | 16.3 |
| O2b1 | 8.9 | 5.8 | 14.0 |
| Others | 11.7 | 22.7 | 16.3 |

The Y-SNP haplogroups and Y-STR haplotypes obtained from the six major provinces of Korea were submitted to the YHRD and are searchable via the accession numbers (Table S3). A total of 475 haplotypes were identified using the 17 Y-STR markers included in the Yfiler kit, among which 452 (95.2%) were found only once. Eighteen, two, and three Y-STR haplotypes were shared by two, three, and four individuals, respectively. All provinces exhibited high haplotype diversity (>0.99), ranging from 0.9971 in Jeju Island to 1.0000 in Gyeongsang (Table 1). The overall haplotype diversity was 0.9997, a value similar to that of the previous Yfiler data set of the Korean population (0.9996, n = 526) [5]. The overall discrimination capacity was 0.9387.
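The overall Y-STR figures follow directly from the haplotype-sharing counts given above. The sketch below again assumes the unbiased diversity estimator h = n/(n − 1) · (1 − Σp²), which the text does not state explicitly; variable names are illustrative.

```python
# Sharing spectrum from the text:
# {number of individuals sharing a haplotype: number of such haplotypes}
spectrum = {1: 452, 2: 18, 3: 2, 4: 3}

n_haplotypes = sum(spectrum.values())
n_individuals = sum(size * count for size, count in spectrum.items())

# Sum of squared haplotype frequencies, computed from the spectrum.
sum_p_squared = sum(count * (size / n_individuals) ** 2 for size, count in spectrum.items())

# Assumed unbiased estimator of haplotype diversity.
haplotype_diversity = n_individuals / (n_individuals - 1) * (1 - sum_p_squared)
discrimination_capacity = n_haplotypes / n_individuals
unique_fraction = spectrum[1] / n_haplotypes

print(f"haplotypes = {n_haplotypes}, individuals = {n_individuals}")
print(f"haplotype diversity = {haplotype_diversity:.4f}")
print(f"discrimination capacity = {discrimination_capacity:.4f}")
print(f"individual-specific haplotypes = {100 * unique_fraction:.1f}%")
```

With these inputs the sketch recovers 475 haplotypes in 506 individuals and reproduces the reported diversity (0.9997), discrimination capacity (0.9387), and share of individual-specific haplotypes (95.2%).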

The pairwise RST values between the six Korean provinces were non-significant (p > 0.05) except in two comparisons, Jeju vs. Seoul-Gyeonggi and Jeju vs. Gangwon (Table 3), indicating little genetic differentiation. Jeju also had the lowest haplotype diversity (0.9971) among the six provinces studied here (Table 1), probably as a result of genetic drift acting on this isolated island. AMOVA confirmed that there were no statistically significant genetic differences among the provinces (RST = 0.00169, p = 0.25119): almost all (99.83%) of the genetic variation was due to variation within the provinces, with the remaining, non-significant 0.17% due to differences among the provinces. These results indicate no patrilineal substructure in Korea, except for Jeju Island. A previous study of coancestry coefficients estimated for autosomal STR markers also did not support substantial heterogeneity among the geographic subpopulations of Korea [15]. The genetic homogeneity, which likely arose from a shared national history (of almost 5,000 years) in the relatively small territory of the Korean peninsula, seems to be the main characteristic of the population structure of Korea regarding the Y chromosome.
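The percentages quoted for the AMOVA follow from the fixation index itself. Assuming a single-level design (provinces, and individuals within provinces), with the illustrative symbols σ²_a for the among-province and σ²_w for the within-province variance component, the partition can be written as:

```latex
R_{ST} = \frac{\sigma_a^{2}}{\sigma_a^{2} + \sigma_w^{2}} = 0.00169
\quad\Longrightarrow\quad
\text{among provinces: } 100 \times 0.00169 \approx 0.17\%, \qquad
\text{within provinces: } 100 \times (1 - 0.00169) \approx 99.83\%
```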
Table 3 Pairwise RST values (below diagonal) and p values (above diagonal) between six provinces of Korea

| | Seoul-Gyeonggi | Gangwon | Chungcheong | Jeolla | Gyeongsang | Jeju |
|---|---|---|---|---|---|---|
| Seoul-Gyeonggi (n = 110) | | 0.15315 | 0.57430 | 0.98178 | 0.72716 | 0.03336* |
| Gangwon (n = 63) | 0.00613 | | 0.43421 | 0.51302 | 0.32017 | 0.02218* |
| Chungcheong (n = 72) | −0.00209 | −0.00041 | | 0.76705 | 0.29997 | 0.34700 |
| Jeolla (n = 90) | −0.00722 | −0.00170 | −0.00480 | | 0.96406 | 0.11246 |
| Gyeongsang (n = 84) | −0.00361 | 0.00188 | 0.00193 | −0.00754 | | 0.05465 |
| Jeju (n = 87) | 0.01143* | 0.01784* | 0.00122 | 0.00672 | 0.01045 | |

*p < 0.05

The three most frequent minHts (minHt 1–3), which were shared by 16, 14, and eight Korean males, respectively, belonged to haplogroup O2b*, although the most frequent haplogroup in Korea was O3 (Table 4). The next most frequent minHts (minHt 4–6), each shared by five Korean males, belonged to haplogroup O2b1 or O3. The six most frequent minHts were searched for in the YHRD database (release 34) among 87,440 minHts in a set of 659 populations. All six (minHt 1–6) matched Japanese population samples (1,846 minHts), whereas only three (minHt 3–5) matched Sino-Tibetan population samples (10,404 minHts; Table 4).
Table 4 YHRD (release 34; 89,237 MinHt) search of the six most frequent minHts of the Korean population

| MinHt | Freq. (n) | Haplogroup | Kor | Jap | Sin | Adm | Aus | E-Eu | W-Eu | Other | Total |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Matching haplotypes (n) in YHRD (a) | | | 3,706 | 1,846 | 10,404 | 10,073 | 1,855 | 11,051 | 18,755 | 29,750 | 87,440 |
| MinHt1: 16-14-29-23-10-13-13-10,18 | 16 | O2b* | 98 | 17 | | 1 | | 1 | | | 117 |
| MinHt2: 16-14-29-23-10-13-13-10,17 | 14 | O2b* | 74 | 8 | | 1 | | | | | 83 |
| MinHt3: 16-14-29-23-10-13-13-10,19 | 8 | O2b* | 81 | 7 | 2 | 1 | 1 | | 1 | | 93 |
| MinHt4: 15-14-30-22-10-13-13-10,20 | 5 | O2b1 | 22 | 56 | 1 | 2 | | | | | 81 |
| MinHt5: 14-12-28-25-10-14-12-13,19 | 5 | O3 | 27 | 2 | 17 | | | | | | 46 |
| MinHt6: 15-12-29-23-10-12-12-11,19 | 5 | O3 | 40 | 1 | | | | | | | 41 |

Kor Korean, Jap Japanese, Sin Sino-Tibetan, Adm Admixed, Aus Austronesian, E-Eu Eastern European, W-Eu Western European.

Without DYS385a/b, the three most frequent minHts in Korean males (minHt 1–3) were identical at DYS19, DYS389I, DYS389II, DYS390, DYS391, DYS392, and DYS393 (16-14-29-23-10-13-13). The six population samples studied here are included in the YHRD (uploaded to release 32).

(a) Samples in which DYS385a/b was omitted are not included in these data.
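To compare metapopulations of very different sizes, the raw match counts in Table 4 can be turned into match frequencies. The sketch below does this for the Korean, Japanese, and Sino-Tibetan columns; treating the "Matching haplotypes (n)" row as the number of database haplotypes per metapopulation is an assumption, blank cells are taken as zero, and the Korean column includes the samples from this study.

```python
# Database sizes (minHts per metapopulation) and match counts from Table 4.
DB_SIZE = {"Kor": 3706, "Jap": 1846, "Sin": 10404}
MATCHES = {
    "MinHt1": {"Kor": 98, "Jap": 17, "Sin": 0},
    "MinHt2": {"Kor": 74, "Jap": 8, "Sin": 0},
    "MinHt3": {"Kor": 81, "Jap": 7, "Sin": 2},
    "MinHt4": {"Kor": 22, "Jap": 56, "Sin": 1},
    "MinHt5": {"Kor": 27, "Jap": 2, "Sin": 17},
    "MinHt6": {"Kor": 40, "Jap": 1, "Sin": 0},
}

# Match frequency in percent for each minHt and metapopulation.
frequency = {
    ht: {pop: 100 * n / DB_SIZE[pop] for pop, n in row.items()}
    for ht, row in MATCHES.items()
}

# Number of the six minHts observed at least once in each metapopulation.
n_matched = {
    pop: sum(1 for row in MATCHES.values() if row[pop] > 0) for pop in DB_SIZE
}

for ht, row in frequency.items():
    cells = "  ".join(f"{pop} {value:5.2f}%" for pop, value in row.items())
    print(f"{ht}: {cells}")
print(n_matched)
```

Expressed this way, minHt4 is relatively more common in the Japanese samples than in the Korean ones, while the other five minHts are most frequent in the Korean samples.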

In summary, RST and AMOVA of the six Korean provinces studied here show no significant genetic differences among these provinces, indicating Y chromosome homogeneity in the Korean population, except for Jeju Island. The Y-SNP haplogroups and Y-STR haplotypes of Korean individuals could be helpful in establishing a comprehensive forensic reference database for frequency estimation.

Acknowledgments

The authors would like to thank all volunteers for providing DNA samples. They also thank K.C. Kim and N.Y. Kim for technical assistance and data analysis in this survey. The present research was supported by the research fund of Dankook University in 2009.

Supplementary material

Table S1 Y-chromosome SNPs and primer information for PCR amplification (DOCX 15 kb)
Table S2 SBE primers for the detection of the 12 Y-SNPs used in this study (DOCX 15 kb)
Table S3 List of Y-SNP haplogroups and Y-STR haplotypes of 506 Korean males (XLSX 63 kb)
Fig. S1 Phylogenetic tree defined with the 12 Y-SNP polymorphisms and scheme of the method for SNP genotyping. Gray squares represent the 10-plex PCR set (M45, M89, M119, M122, M174, M175, M214, P31, SRY465, and 47z) and black squares represent the two singleplex PCRs (RPS4Y and M9) (DOCX 30 kb)
Fig. S2 Representative electropherograms of the Y-SNP haplogroups observed in the Korean population (DOCX 238 kb)
Fig. S3 Distribution of the four most frequent Y-SNP haplogroups (O3, O2b*, C, and O2b1) in six Korean provinces (DOCX 78 kb)

Copyright information

© Springer-Verlag 2010

Authors and Affiliations

  1. DNA Analysis Division, National Institute of Scientific Investigation, Seoul, South Korea
  2. Department of Biological Sciences, Dankook University, Cheonan, South Korea
  3. School of Biological Sciences, Seoul National University, Seoul, South Korea
