Introduction

Knowledge of the bio-geographic ancestry inferred from crime-scene samples can provide investigative intelligence in the search for unknown sample donors who usually cannot be identified via conventional forensic STR profiling. DNA-based bio-geographic ancestry inference is also applied in genealogical and anthropological research. The human Y chromosome is widely studied as an evolutionary marker of patrilineal descent. A well-established Y-chromosome phylogeny is available [6] and is continuously being expanded as novel SNPs are discovered. A wealth of data has been produced on the worldwide distribution and allele frequencies of numerous Y-SNPs and the Y haplogroups they define. Here, we take advantage of existing knowledge of the Y-SNP phylogeny and worldwide Y haplogroup distribution and introduce two Y-SNP multiplex assays, based on single-base primer extension (SNaPshot™) technology, for the detection of the major worldwide Y haplogroups. In addition to well-known Y-SNPs, we have included some relatively novel Y-SNPs such as M522 [4], M526 [4], P326 [8] and M412 [9], reflecting the most recent progress in Y-chromosome research.

Materials and methods

DNA samples

A subset of DNA samples from the HapMap 3 reference panel [1], belonging to various Y haplogroups, was obtained from the Coriell Institute for Medical Research (http://www.coriell.org/).

Primer design

Primers were designed using Primer3Plus [11] with a Tm of around 60°C for PCR primers and around 55°C for extension primers. Potential interactions between primers in the same multiplex were evaluated with AutoDimer version 1.0 software [12]. To minimize allelic dropout due to primer mismatches, we avoided, as far as possible, primer-annealing sites that overlap with known Y-chromosome polymorphisms. Extension primers were varied in length through the addition of 5′ non-homologous poly(GACT) tails to ensure electrophoretic separation of the extended fragments.
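The tailing scheme can be illustrated with a short Python sketch. It is not the authors' design procedure: the primer sequences, the target lengths and the choice to trim the repeat at its 5′ end are hypothetical assumptions made only for illustration.

```python
def add_poly_gact_tail(primer: str, target_length: int) -> str:
    """Pad a primer at its 5' end with a poly(GACT) tail up to target_length."""
    tail_length = target_length - len(primer)
    if tail_length < 0:
        raise ValueError("target length is shorter than the primer itself")
    # Build enough GACT repeats, then trim at the 5' end (assumed convention)
    # so that the repeat unit adjacent to the primer is complete.
    repeats = "GACT" * (-(-tail_length // 4))  # ceiling division
    tail = repeats[len(repeats) - tail_length:]
    return tail + primer


# Hypothetical extension primers (placeholder sequences, not from Tables 1 and 2).
primers = {
    "SNP_A": "ACGTACGTACGTACGTAC",
    "SNP_B": "TTGCATGCATGCATGCAAGG",
    "SNP_C": "GGATCCATGGATCCATGGATCC",
}

# Assumed spacing: start at 24 nt and add 6 nt per primer.
tailed = {
    name: add_poly_gact_tail(seq, 24 + 6 * i)
    for i, (name, seq) in enumerate(primers.items())
}

for name, seq in tailed.items():
    print(name, len(seq), seq)
```

In practice the spacing between fragments would be tuned empirically, since electrophoretic mobility depends on the incorporated dye and on sequence composition as well as on length.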

PCR amplification

Multiplex PCR amplification was carried out in a reaction volume of 6 μL, containing 1× GeneAmp PCR Gold Buffer (Applied Biosystems, CA, USA), 4.5 mM MgCl2 (Applied Biosystems), 100 μM of each dNTP (Roche, Mannheim, Germany), 0.35 units of AmpliTaq Gold DNA polymerase (Applied Biosystems), 1–2 ng of genomic DNA template, and PCR primers (desalted; Metabion, Martinsried, Germany) in concentrations as specified in Tables 1 and 2. The reactions were performed in a Dual 384-well GeneAmp PCR System 9700 (Applied Biosystems) using the following cycling conditions: 10 min at 95°C, followed by 30 cycles of 94°C for 15 s and 60°C for 45 s, and a final extension at 60°C for 5 min. PCR products were purified by adding 2 μL ExoSAP-IT (USB Corporation, OH, USA) to 6 μL PCR product, followed by incubation at 37°C for 30 min and 80°C for 15 min.
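As a worked example of the reaction set-up arithmetic, the following Python sketch converts the final concentrations stated above into pipetting volumes. The stock concentrations and the 10% overage are assumptions chosen for illustration and are not given in the text; primers, template and water are left out because their volumes depend on Tables 1 and 2.

```python
REACTION_VOLUME_UL = 6.0  # reaction volume stated in the text

# name: (final concentration from the text, ASSUMED stock concentration),
# both expressed in the same unit.
COMPONENTS = {
    "PCR Gold Buffer (x)": (1.0, 10.0),
    "MgCl2 (mM)": (4.5, 25.0),
    "dNTPs, each (mM)": (0.1, 2.5),
}
POLYMERASE_UNITS = 0.35          # units per reaction, from the text
POLYMERASE_STOCK_U_PER_UL = 5.0  # ASSUMED stock concentration


def per_reaction_volumes() -> dict:
    """Volume (uL) of each component in a single 6 uL reaction."""
    volumes = {
        name: final * REACTION_VOLUME_UL / stock
        for name, (final, stock) in COMPONENTS.items()
    }
    volumes["AmpliTaq Gold"] = POLYMERASE_UNITS / POLYMERASE_STOCK_U_PER_UL
    return volumes


def master_mix(n_reactions: int, overage: float = 0.10) -> dict:
    """Scale per-reaction volumes to n reactions plus a pipetting overage."""
    scale = n_reactions * (1.0 + overage)
    return {name: round(v * scale, 2) for name, v in per_reaction_volumes().items()}


for name, volume in master_mix(96).items():
    print(f"{name}: {volume} uL")
```

The small polymerase volume per reaction under these assumptions shows why such low-volume reactions are normally assembled from a master mix rather than pipetted individually.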

Table 1 Genotyping details of Y-SNP multiplex 1
Table 2 Genotyping details of Y-SNP multiplex 2

Single-base extension

Multiplex single-base primer extension was carried out in a reaction volume of 6 μL, containing 1 μL SNaPshot™ Ready Reaction Mix (Applied Biosystems), 1 μL purified PCR product, and extension primers (HPLC-purified; Metabion, Martinsried, Germany) in concentrations as specified in Tables 1 and 2. The reactions were performed in a Dual 384-well GeneAmp PCR System 9700 (Applied Biosystems) using the following cycling conditions: 2 min at 96°C, followed by 25 cycles of 96°C for 10 s, 50°C for 5 s, and 60°C for 30 s. The reaction products were purified by adding 1 unit of Shrimp Alkaline Phosphatase (USB Corporation) to 6 μL of extension product, followed by incubation at 37°C for 45 min and 75°C for 15 min.

Capillary electrophoresis

The extended fragments were separated and detected by capillary electrophoresis on a 3130xl Genetic Analyzer (Applied Biosystems) using POP-7 polymer. A mixture of 1 μL purified extension product, 8.7 μL Hi-Di formamide (Applied Biosystems) and 0.3 μL GeneScan-120 LIZ internal size standard (Applied Biosystems) was run with 10 s injection time at 1.2 kV and 500 s run time at 15.0 kV. Results were analysed using GeneMapper version 3.7 software (Applied Biosystems).

Results and discussion

Two genotyping multiplex assays were developed targeting a total of 28 Y-SNPs that define the major worldwide Y-chromosome haplogroups (Fig. 1). During the course of this work, a paper was published that reported a reorganization of the deepest clades of the Y-chromosome phylogeny, one of the consequences being that marker M91 no longer defines a monophyletic haplogroup A, but rather should be placed on the stem leading to the BCDEF (also referred to as BT) clade [5]. We have incorporated this change in our tables and figures to conform with the latest Y-chromosome topology. Furthermore, we took advantage of some recently discovered Y-SNPs (P326, M526, M522 and M412) that, as far as we know, were not included in previous Y genotyping systems [e.g. 2, 3, 10]. Of these novel SNPs, P326 (also known as L298) defines a new branch that joins haplogroups L and T into a single clade now called LT [8]. M526 is located downstream of marker M9 and encompasses haplogroups K1 to K4 as well as M to S [4]; the branch defined by M526 is now referred to as haplogroup K, and the former haplogroup K (defined by M9) is now relabelled as KLT. M522 (also known as L16 or S138) defines a new node within haplogroup F that encompasses haplogroups I, J and KLT [4] and is referred to as haplogroup IJKLT. M412 (also known as L51 or S167) defines a significant subhaplogroup within haplogroup R that is most abundant in western parts of Europe [9].
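The marker placements described above can be captured in a small data structure. The following Python sketch is illustrative only: it encodes just the six markers whose positions are stated in this section, omits intermediate nodes whose defining markers are not named here (for example F between BCDEF and IJKLT, and R above R-M412), and assumes that the derived state at M91 marks the BCDEF clade. The function name and the "derived"/"ancestral" call format are hypothetical.

```python
# Partial tree: marker -> parent marker (None marks the root of this sketch).
PARENT = {
    "M91": None,     # stem leading to BCDEF (BT)
    "M522": "M91",   # IJKLT, a node within F
    "M9": "M522",    # KLT (formerly K)
    "M526": "M9",    # K (K1 to K4 and M to S)
    "P326": "M9",    # LT
    "M412": "M526",  # R-M412, a subhaplogroup within R
}

HAPLOGROUP = {
    "M91": "BCDEF",
    "M522": "IJKLT",
    "M9": "KLT",
    "M526": "K",
    "P326": "LT",
    "M412": "R-M412",
}


def infer_haplogroup(calls: dict):
    """Walk from the root to the deepest node whose marker is derived.

    calls maps marker names to "derived" or "ancestral". Returns the
    haplogroup label, or None if no marker in the sketch is derived.
    """
    children = {}
    for marker, parent in PARENT.items():
        children.setdefault(parent, []).append(marker)

    node = None
    while True:
        derived = [m for m in children.get(node, []) if calls.get(m) == "derived"]
        if len(derived) > 1:
            # Sister branches cannot both be derived in one sample.
            raise ValueError(f"conflicting derived calls: {derived}")
        if not derived:
            break
        node = derived[0]
    return HAPLOGROUP[node] if node is not None else None


sample = {"M91": "derived", "M522": "derived", "M9": "derived",
          "M526": "ancestral", "P326": "derived", "M412": "ancestral"}
print(infer_haplogroup(sample))
```

The walk stops at the first level with no derived call, so a derived call that is not supported by its upstream markers is ignored rather than reported.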

Fig. 1 Y-SNP marker phylogeny, inferred haplogroups and their geographic distributions as covered by the two Y-SNP multiplex assays introduced here. The phylogeny shown is a truncation of the entire Y-chromosome tree. Because some haplogroups have so far been observed only sporadically, their regions of occurrence are less certain and are therefore shown in parentheses.

The 28 Y-SNPs were divided into two multiplexes so as to allow a hierarchical typing strategy. Multiplex 1 covers haplogroups BCDEF, B, C, DE, D, E, F, F3, G, H, IJKLT, I, J and KLT. If a sample is found to belong to haplogroup KLT, it can subsequently be typed with multiplex 2, which covers haplogroups KLT, K, K1, K2, K3, K4, M, N, O, P, Q, R, R-M412 (also known as R1b1a2a1a), S and LT.
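The hierarchical strategy amounts to a simple decision rule, sketched below in Python. The two callables are hypothetical stand-ins for running an assay and calling a haplogroup from its electropherogram; only the haplogroup lists and the rule that KLT samples proceed to multiplex 2 are taken from the text.

```python
MULTIPLEX_1 = ("BCDEF", "B", "C", "DE", "D", "E", "F", "F3",
               "G", "H", "IJKLT", "I", "J", "KLT")
MULTIPLEX_2 = ("KLT", "K", "K1", "K2", "K3", "K4", "M", "N",
               "O", "P", "Q", "R", "R-M412", "S", "LT")


def hierarchical_typing(run_multiplex_1, run_multiplex_2):
    """Type with multiplex 1 and continue to multiplex 2 only for KLT samples.

    Each argument is a zero-argument callable (hypothetical) returning the
    haplogroup called from that assay. Returns (haplogroup, assays_run).
    """
    first = run_multiplex_1()
    if first not in MULTIPLEX_1:
        raise ValueError(f"{first!r} is not covered by multiplex 1")
    if first != "KLT":
        return first, ["multiplex 1"]

    second = run_multiplex_2()
    if second not in MULTIPLEX_2:
        raise ValueError(f"{second!r} is not covered by multiplex 2")
    return second, ["multiplex 1", "multiplex 2"]


# A sample called as E needs only the first assay.
print(hierarchical_typing(lambda: "E", lambda: "unused"))
# A sample called as KLT is refined with the second assay.
print(hierarchical_typing(lambda: "KLT", lambda: "R-M412"))
```

Passing the assays as callables keeps the second assay from being run unless the first result requires it, which mirrors the saving in laboratory effort that the two-step design is meant to provide.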

To maintain high sensitivity of the multiplexes, PCR amplicons were kept short with an average length of 103 bp (minimum, 46 bp; maximum, 178 bp). The recommended amount of template DNA for the PCR reactions is 1–2 ng, which gives satisfactory results when the DNA is of reasonable quality (Fig. 2). Although we did not further evaluate the sensitivity of the two multiplex assays, we expect that in many cases, lower amounts of template DNA will still yield informative genotypes.

Fig. 2 Typical electropherograms obtained with the two Y-SNP multiplex assays introduced here, using DNA samples belonging to a range of Y haplogroups. For each peak, the detected allele is indicated in concordance with Tables 1 and 2. As is conventional, the yellow dye is shown in black for better contrast.

The multiplexes were optimized on a Genetic Analyzer using POP-7 polymer. We have previously observed that the type of POP polymer influences both the relative electrophoretic mobilities and the peak intensities of the extended fragments. Therefore, the 5′ tail lengths and the reaction concentrations of the extension primers may need to be readjusted when a POP polymer other than the one used here is employed.

Conclusion

The multiplex assays presented here form a convenient tool for detecting the major worldwide Y haplogroups and thus give a first indication of the patrilineal bio-geographic ancestry of men, which is relevant to forensic investigation and anthropological research. Notably, for most of the haplogroups covered here, more detailed phylogenetic resolution can be obtained by genotyping additional Y-SNPs. Hence, we foresee that additional multiplex assays, targeting more downstream Y-SNPs and dedicated to the dissection of particular (sub)haplogroups, will form useful additions to the global assays presented here. For a more complete reconstruction of a person's overall bio-geographic ancestry, we recommend that Y-chromosome markers be combined with ancestry-informative markers from mitochondrial DNA and autosomal DNA, as is already achievable with efficient multiplex tools offering resolution on a continental level [e.g. 7, 13].