X-chromosomal STRs can be very helpful for the assessment of complex kinship scenarios. However, after 20 years of research and X-STR usage in forensic genetics, there is still a continuous need for high-quality population data [1]. With the present publication, we contribute the first-ever X-STR dataset for Switzerland.

We analyzed a sample of 1198 Swiss individuals (592 females, 606 males) with the Qiagen Investigator® Argus X-12 QS multiplex kit in an ISO 17025-accredited laboratory framework. DNA extractions were prepared as described in Zieger and Utz [2]. Multiplex PCR was performed in a reduced reaction volume of 12.5 μL. Capillary electrophoresis was conducted on a 3500xL Genetic Analyzer (Thermo Fisher, USA), and data interpretation was carried out with GeneMapper® ID-X v1.4 (Thermo Fisher, USA). Genotypes were exported from GeneMapper, and for quality control the export file was manually checked twice against the electropherograms. Details on the investigated population can be found in Zieger and Utz [2]. Allele frequencies and forensic parameters were calculated with StatsX [3]. Tests for Hardy–Weinberg equilibrium (HWE), based on female samples with 100,000 permutations, and the calculation of pairwise FST among potential subpopulations were performed with STRAF [4]. We checked for linkage disequilibrium with the genepop R package [5, 6] by performing an exact test with 50,000 iterations and 1000 batches. All statistics were calculated excluding eight genotypes with triallelic patterns.
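As a minimal sketch of how allele frequencies and a basic forensic parameter can be derived for an X-chromosomal locus, the following Python example counts two alleles per female and one per male, then computes the power of discrimination for females and males under Hardy–Weinberg assumptions. The function names and the toy genotypes are hypothetical, and the formulas are the standard textbook expressions; this is not a description of how StatsX [3] is implemented.

```python
from collections import Counter


def allele_frequencies(female_genotypes, male_genotypes):
    """Estimate allele frequencies at one X-STR locus by allele counting.

    female_genotypes: list of (allele, allele) tuples (two X chromosomes)
    male_genotypes:   list of single alleles (one X chromosome)
    """
    counts = Counter()
    for first, second in female_genotypes:
        counts[first] += 1
        counts[second] += 1
    for allele in male_genotypes:
        counts[allele] += 1
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def power_of_discrimination(freqs):
    """Return (PD_female, PD_male) assuming Hardy-Weinberg proportions.

    PD_male   = 1 - sum(p_i^2)
    PD_female = 1 - 2 * (sum(p_i^2))^2 + sum(p_i^4)
    """
    s2 = sum(p ** 2 for p in freqs.values())
    s4 = sum(p ** 4 for p in freqs.values())
    return 1 - 2 * s2 ** 2 + s4, 1 - s2


# Hypothetical toy data, not taken from the Swiss dataset.
females = [("13", "14"), ("14", "14"), ("13", "15")]
males = ["13", "14"]

freqs = allele_frequencies(females, males)
pd_female, pd_male = power_of_discrimination(freqs)
print(freqs, pd_female, pd_male)
```

Because males are hemizygous, they contribute a single allele each to the count, which is why the female and male discrimination powers differ for the same set of frequencies.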

Allele frequencies are listed in Table S1, haplotype frequencies in Table S2, and statistical parameters in Table S3. Despite the low p-value of 0.006 for DXS7423, none of the 12 loci deviates significantly from HWE after correcting for multiple testing of 12 loci, based on a Bonferroni-adjusted threshold of p = 0.004 (0.05/12) at the 5% significance level. Haplotype frequencies are available online as a FamLinkX [7] input file (Table S4). To use the data for calculations with FamLinkX, the .sav file can be opened as a project in FamLinkX.

We detected significant linkage disequilibrium within three of the four linkage groups. In total, six of the 12 locus combinations for which linkage disequilibrium would be expected displayed significant linkage disequilibrium after applying a Bonferroni correction with a significance threshold of 0.0008. Because the Bonferroni correction is very conservative, it is worth noting that all marginally significant tests (p < 0.01) correspond to combinations within and not between linkage groups (Table S5), which supports treating the loci within these four linkage groups as haplotypes for forensic calculations.
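The numbers in this paragraph can be reproduced with a short enumeration. The sketch below lists all pairwise locus comparisons, identifies those falling within a linkage group, and derives the Bonferroni threshold from the total number of pairs. The assignment of loci to linkage groups follows the commonly cited layout of the Argus X-12 kit and is stated here as an assumption; only some of these locus names appear in the text above.

```python
from itertools import combinations

# Assumed linkage-group layout of the 12 Argus X-12 loci (three loci per group).
LINKAGE_GROUPS = {
    "LG1": ["DXS10148", "DXS10135", "DXS8378"],
    "LG2": ["DXS7132", "DXS10079", "DXS10074"],
    "LG3": ["DXS10103", "HPRTB", "DXS10101"],
    "LG4": ["DXS10146", "DXS10134", "DXS7423"],
}

# Map every locus to its linkage group.
locus_to_group = {
    locus: group for group, loci in LINKAGE_GROUPS.items() for locus in loci
}

# All unordered locus pairs that a pairwise LD test would cover.
all_pairs = list(combinations(sorted(locus_to_group), 2))

# Pairs where both loci belong to the same linkage group.
within_pairs = [
    (a, b) for a, b in all_pairs if locus_to_group[a] == locus_to_group[b]
]

alpha = 0.05
bonferroni_threshold = alpha / len(all_pairs)

print(len(all_pairs), len(within_pairs), round(bonferroni_threshold, 4))
# prints: 66 12 0.0008
```

With 12 loci there are 66 pairwise tests, of which 12 fall within linkage groups, and 0.05/66 rounds to the threshold of 0.0008 quoted above.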

We discovered variant alleles in about 3% (n = 37) of all analyzed samples. All variants are listed in Table S6. Most of them have been described previously [8,9,10,11,12,13,14,15,16,17]. However, Table S6 also lists 12 alleles for which we could not find a reference to date. Half of them were in DXS10146, so better allele coverage by the kit manufacturer would be desirable for this marker. In addition to the frequent off-ladder alleles, several multi-allelic patterns were discovered. Most of them (6 out of 9) are in DXS10079, a locus for which duplications are frequently observed [9, 17]. In contrast to off-ladder alleles and allele duplications, obvious allele dropouts were rare, with just one partial dropout in DXS10101 and a potential dropout in DXS10146, inferred from the consistently reduced height of the remaining peak in a female sample. Dropouts have been reported previously for both of these markers [18].

We checked for potential intra-national differentiation by calculating pairwise FST values between subgroups defined geographically (for details, see Zieger and Utz [2]) based on female samples. Even though subsamples were relatively small (50 to 130 individuals), FST values are generally very low (not exceeding 0.004) and uniformly distributed, suggesting no significant degree of population stratification for this marker set (Table S7).
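To illustrate what a pairwise FST value expresses, the following sketch computes a simple Nei-type estimate, (H_T − H_S)/H_T, for two subpopulations at a single locus. It weights both subpopulations equally and applies no sample-size correction, so it is a simplification and should not be read as the estimator implemented in STRAF [4]. The allele frequencies are hypothetical.

```python
def heterozygosity(freqs):
    """Expected heterozygosity (gene diversity): 1 - sum of squared frequencies."""
    return 1.0 - sum(p * p for p in freqs.values())


def pairwise_fst(freqs_a, freqs_b):
    """Simplified single-locus FST between two equally weighted subpopulations."""
    alleles = set(freqs_a) | set(freqs_b)
    # Allele frequencies in the pooled population (unweighted mean).
    pooled = {
        a: (freqs_a.get(a, 0.0) + freqs_b.get(a, 0.0)) / 2 for a in alleles
    }
    h_s = (heterozygosity(freqs_a) + heterozygosity(freqs_b)) / 2
    h_t = heterozygosity(pooled)
    return 0.0 if h_t == 0 else (h_t - h_s) / h_t


# Hypothetical allele frequencies for two regional subsamples.
region_a = {"13": 0.30, "14": 0.50, "15": 0.20}
region_b = {"13": 0.32, "14": 0.47, "15": 0.21}

fst_ab = pairwise_fst(region_a, region_b)
print(fst_ab)
```

Nearly identical frequency profiles, as in this toy case, give values close to zero, whereas subpopulations fixed for different alleles would give a value of 1.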

The allele frequencies of the complete Swiss dataset were compared to 36 other worldwide populations [8] using multidimensional scaling (MDS) based on Nei’s genetic distance [19]. The Swiss dataset clusters very well with other European datasets, as expected (Figure S8).
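The distance measure underlying such a comparison can be sketched as follows. The example computes Nei's standard genetic distance, D = −ln(J_xy / √(J_x · J_y)), between populations described by per-locus allele frequencies and assembles a symmetric distance matrix. The data layout, function names and frequencies are hypothetical, and the exact distance variant and MDS implementation used for Figure S8 are not specified here.

```python
import math


def nei_standard_distance(pop_x, pop_y):
    """Nei's standard genetic distance between two populations.

    pop_x, pop_y: dict mapping locus -> dict mapping allele -> frequency.
    Sums over loci are used instead of means; the number of loci cancels
    in the ratio, so the result is the same.
    Raises ValueError if the populations share no alleles (J_xy = 0).
    """
    j_x = j_y = j_xy = 0.0
    for locus in set(pop_x) & set(pop_y):
        fx, fy = pop_x[locus], pop_y[locus]
        j_x += sum(p * p for p in fx.values())
        j_y += sum(p * p for p in fy.values())
        j_xy += sum(p * fy.get(allele, 0.0) for allele, p in fx.items())
    return -math.log(j_xy / math.sqrt(j_x * j_y))


def distance_matrix(populations):
    """Symmetric matrix of pairwise Nei distances, in the order of `names`."""
    names = sorted(populations)
    matrix = [
        [nei_standard_distance(populations[a], populations[b]) for b in names]
        for a in names
    ]
    return names, matrix


# Hypothetical single-locus frequencies for three populations.
populations = {
    "pop_A": {"locus1": {"13": 0.5, "14": 0.5}},
    "pop_B": {"locus1": {"13": 1.0}},
    "pop_C": {"locus1": {"13": 0.5, "14": 0.5}},
}

names, matrix = distance_matrix(populations)
print(names)
print(matrix)
```

A matrix of this kind is the typical input for multidimensional scaling, for example through scikit-learn's `MDS` with `dissimilarity="precomputed"`, which places each population in a low-dimensional plot so that populations with similar allele frequencies appear close together.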