Introduction

In World War II, nearly 3 million and about 570,000 people were killed during the Nazi and the Soviet occupation of Poland, respectively [1]. Furthermore, historians have estimated that after World War II, at least 30,000 people were killed during the Stalinist regime in Poland (1944–1956) [2]. The exact number is unknown, because both executions and burials were kept secret. Thousands of people just vanished [2].

In 2012, forensic scientists from the Pomeranian Medical University in Szczecin, in cooperation with historians from the Institute of National Remembrance, started a project called “The Polish Genetic Database of Victims of Totalitarianism” [3]. It was created as a tool for identifying victims of communist terror killed in the years 1944–1956. The project is a response to historical events.

The largest exhumation carried out under this project took place in the eastern part of Poland, in Białystok, the capital of the Podlaskie province. According to information gathered by local historians, a detention centre in Białystok city centre was the site of secret burials of victims of communism [4]. The main aim of the project was to identify the remains found in Białystok. Based on the initial hypothesis that the victims were killed in the years 1944–1956, the reference material gathered related mostly to people who disappeared after World War II. Surprisingly, except for a few graves from the post-war period, most of the burials found in Białystok indicated that the majority of the victims were probably local civilians who died during the Nazi occupation. As a result, additional families had to be found, including those who lost relatives during World War II. Unfortunately, data on what happened in the detention ward during that period are not very detailed, as the Gestapo destroyed most of their archives before leaving Białystok. It was, therefore, hard to compile a list of names of possible victims. What is known is that the people incarcerated there were both Polish underground activists and civilians detained by chance from the whole province. Inmates were called “political prisoners”, a label which, according to Nazi policy, was based on their nationality, religion, and activity against the Third Reich.

Aim

By molecular genetic testing of the human remains found on site, this research aimed to shed new light on the victims and to gain insight into what happened in the Białystok detention centre during the Nazi occupation. For this purpose, Y-chromosomal STR markers were analysed in the remains of 100 male victims.

Materials and methods

Exhumation

The detention centre in Białystok was built in 1906. Before World War II, the detainees were mostly convicted on criminal charges. From 1939 onwards, however, the detention centre no longer held only criminals. Between 1939 and 1941, it was under the authority of the NKVD, and when the Nazi occupation started in 1941, it came under the control of the Gestapo until 1944. From 1944, the detention centre was subordinated to the Polish Ministry of Public Security and was filled with people from the political opposition.

The place has changed over the years. At the turn of the 1960s, the garden of the detention ward was changed into an economic space, which was still there when the exhumation work started. The research group expected to find hidden remains of opponents of communism killed by the government just after the war. During the exhumation, a few remains of people shot in the back of the head were found [5]. They were carrying personal belongings, which indicated that the victims were indeed underground activists. However, most of the graves were mass graves containing women, men, children, and seniors. Most of the remains showed no signs of trauma on the bones, and the remains were often covered with calcium and potassium permanganate. The artefacts found in many graves were identified as basic personal belongings, indicating that the victims were most probably local civilians who died during the Nazi occupation. One of the graves provided even stronger evidence: it was an execution grave, with 24 people killed by a Nazi firing squad [5].

The field work in Białystok took 2 years and consisted of six stages. In total, over 300 remains were found in 66 graves. Of these remains, 177 were attributed to men, 61 to women, and 141 to children.

Biological material

DNA material for analysis was sampled from all 177 male remains found in the former gardens of the detention centre in Białystok. The individuals were chosen from among all the exhumed remains based on the anthropological assessment. The victims were 20–60 years old, with an average age of about 40 years. This means, according to the historical data, that they were born at the beginning of the twentieth century and lived in the region of Podlaskie province.

A team of “The Polish Genetic Database of Victims of Totalitarianism” [3] conducted the exhumations. Forensic anthropologists, supervised by geneticists, collected healthy molars from each skeleton. Most of the remains were found in mass graves and the collected biological material was classified as being highly degraded.

Preparation of teeth for extraction of DNA

The teeth were mechanically cleaned of surface deposits using a special tool from Proxxon and sterile diamond grids (Proxxon) [3]. Next, they were chemically cleaned by a 15-min wash in 15% sodium hypochlorite solution [3] and rinsed with distilled water. After that, the teeth were sterilised by UV-C irradiation for 30 min and air-dried in a laminar flow chamber.

The pre-treated teeth were placed in a cryogenic laboratory grinder 6870 Freezer/Mill® by Spex SamplePrep, chilled with liquid nitrogen, and then milled to fine powder.

DNA extraction

About 0.05 g of tooth powder was used for extraction with the PrepFiler® BTA Forensic DNA Extraction Kit (ThermoFisher Scientific, TFS) according to the manufacturer’s instructions. Each sample was extracted at least twice.

DNA quantification

The Quantifiler Trio DNA Quantification Kit (TFS) was used to assess the DNA concentrations of the extracts and to identify samples potentially compromised by co-extracted PCR inhibitors.

DNA amplification and electrophoretic separation

DNA was amplified using the Yfiler Plus PCR Amplification Kit (TFS) according to the manufacturer’s protocol. DNA input for the 25-μl reactions was within the optimum range recommended by the manufacturer and 30 thermal cycles were applied on an Applied Biosystems Veriti Thermal Cycler (TFS).

Electrophoretic sizing of the PCR products was performed on a 3500 Genetic Analyzer using 600 LIZ™ as internal size standard and the GeneMapper® ID-X software for data processing (both: TFS).

Y-haplogroup estimation

Y-haplogroup estimation was done using Nevgen [6]. This online tool uses a Bayesian allele-frequency approach to estimate the haplogroup to which a Y-STR haplotype belongs. For the estimations, 23 markers were used: DYS576, DYS389I, DYS635, DYS389II, DYS460, DYS458, DYS19, YGATAH4, DYS448, DYS391, DYS456, DYS390, DYS438, DYS392, DYS570, DYS437, DYS385a/b, DYS449, DYS393, DYS439, DYS481, and DYS533. The predictor model was based on automatic selection, and the final estimate for each sample was the haplogroup to which Nevgen assigned the highest probability.
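The general idea of an allele-frequency-based Bayesian estimate can be illustrated with a minimal sketch. It is not Nevgen's implementation: the function name, the two-locus frequency table, the uniform priors, and the frequency floor are all hypothetical and serve only to show how per-locus allele frequencies combine into a posterior probability per haplogroup.

```python
from math import exp, log

# Hypothetical allele frequencies per haplogroup (illustrative values only,
# not taken from Nevgen or from any real reference data set).
ALLELE_FREQS = {
    "R1a": {"DYS19": {15: 0.30, 16: 0.60, 17: 0.10},
            "DYS391": {10: 0.35, 11: 0.65}},
    "I1": {"DYS19": {14: 0.80, 15: 0.20},
           "DYS391": {10: 0.90, 11: 0.10}},
}
FLOOR = 0.001  # assumed frequency for alleles never observed in a haplogroup


def haplogroup_posteriors(haplotype, freqs=ALLELE_FREQS, priors=None):
    """Return P(haplogroup | haplotype), treating loci as independent."""
    groups = list(freqs)
    priors = priors or {g: 1.0 / len(groups) for g in groups}
    log_scores = {}
    for g in groups:
        score = log(priors[g])
        for locus, allele in haplotype.items():
            # Untyped loci (None) and loci without reference data are skipped.
            if allele is None or locus not in freqs[g]:
                continue
            score += log(freqs[g][locus].get(allele, FLOOR))
        log_scores[g] = score
    # Normalise in log space to avoid numerical underflow with many loci.
    top = max(log_scores.values())
    weights = {g: exp(s - top) for g, s in log_scores.items()}
    total = sum(weights.values())
    return {g: w / total for g, w in weights.items()}


posteriors = haplogroup_posteriors({"DYS19": 16, "DYS391": 11})
best = max(posteriors, key=posteriors.get)
```

In this sketch the reported estimate would be the haplogroup with the highest posterior, which mirrors the selection rule described above.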

Biogeographical background of Y-STR profiles

In an attempt to shed light on the spatial distribution of the Y-STR profiles found in this research, we queried the Y-HRD database, which holds worldwide information on Y-chromosomal variation at the level of Y-STR haplotypes and corresponding haplogroups. In its current version (release 57), the Y-HRD holds five times more Yfiler than Yfiler Plus profiles. The 165,259 Yfiler haplotypes come from 118 national databases and 4683 of them are from Poland. For database queries, we, therefore, trimmed the 27-locus Yfiler Plus profiles to the 17-locus Yfiler haplotypes. Furthermore, to achieve reliable results, only full or 16-locus Yfiler profiles were considered for addressing the biogeographical background of our Y-STR data.
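The trimming and filtering step can be sketched as follows. The locus list is the 17-locus Yfiler set, with DYS385a/b stored as two separate entries; that storage choice, the function names, and the dictionary layout of a profile are assumptions made for illustration, not the procedure actually used for the database queries.

```python
# The 17 Yfiler loci (DYS385a/b counted as two entries, an assumption
# about how profiles are stored in this sketch).
YFILER_LOCI = (
    "DYS456", "DYS389I", "DYS390", "DYS389II", "DYS458", "DYS19",
    "DYS385a", "DYS385b", "DYS393", "DYS391", "DYS439", "DYS635",
    "DYS392", "YGATAH4", "DYS437", "DYS438", "DYS448",
)


def trim_to_yfiler(profile):
    """Keep only the Yfiler loci; loci absent from the profile become None."""
    return {locus: profile.get(locus) for locus in YFILER_LOCI}


def is_queryable(yfiler_profile, max_missing=1):
    """Accept full profiles or profiles with at most one untyped locus."""
    missing = sum(1 for allele in yfiler_profile.values() if allele is None)
    return missing <= max_missing


# Hypothetical Yfiler Plus profile: every Yfiler locus typed as 10,
# plus one locus (DYS576) that is not part of the Yfiler set.
example = {locus: 10 for locus in YFILER_LOCI}
example["DYS576"] = 18
trimmed = trim_to_yfiler(example)
usable = is_queryable(trimmed)
```

Setting `max_missing=1` corresponds to the rule that only full or 16-locus Yfiler profiles are considered.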

Results

Of the 177 studied individuals described as males by the anthropological assessment, the Y-chromosomal analysis failed for 77 (due to the high degradation of bone material or incorrect sex estimation). For the remaining 100 individuals, satisfactory Y-STR data were obtained. These samples were used for further analyses. Seventy samples yielded data for ≥ 20 Y-STRs, whereas allele calls for 16–19 loci were obtained for the other 30 specimens. The haplotypes presented in Table 1 represent consensus profiles based on multiple amplifications.
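One common way to build a consensus profile from replicate amplifications is to accept an allele only if it is reproduced. The sketch below assumes such a rule (an allele must appear in at least two replicates, and conflicting reproducible calls leave the locus untyped); the text does not state the exact rule applied, so the threshold and the function names are hypothetical.

```python
from collections import Counter


def consensus_profile(replicates, min_support=2):
    """Combine replicate profiles; keep an allele only if it is reproduced."""
    loci = sorted({locus for rep in replicates for locus in rep})
    consensus = {}
    for locus in loci:
        # Count each allele call across replicates, ignoring dropouts (None).
        calls = Counter(
            rep[locus] for rep in replicates if rep.get(locus) is not None
        )
        supported = [a for a, n in calls.items() if n >= min_support]
        # Exactly one reproducible allele is required; otherwise leave untyped.
        consensus[locus] = supported[0] if len(supported) == 1 else None
    return consensus


def typed_loci(profile):
    """Number of loci with an accepted allele call."""
    return sum(1 for allele in profile.values() if allele is not None)


# Hypothetical replicate results for one sample (three loci only).
replicates = [
    {"DYS19": 16, "DYS391": 11, "DYS390": 25},
    {"DYS19": 16, "DYS391": 10, "DYS390": None},
    {"DYS19": 16, "DYS391": 11},
]
profile = consensus_profile(replicates)
```

Counting the typed loci of each consensus profile then allows samples to be sorted into the categories used above (≥ 20 loci, 16–19 loci, or failed).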

Table 1 The haplotypes based on 23 studied Y-STRs

For 67 individuals, full Yfiler haplotypes (or 1 marker missing) were obtained and queried against the Y-HRD database. Samples not producing direct matches were subjected to one-step neighbour analysis. Results are summarised in Tables 2 and 3.
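A one-step neighbour search can be sketched as follows, under the assumption that a neighbour is a haplotype differing from the query by exactly one repeat unit at exactly one locus. The function names, the handling of untyped loci, and the example database are hypothetical and do not reproduce the Y-HRD implementation.

```python
def is_one_step_neighbour(query, candidate):
    """True if the haplotypes differ by one repeat at exactly one locus."""
    differences = []
    for locus, q_allele in query.items():
        c_allele = candidate.get(locus)
        # Loci untyped in either haplotype are skipped (an assumption).
        if q_allele is None or c_allele is None:
            continue
        if q_allele != c_allele:
            differences.append(abs(q_allele - c_allele))
    return differences == [1]


def one_step_neighbours(query, database):
    """Return the names of all database haplotypes one step from the query."""
    return [
        name for name, haplotype in database.items()
        if is_one_step_neighbour(query, haplotype)
    ]


# Hypothetical three-locus example.
query = {"DYS19": 16, "DYS391": 11, "DYS390": 25}
database = {
    "exact": {"DYS19": 16, "DYS391": 11, "DYS390": 25},
    "one_step": {"DYS19": 16, "DYS391": 10, "DYS390": 25},
    "two_step": {"DYS19": 16, "DYS391": 11, "DYS390": 27},
    "two_loci": {"DYS19": 15, "DYS391": 10, "DYS390": 25},
}
neighbours = one_step_neighbours(query, database)
```

Exact matches are deliberately excluded here, since the neighbour analysis was applied only to samples without direct matches.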

Table 2 The results of Y-HRD analysis for the samples showing matches in the YFiler database
Table 3 The results of Y-HRD analysis for the samples with no matches in the YFiler database

Among the 27 haplotypes producing direct matches in the Y-HRD database, 14 showed matches in the Polish database. Over 70% (ten samples) of those were estimated by Nevgen as haplogroup R1a, two haplotypes as haplogroup I1, one as I2, and one as H-M82. For the remaining 40 haplotypes with no direct matches, one-step neighbour analysis produced matches in Poland for 11 haplotypes.

A subset of up to 23 Y-STRs was used for Nevgen haplogroup estimation on the 100 samples that amplified at 16 or more of the 27 Yfiler Plus loci. For 95 individuals, the probability of the estimation was nominally 100%. For the remaining five individuals, it ranged between 60% (haplogroups estimated as J-Z7671, E-M123 and G-M342) and 80% (G-U1 and J-Z7671). Results are presented in Fig. 1.

Fig. 1 Nevgen haplogroup estimates

Almost all of the individuals were found in mass graves containing men, women, and children. In 18 graves, more than one male individual was analysed. The distribution of estimated haplogroups in the studied mass graves is presented in Fig. 2.

Fig. 2 The distribution of estimated haplogroups among studied mass graves
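A per-grave tabulation of this kind can be produced with a simple grouping step. The sketch below uses hypothetical grave identifiers and haplogroup assignments, not the study data, and the function names are illustrative.

```python
from collections import Counter, defaultdict


def haplogroups_by_grave(records):
    """Count estimated haplogroups per grave from (grave_id, haplogroup) pairs."""
    table = defaultdict(Counter)
    for grave_id, haplogroup in records:
        table[grave_id][haplogroup] += 1
    return table


def graves_with_mixed_haplogroups(table):
    """Graves in which more than one distinct haplogroup was estimated."""
    return sorted(g for g, counts in table.items() if len(counts) > 1)


# Hypothetical records: (grave identifier, estimated haplogroup).
records = [
    ("grave_A", "R1a"), ("grave_A", "R1a"), ("grave_A", "I1"),
    ("grave_B", "N"),
    ("grave_C", "R1b"), ("grave_C", "H-M82"),
]
table = haplogroups_by_grave(records)
mixed = graves_with_mixed_haplogroups(table)
```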

Discussion

Analysis of markers on the MSY (the male-specific region of the human Y chromosome) [7] facilitates reconstruction of paternal lineages. The MSY is passed down clonally and contains a plethora of single nucleotide variant markers as well as short tandem repeat loci, making it the largest pool of human genetic markers that are inherited in the form of a single haplotype. Mutational events in the germline result in groups of individuals carrying similar Y-chromosomal haplotypes, which in turn can be compiled into specific haplogroups. This makes it possible to reconstruct the genealogical tree of humanity and to retrace historical migrations of male lineages, shedding some light on their possible ethnic background.

In our study, we analysed 23 Y-STR loci in the remains of 100 men exhumed at the detention centre in Białystok. In the light of historical data, those people lived in Podlaskie province and were killed in the ward between 1939 and 1956. Because archival records are lacking, data about their identity were not very detailed. The historical records gathered suggest that the Białystok detention centre was not a place of mass ethnic cleansing and that the people incarcerated there were both Polish underground activists and civilians detained by chance from the whole province.

On the basis of up to 23 Y-STRs, the main Y-chromosomal haplogroup estimated for the studied individuals was R (50%, Fig. 1), which is known to be found among around half of the European populations [8, 9]. The largest share of the studied individuals (41%) was assigned to R1a, often called the “Slavic” haplogroup. Its high frequency in Eastern Europe has been confirmed by different research groups [10,11,12,13,14]. Haplogroup R1b is known to be more prevalent in Western Europe [11, 15]. According to the study by Battaglia et al. [16], haplogroup R1a was found among Polish samples at a frequency of around 56% and haplogroup R1b at around 18%. Other authors [11] reported similar observations, mentioning haplogroup R as the most common in Poland and Podlaskie province. The next most frequent haplogroups were I (25%) and N (9%, Fig. 1), which are known to be common in European populations including Poland [11, 16].

Pepiński et al. [12] also studied male samples from Podlaskie. On the basis of 186 haplotypes comprising 12 Y-STRs, these authors found no statistically significant difference between the population of Podlaskie and other Polish populations.

Three of the studied individuals were assigned to haplogroup E, one being suggested as E-V13 and two as E-M123 (Fig. 1). Battaglia and colleagues [16] did not observe haplogroup E-M123 in the Polish sample, although they reported it for other populations analysed in the same paper. This finding is in line with data published by Cruciani et al. [17] and Semino et al. [18], who also did not observe E-M123 Y chromosomes among Polish samples. Notably, some studies show that E-M123 is the most common E sub-haplogroup found among Ashkenazi Jews [19, 20].

Five of our studied individuals were placed by Nevgen within haplogroup J, which was found in Polish population samples by different studies [11, 16, 18]. Battaglia et al. [16] observed the J-M241 subclade, which did not occur among our samples. Studied individuals were estimated as J-Z387 and J-Z7671, both being J2a branches. Furthermore, Battaglia et al. [16] also reported the presence of J1 Y chromosomes (J-M267), a finding not mentioned by the other studies [11, 18]. Our N individuals from J1 were assigned to J-P58 (Fig. 1), which is called the “Semitic” branch. According to various research groups, this particular haplogroup is found almost exclusively among Ashkenazi Jews [19, 20].

Among the studied remains, 6% were estimated as H-M82. Pamjav et al. [21] studied a group of Roma people from Hungary and found that H-M82 is the most frequent haplogroup among them. This finding was confirmed by another publication on Romani samples [22]. However, haplogroup H-M82 Y chromosomes have not been reported for the general European population [23]. This also applies to previously published papers that included Polish samples [11, 16].

For our sample, the collective haplogroup G (G-M342 and G-U1) was estimated at the lowest frequency. In line with that, it was not mentioned in previous papers that included Polish samples [11, 16]. Hammer and Behar reported that haplogroup G is rather frequent among Ashkenazi Jews and that the G1 branch (which includes G-M342) is found in European populations at rather low frequencies [19, 20].

Conclusions

The preliminary results presented in this paper shed some light on the possible ethnic background of the remains exhumed in Białystok. Most of the studied males (over 80%) were suggested to be of European origin and represented haplogroups typical of the Polish population. Nevertheless, some of the individuals were assigned to Y-chromosomal haplogroups known to be very rare in Europe. They might have belonged to ethnic minorities (Jews and Roma) present in the Podlaskie province before World War II [24]. The available archives are uncertain about what happened in the detention centre during the Nazi occupation. The distribution of the haplogroups among the studied mass graves suggests that the victims were buried together irrespective of their sex, age or ethnicity. The genetic results show that none of the burials found was dedicated to a single ethnic minority. Thus, our preliminary data suggest that the garden of the detention ward in Białystok was not used to hide the bodies of victims of mass ethnic cleansing, but rather that the victims were local civilians representing multiple ethnic groups living in Podlaskie in the 1940s. The Y-STR data presented here add to the scarce body of information available on the victims found in the Białystok detention centre. Including phylogenetic analysis in the complex process led by the Polish Genetic Database of Victims of Totalitarianism may help with the final identification of hundreds of anonymous victims.