Parentage testing and animal identification are essential for the protection and efficient management of animal populations (Werner et al. 2004), including estimating effective population size, reducing the level of inbreeding and minimizing mating between close relatives (Quader 2005). Because of their wide availability and high polymorphic information content (PIC), microsatellite markers (STRs) were long used for this purpose (Glowatzki-Mullis et al. 1995). However, lower mutation and genotyping error rates, automated genotyping, and easier data handling and calculation have led panels of single nucleotide polymorphisms (SNPs) to displace STRs (Heaton et al. 2002; Werner et al. 2004). Recently, the International Committee for Animal Recording (ICAR) developed a cattle consensus panel of 100 SNPs for routine parentage testing (Fernández et al. 2013).

The European bison (EB, Bison bonasus) became extinct in the Białowieża Forest in 1919. The contemporary Lowland-Białowieża line was restored from seven animals preserved in zoological gardens and reintroduced into the wild (Pucek et al. 2004). The high (and still rising) inbreeding coefficient, ranging from 0.35 to 0.48 (Matuszewska et al. 2004; Wołk and Krasińska 2004), is the result of the bottleneck effect. The lack of modern parentage testing and individual identification based on genetic markers may lead to a further decrease in the genetic diversity of this species. Tokarska et al. (2009) tested the effectiveness of a routine set of cattle STRs for parentage testing, showed that these markers are too homozygous in wisent, and suggested creating a panel of the 50–60 most heterozygous loci for paternity and identity analysis. Labuschagne et al. (2015) also showed that SNP markers perform better than STRs in wild populations. Since Kamiński et al. (2012) showed that cattle SNPs genotyped on Illumina bovine microarrays could be used to differentiate the two lines of EB, we hypothesized that the bovine 777k SNP microarray provides a dataset suitable for deriving a subset of SNPs for individual identification and parentage control in the bison.

One hundred and sixty-three European bison DNA samples were genotyped using the Illumina BovineHD microarray (777,962 SNPs). SNP genotyping quality analysis, pair-wise linkage disequilibrium (LD) analysis and SNP selection were performed in Golden Helix SVS 8.3.3 (Bozeman, MT). SNPs were removed from the analysis if they had missing genotypes; were located on mitochondrial DNA, chromosome Y, the non-pseudoautosomal region of chromosome X or at an unknown genomic location; deviated from Hardy–Weinberg equilibrium (P < 0.0001); or had a minor allele frequency (MAF) below 0.45. For sex verification, the final set of SNPs was supplemented with one SNP located on the Y chromosome. For ascertainment purposes, nine randomly selected polymorphic and monomorphic SNPs were sequenced.
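The filtering criteria above can be illustrated with a minimal Python sketch. It is not the Golden Helix SVS procedure: the `SnpCounts` record, the chromosome labels ("MT", "Y", "X", "0" for unknown) and the use of a 1-df chi-square test for Hardy–Weinberg equilibrium are assumptions made for illustration, and pseudoautosomal SNPs are assumed to carry a separate label so that "X" means the non-pseudoautosomal region.

```python
from dataclasses import dataclass
from math import erfc, sqrt

# Hypothetical chromosome labels: "0" stands for an unknown genomic location,
# and "X" is assumed to mean the non-pseudoautosomal part of chromosome X.
EXCLUDED_CHROMS = {"MT", "Y", "X", "0"}


@dataclass
class SnpCounts:
    """Genotype counts for one SNP (hypothetical record layout)."""
    name: str
    chrom: str
    n_aa: int
    n_ab: int
    n_bb: int
    n_missing: int


def allele_freq(s: SnpCounts) -> float:
    """Frequency of allele A among called genotypes."""
    n = s.n_aa + s.n_ab + s.n_bb
    return (2 * s.n_aa + s.n_ab) / (2 * n)


def minor_allele_freq(s: SnpCounts) -> float:
    p = allele_freq(s)
    return min(p, 1 - p)


def hwe_p_value(s: SnpCounts) -> float:
    """Chi-square (1 df) test of Hardy-Weinberg proportions (assumed test)."""
    n = s.n_aa + s.n_ab + s.n_bb
    p = allele_freq(s)
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    if min(expected) == 0:  # monomorphic SNP: no deviation can be tested
        return 1.0
    observed = (s.n_aa, s.n_ab, s.n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2))
    return erfc(sqrt(chi2 / 2))


def passes_filters(s: SnpCounts, min_maf: float = 0.45,
                   min_hwe_p: float = 1e-4) -> bool:
    """Apply the removal criteria listed in the text to one SNP."""
    if s.n_missing > 0 or s.chrom in EXCLUDED_CHROMS:
        return False
    if s.n_aa + s.n_ab + s.n_bb == 0:
        return False
    return minor_allele_freq(s) >= min_maf and hwe_p_value(s) >= min_hwe_p
```

With 163 animals, a SNP with genotype counts 40/83/40 on an autosome passes all criteria, whereas the same SNP with a single missing genotype or a location on chromosome Y is removed.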

Standard formulas were used to convert the allelic frequencies into the probability of parentage exclusion (Jamieson and Taylor 1997):

For one parent:

$$P{E_1}~=1 - 4\mathop \sum \limits_{i=1}^n p_{i}^{2}+2\,{\left( {\mathop \sum \limits_{i=1}^n p_{i}^{2}} \right)^2}+4\mathop \sum \limits_{i=1}^n p_{i}^{3} - 3\mathop \sum \limits_{i=1}^n p_{i}^{4}$$

For two parents:

$$P{E_2}=1+4\mathop \sum \limits_{i=1}^n p_{i}^{4} - 4\mathop \sum \limits_{i=1}^n p_{i}^{5} - 3\mathop \sum \limits_{i=1}^n p_{i}^{6} - 8\,{\left( {\mathop \sum \limits_{i=1}^n p_{i}^{2}} \right)^2}+8\,\left( {\mathop \sum \limits_{i=1}^n p_{i}^{2}} \right)\left( {\mathop \sum \limits_{i=1}^n p_{i}^{3}} \right)+2\,{\left( {\mathop \sum \limits_{i=1}^n p_{i}^{3}} \right)^2}$$

The probabilities were calculated for each locus tested, where $p_i$ is the frequency of allele i and n is the number of alleles (two per SNP). The total exclusion power was calculated by combining the per-locus exclusion probabilities $P_1, \ldots, P_k$ of the tested loci as follows:

$$PE=1-\left( {1-{P_1}} \right){ }\left( {1-{P_2}} \right){ }\left( {1-{P_3}} \right) \ldots { }\left( {1-{P_k}} \right),$$

where k is the number of loci used.
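As an illustration of the exclusion formulas above, the following Python sketch evaluates both per-locus expressions from a list of allele frequencies and combines them across loci. The function names are hypothetical, and the original calculations were done in a spreadsheet rather than in code. The combined probability of non-exclusion is returned separately because, for a large panel, 1 − PE is very small and is easier to read than PE itself.

```python
from typing import Dict, Iterable, Sequence


def power_sums(freqs: Sequence[float]) -> Dict[int, float]:
    """Return sum(p_i ** k) over alleles for k = 2..6."""
    return {k: sum(p ** k for p in freqs) for k in range(2, 7)}


def pe_one_parent(freqs: Sequence[float]) -> float:
    """Per-locus exclusion probability, one-parent formula (PE_1)."""
    s = power_sums(freqs)
    return 1 - 4 * s[2] + 2 * s[2] ** 2 + 4 * s[3] - 3 * s[4]


def pe_two_parents(freqs: Sequence[float]) -> float:
    """Per-locus exclusion probability, two-parent formula (PE_2)."""
    s = power_sums(freqs)
    return (1 + 4 * s[4] - 4 * s[5] - 3 * s[6]
            - 8 * s[2] ** 2 + 8 * s[2] * s[3] + 2 * s[3] ** 2)


def combined_non_exclusion(per_locus_pe: Iterable[float]) -> float:
    """Product of (1 - P_k) over loci, i.e. 1 - PE for the whole panel."""
    result = 1.0
    for pe in per_locus_pe:
        result *= 1.0 - pe
    return result


def combined_pe(per_locus_pe: Iterable[float]) -> float:
    """Total exclusion power PE = 1 - prod(1 - P_k)."""
    return 1.0 - combined_non_exclusion(per_locus_pe)
```

For a biallelic SNP with both allele frequencies equal to 0.5, the per-locus values are 0.125 for one parent and 0.28125 for two parents, which are the maxima a single SNP can reach under these formulas.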

The probability of identity was calculated according to Waits et al. (2001):

$$PI=\prod\limits_{l=1}^{r} {\left( {\sum\limits_{i=1}^{{n_l}} {p_{i}^{4}} +\sum\limits_{i=1}^{{n_l}} {\sum\limits_{j>i}^{{n_l}} {{{\left( {2{p_i}{p_j}} \right)}^2}} } } \right)}$$

where $p_i$ and $p_j$ are the frequencies of the i-th and j-th alleles at locus l, $n_l$ is the number of alleles at that locus and r is the number of tested loci. Calculations were performed using Microsoft Excel™.
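The probability of identity can be sketched in Python in the same way. The sketch assumes the form given by Waits et al. (2001) for unrelated individuals, in which the heterozygote term is the squared genotype frequency $(2p_ip_j)^2$ summed over each unordered pair of alleles; the function names are hypothetical.

```python
from itertools import combinations
from math import prod
from typing import Iterable, Sequence


def pi_locus(freqs: Sequence[float]) -> float:
    """Per-locus probability that two unrelated individuals share a genotype."""
    homozygous = sum(p ** 4 for p in freqs)
    # Each unordered allele pair (i, j) is one heterozygous genotype
    # with frequency 2 * p_i * p_j.
    heterozygous = sum((2 * p * q) ** 2 for p, q in combinations(freqs, 2))
    return homozygous + heterozygous


def pi_panel(loci: Iterable[Sequence[float]]) -> float:
    """Multi-locus PI: product of per-locus values, assuming unlinked loci."""
    return prod(pi_locus(freqs) for freqs in loci)
```

A biallelic SNP with allele frequencies of 0.5 gives a per-locus value of 0.375, the minimum for two alleles, so each additional high-MAF SNP reduces the panel PI by a factor of at most about 2.7.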

The average call rate of the genotyped animals was 0.95. Data clean-up eliminated 776,699 SNPs that did not meet the selection criteria. Pair-wise LD analysis of the remaining 1,263 SNPs showed that the minimal distance between two unlinked SNPs was 10 Mb.
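The text does not state how the final panel was drawn from the SNPs that passed filtering. One hypothetical way to use the 10 Mb distance is a greedy pass along each chromosome that keeps a SNP only if it lies at least 10 Mb from the previously kept one; the sketch below shows this rule, with the tuple layout and function name as assumptions.

```python
from typing import Dict, Iterable, List, Tuple

# (name, chromosome, position in base pairs): hypothetical record layout
Snp = Tuple[str, str, int]


def thin_by_distance(snps: Iterable[Snp],
                     min_gap_bp: int = 10_000_000) -> List[str]:
    """Greedily keep SNPs spaced at least min_gap_bp apart per chromosome."""
    kept: List[str] = []
    last_kept_pos: Dict[str, int] = {}
    # Walk each chromosome from its first position to its last.
    for name, chrom, pos in sorted(snps, key=lambda s: (s[1], s[2])):
        if chrom not in last_kept_pos or pos - last_kept_pos[chrom] >= min_gap_bp:
            kept.append(name)
            last_kept_pos[chrom] = pos
    return kept
```

A greedy pass of this kind is simple but depends on the starting SNP; a selection that also ranks candidates by MAF would be a natural refinement.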

The 100 selected SNPs, with their rs numbers, allele frequencies, probabilities of identity and probabilities of parentage exclusion for one and two parents, are presented in Table 1. All chosen SNPs were checked in Illumina GenomeStudio software for clustering quality to verify that genotyping was correct. Moreover, sequencing of nine randomly selected SNPs confirmed the same genotypes in cattle and European bison (Supplementary Fig. 1).

Table 1 Allele frequencies, probability of exclusion (PE) and probabilities of identity (PI) of 100 SNPs selected for parentage testing of European bison

The combined probability of parentage exclusion, reported here as the residual probability of non-exclusion (1 − PE), was $3.90 \times 10^{-6}$ for one parent and $3.55 \times 10^{-14}$ for two parents. The theoretical probability of identity was $1.14 \times 10^{-40}$. Weller et al. (2006) showed that at least eight SNPs are required to achieve a 99% probability of rejecting a match between two individuals, and that a panel of 25 markers provides enough power to identify a single individual among five million with less than a 1% chance of a match. Panels of 40–100 SNPs with MAF greater than 0.3 may therefore allow accurate pedigree reconstruction, with a probability at the level of 1.00, even with thousands of potential family trios (Baruch and Weller 2008; Fisher et al. 2009). The wisent is an extreme case of the bottleneck effect: fewer than 3% of the genotyped SNPs were heterozygous and the average estimated relatedness between the analysed individuals was 0.97 (data not shown). The SNPs were therefore chosen to have MAF as high as possible, so that the identity of an individual can be confirmed among other “clones”. According to the literature cited above, the probabilities obtained in this study should be sufficient for effective identification of all individuals of the Lowland-Białowieża line of EB.