Background

Y-chromosome short tandem repeats (STRs) are valuable for the forensic identification of male DNA in sexual assault cases. They are also important for tracing paternal lineages, which helps in missing persons investigations and historical studies, and they aid in linking families through genetic genealogy (Roewer, 2009; Gusmão et al., 2006). Because of their genetic characteristics, Y-STR markers offer many benefits in paternity cases and in establishing kinship between male family members. They are located in the non-recombining region of the Y chromosome (NRY); thus, they are transmitted as haplotypes in the same way as single-locus alleles, and this strict paternal inheritance underlies their importance in paternity and kinship tests (Gusmão et al., 2006; Jain et al., 2016).

Requests for kinship testing between family members are sent by courts of law and police stations from all over Iraq to the Paternity and Kinship division of the Medico-Legal Directorate (MLD), Baghdad, to establish the identity of unregistered persons or to resolve familial disputes. In many complicated cases, DNA short tandem repeat (STR) typing alone is not enough to reach a resolution. For the past 5 years, the PowerPlex® Y23 System has been relied upon to solve paternity and kinship cases referred to the Paternity and Kinship division of the MLD. Y-STR haplotype frequencies have been assessed in many populations because of their importance in the statistical analysis of casework (Parvathy et al., 2012; Roewer et al., 2008). In this study, we present the frequencies of the alleles observed at each locus in Iraqi male samples collected from all over the country (except Kurdistan), which may provide more information about the gene pool of the Iraqi population.

Materials and methods

Sample collection

A total of 1032 blood samples were used in this study. Data were recorded for each donor; all donors were unrelated male Iraqi nationals identified as Iraqi Arabs. Participants were informed about the purpose of the sampling, interested volunteers were enrolled, and informed consent was obtained from each subject before blood sampling. Blood samples were placed on DNA storage cards (Direct™ Classic Card-FTIZCO, USA).

DNA amplification and Y-STR typing

Samples were amplified by direct polymerase chain reaction (PCR) with the PowerPlex® Y23 System (Promega Co., USA), according to the manufacturer’s instructions, on a GeneAmp 9700 thermal cycler (Applied Biosystems). The amplified products were run on the ABI PRISM 3130xl Genetic Analyzer (Applied Biosystems), and the resulting data were analyzed using the GeneMapper ID Analysis Software (Applied Biosystems, USA).

Statistical analyses

Frequency-based statistical analyses were performed with the GenAlEx 6.5 genetic analysis software (Peakall & Smouse, 2006), while haplotype diversity was calculated using the HapYdive software (http://portugene.com/hapydive.html) as described previously (Parvathy et al., 2012). Analysis of molecular variance (AMOVA) was performed with the online tool at https://yhrd.org/amova/.yhrd.org.tools.

Results and discussion

Allele frequencies were calculated for each locus; the observed allele frequencies of the 23 Y-chromosome STR loci in the Iraqi Arab population are summarized in Table 1. Haploid diversity by locus and the mean allelic patterns for haploid data across the population are shown in Tables 2 and 3, respectively.

Table 1 Allele frequencies of PowerPlex® Y23 System loci in a sample of the Iraqi Arab population
Table 2 Haploid Diversity by locus using PowerPlex® Y23 System
Table 3 Mean allelic patterns across population using PowerPlex® Y23 System
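
As an illustration of the kind of per-locus summary reported in Tables 1 to 3, the following minimal sketch computes allele frequencies, the number of alleles (Na), the effective number of alleles (Ne), haploid diversity (h), and its sample-size-corrected form (uh) for a single haploid locus. It assumes the usual definitions h = 1 - Σp² and uh = h·n/(n - 1); it is not the GenAlEx implementation, and the function names and toy genotypes are hypothetical and are not data from this study.

```python
from collections import Counter


def allele_frequencies(alleles):
    """Return {allele: relative frequency} for one haploid locus."""
    counts = Counter(alleles)
    n = len(alleles)
    return {allele: count / n for allele, count in sorted(counts.items())}


def haploid_summary(alleles):
    """Summarise one haploid locus (assumed definitions, see lead-in)."""
    n = len(alleles)
    freqs = allele_frequencies(alleles)
    sum_sq = sum(p * p for p in freqs.values())  # sum of squared frequencies
    h = 1.0 - sum_sq                             # haploid (gene) diversity
    return {
        "Na": len(freqs),        # number of distinct alleles
        "Ne": 1.0 / sum_sq,      # effective number of alleles
        "h": h,
        "uh": h * n / (n - 1),   # correction for finite sample size
    }


# Hypothetical toy genotypes at one locus (NOT data from this study).
# Alleles are kept as strings so that variants such as "18.2" are preserved.
toy_locus = ["14", "14", "15", "15", "15", "16", "17", "15", "14", "15"]

if __name__ == "__main__":
    print(allele_frequencies(toy_locus))
    print(haploid_summary(toy_locus))
```

Applying the same function to each of the 23 loci in turn would give one row per locus, which is the layout used in a by-locus diversity table.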

Using the PowerPlex® Y23 System markers, 478 different haplotypes were identified; the most frequent haplotype (detected 10 times) is shown in Table 4. Eight profiles were unique, 435 profiles were each found twice, and 30 profiles were each found four times. Four profiles were each detected in 6 individuals, which can be considered an expected result given the high rate of inbreeding among family or clan members across the country. DYS635 was the most polymorphic locus with 12 detected alleles, while DYS438 and DYS437 were the least polymorphic with 5 detected alleles each.

Table 4 Most frequent haplotype registered
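
The haplotype counts reported above can be checked for internal consistency with a short sketch. It encodes the reported frequency spectrum, recovers the total number of samples and of distinct haplotypes, and then derives two common summary statistics: discrimination capacity (distinct haplotypes divided by samples) and haplotype diversity using Nei's estimator, n/(n - 1)·(1 - Σp²). The choice of estimator is an assumption made for illustration and may differ from what HapYdive computes; the variable names are hypothetical.

```python
# Frequency spectrum taken from the text:
# {times a haplotype was observed: number of distinct haplotypes}
spectrum = {1: 8, 2: 435, 4: 30, 6: 4, 10: 1}

# Total individuals and total distinct haplotypes implied by the spectrum
n_samples = sum(times * n_haps for times, n_haps in spectrum.items())
n_haplotypes = sum(spectrum.values())

# Discrimination capacity: share of samples that are distinct haplotypes
discrimination_capacity = n_haplotypes / n_samples

# Haplotype diversity (Nei's estimator, assumed here for illustration)
sum_sq = sum(n_haps * (times / n_samples) ** 2
             for times, n_haps in spectrum.items())
haplotype_diversity = n_samples / (n_samples - 1) * (1.0 - sum_sq)

if __name__ == "__main__":
    print("samples:", n_samples)
    print("distinct haplotypes:", n_haplotypes)
    print("discrimination capacity:", round(discrimination_capacity, 4))
    print("haplotype diversity:", round(haplotype_diversity, 5))
```

The spectrum sums to the 1032 samples and 478 haplotypes stated in the text, so the reported counts are mutually consistent. The two derived statistics are computed here only for illustration and are not values reported by the study.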

One hundred and eighty-five alleles were detected at the 23 Y-STR loci in the 1032 samples. Eighty-three samples (8.04%) showed a mono-allelic pattern at the bi-allelic marker DYS385. Several variants not previously registered for the PowerPlex® Y23 System were recorded at the DYS458 locus, and each sample carrying one was re-amplified and re-analyzed for confirmation. The variants were 16.2 (6 times), 17.2 (29 times), 18.2 (196 times), 19.2 (143 times), 20.2 (14 times), and 21.2 (4 times). These variants had previously been detected with AmpFlSTR® Yfiler™ (Applied Biosystems) and registered in STRBase (https://strbase.nist.gov/var_DYS458.htm).

Among the single-copy markers, the genetic diversity (h) of the 23 Y-STR loci ranged from 0.331 for DYS392 (the least informative locus) to 0.838 for DYS481 (the most informative locus), and the average gene diversity was 0.616 ± 0.027. This result is similar to that of neighboring populations, which showed low genetic diversity due to several causes (Marchi et al., 2017; Palstra et al., 2015). The decrease in genetic diversity at these loci has been related to inbreeding, which is common in most of the Iraqi provinces and which eventually decreases genetic diversity at the population level (Charlesworth, 2003). The strict paternal inheritance of Y-STRs, as well as emigration, influences Y-haplotype diversity (Gusmão et al., 2006; Wang & Li, 2013; Singh et al., 2018). The lower effective number of Y chromosomes in a given population indicates that Y-haplotypes/haplogroups might show higher variation between populations than markers on the X chromosome or autosomes (Domingues et al., 2007). The Iraqi haplotypes were compared with those of neighboring populations using a multi-dimensional scaling (MDS) plot of genetic distance (Fig. 1), which indicates that the Iraqi population is genetically closer to the Lebanon and Kuwait populations than to the United Arab Emirates population. Similarly, the Rst values for the Iraqi-Kuwait and Iraqi-Lebanon pairs showed much closer genetic distances than the value for the Iraqi-United Arab Emirates pair (Table 5).

Fig. 1

Multi-dimensional scaling plot (MDS) based on pairwise Rst genetic distance values for Y-STR haplotypes for Iraqi haplotypes (current study) and other registered haplotypes of Iraqi population along with Lebanon, Kuwait, and United Arab Emirates populations

Table 5 Rst values for Iraqi and neighboring populations

The loci included in the PowerPlex® Y23 System are suitable for forensic DNA fingerprinting casework, population genetic analysis, and anthropological purposes in the Iraqi population. This is the first genetic report using PowerPlex® Y23 for the Iraqi population. These markers showed a high degree of polymorphism, and these results are in concordance with previous studies regarding PowerPlex® Y23 (Purps et al., 2014). Data were registered at YHRD under accession number YA004665.

Conclusion

The present study established the genetic information obtained with the PowerPlex® Y23 System for the Iraqi population and created a database of 23 Y-STR markers for this population. It also showed that the genetic makeup of the Iraqi population is a unique combination of local inhabitants and ancestors who migrated to Iraq from different regions and settled there throughout history. Studies are ongoing to increase the sample size, to make regional comparisons within the country, and to compare the results with Y-STR data from other populations.