Background

The non-coding control region of mitochondrial DNA (mtDNA), also referred to as the displacement loop (D-loop), is the most extensively studied mtDNA segment in forensics, particularly its hypervariable regions (HVR I, II, and III) (Wilson et al. 1995; Szibor et al. 1997; Lutz et al. 2000; Chung et al. 2005; Lander et al. 2008; Irwin et al. 2009; Elmadawy et al. 2013; Hayat et al. 2015; Chaitanya et al. 2016; Bhatti et al. 2018). The hypervariable regions are mutation hotspots that evolve roughly 10 times faster than chromosomal DNA, and this higher mutation rate gives rise to more polymorphic sites (Stoneking 2000). The high copy number and circular form of mtDNA make it less vulnerable to exonuclease degradation, which makes it a valuable tool for identifying persons from highly degraded biological samples in forensic cases (Parson et al. 1998; Thongngam et al. 2016).

Although the HVR I (np 16,024 to 16,365) and HVR II (np 00073 to 00340) regions are routinely used in forensics, other regions of the D-loop, such as the HVR III region (np 00438 to 00574), first identified by Lutz and co-workers (Lutz et al. 1997), can be analyzed as an additional marker that increases the power to discriminate between common haplotypes and helps confirm relationships between individuals whose sequences differ at only one position (Szibor et al. 2007; Nilsson et al. 2008). In addition to point mutations, the D-loop segment shows variation in the number of repeated nucleotides (single-nucleotide and dinucleotide repeats), known as length heteroplasmy (Li et al. 2016). Commonly occurring length heteroplasmies are the poly-C (poly-cytosine) residues at np 16,189 in HVR I, np 303–315 in HVR II, and np 568–573 in HVR III, and the five pairs of CA/AC residues at np 514–524 in the HVR III region (Bhatti et al. 2018).

The mitochondrial CA/AC residues, also referred to as (CA)n repeats, lie within the HVR III region and show length heteroplasmy, varying in a manner similar to chromosomal short tandem repeats (STRs); mutations at this position occur as insertions or deletions (indels) of one or more pairs of CA/AC residues. Chromosomal STRs are widely used as forensic markers. With respect to sensitivity, however, amplification of the (CA)n repeat appears to be more useful than chromosomal STRs of similar fragment length, because of the high copy number of mtDNA and its greater resistance to exonuclease degradation in corpses. Analysis of the mtDNA D-loop (HVR I, II, and III) is therefore often helpful when chromosomal STRs fail (Szibor et al. 1997), as has been demonstrated by the examination of mtDNA in skeletal and ancient human samples (Holland et al. 1993; Kurosaki et al. 1993). Length heteroplasmy thus acts as an additional marker in forensic analysis: it is very informative when interpreting matches between subjects and their maternal relatives (Naue et al. 2015) and can add discriminative potential that helps to identify human remains during forensic casework (Santos et al. 2008).

HVR III has proven useful as a supplementary polymorphic segment for forensic applications (Hoong and Lek 2005). Even though the majority of forensic cases use only the HVR I and HVR II sequences, it is noteworthy that the HVR III (CA)n repeat segment has equal potential as an additional mtDNA marker in forensic investigations (Ivanov et al. 1996; Lutz et al. 2000; Chung et al. 2005; Fridman and Gonzalez 2009; Bhatti et al. 2018). No study of (CA)n dinucleotide indels in the mtDNA HVR III 514–524 region has so far been reported for an Indian population; this study therefore aimed to evaluate the (CA)n dinucleotide repeats in the Urali Kuruman tribe of Kerala, South India.

Tribe

Urali Kurumans are an endogamous tribal group who migrated along the unpastoral lands and inhabit the hillocks of Wayanad, Kerala, South India, with a population of approximately 11,179 people (Chandramouli- Census of India 2011). Urali Kurumans are slender to medium in stature and dark brown or dark in complexion, and some members of the tribe exhibit platyrrhine features; they are monogamous and neolocal and speak a dialect of the Dravidian language family. Traditionally, they were nomadic hunter–gatherers living on steep edges, practicing shifting cultivation, foraging, and trapping small birds and animals for food and hide (Singh 2002). To date, no study has described the genetic structure of the Urali Kuruman tribe. Although the present study focuses only on the (CA)n dinucleotide repeat of the mtDNA HVR III 514–524 region, it provides a starting point for future studies of the complete mtDNA and autosomal DNA of the Urali Kurumans.

Materials and methods

Sample collection

Prior to collecting blood samples, written informed consent was obtained from all 100 healthy, unrelated individuals of the Urali Kuruman tribal population of Wayanad, Kerala. About 5–10 ml of blood was drawn into vacutainers as per standard protocols.

DNA extraction and quantitation

DNA was extracted by Proteinase-K digestion followed by phenol–chloroform extraction and ethanol precipitation. The extracted DNA was quantified with a spectrophotometer (Hitachi U-1800) by absorbance at 260 nm and 280 nm and then checked by 1% agarose gel electrophoresis (Sambrooke et al. 1989).
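
The absorbance readings mentioned above are commonly turned into a concentration and a purity estimate. The following is a minimal sketch of that arithmetic, not the authors' actual procedure: it assumes the widely used convention that an A260 of 1.0 corresponds to about 50 ng/μl of double-stranded DNA at a 1 cm path length, and the function names, readings, and dilution factor are hypothetical.

```python
# Assumed convention (not stated in the paper): A260 of 1.0 ~ 50 ng/ul dsDNA, 1 cm path.
DSDNA_NG_PER_UL_PER_A260 = 50.0


def dna_concentration(a260, dilution_factor=1.0):
    """Estimate dsDNA concentration (ng/ul) from absorbance at 260 nm."""
    return a260 * DSDNA_NG_PER_UL_PER_A260 * dilution_factor


def purity_ratio(a260, a280):
    """Return the A260/A280 ratio, a rough indicator of protein contamination."""
    if a280 <= 0:
        raise ValueError("A280 must be positive")
    return a260 / a280


if __name__ == "__main__":
    # Hypothetical readings for one sample measured at a 1:20 dilution.
    a260, a280 = 0.25, 0.14
    print(f"concentration: {dna_concentration(a260, 20):.1f} ng/ul")
    print(f"A260/A280: {purity_ratio(a260, a280):.2f}")
```

A ratio near 1.8 is generally taken to indicate DNA that is reasonably free of protein; the agarose gel then gives an independent check on integrity.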

Mitochondrial DNA control region analysis

The complete mtDNA control region was amplified using the set of primers listed in Table 1. PCR amplification was carried out in a 10 μl reaction mixture containing 1 μl of genomic DNA, 0.5 μl of dNTP, 1 μl of PCR buffer with MgCl2, 0.1 μl of Taq polymerase, 0.2 μl of primers (both forward and reverse), and 7.04 μl of Milli Q water, using a Gene Amp PCR system 9700 thermal cycler (Applied Biosystems). Cycling conditions were 94 °C for 2 min, followed by 35 cycles of 95 °C for 1 min, 58 °C for 0.45 s, and 72 °C for 2.30 min, with a final extension at 72 °C for 7.00 min. PCR amplicons were cycle sequenced with the Big Dye cycle sequencing ready reaction kit (Applied Biosystems), and fluorescent amplimers were detected on an ABI 3730 DNA analyzer (Applied Biosystems).

Table 1 Primer sequences used for the amplification and sequencing of mtDNA control region

DNA sequence and statistical analysis

Complete control region sequences were analyzed using Seqscape v2.5 (Applied Biosystems), and mutations were scored against the Revised Cambridge Reference Sequence (rCRS) (Andrews et al. 1999); only the (CA)n dinucleotide repeat sequences from position 514–524 of the HVR III region are included in this paper. The genetic diversity (h) of the complete (CA)n heteroplasmy was estimated as $h = n(1 - \sum X^2)/(n - 1)$, where n is the sample size and X is the frequency of each observed haplotype, and the discrimination power (DP) was determined as $DP = 1 - \sum X^2$ (Tajima 1989). The random match probability (p) was defined as $p = \sum X^2$ (Stoneking et al. 1991).
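
The three statistics defined above can be computed directly from haplotype counts. The sketch below is illustrative only: the function name `haplotype_statistics` is hypothetical, and the counts are taken from the percentages reported in the Results section for the 100 sampled individuals.

```python
def haplotype_statistics(counts):
    """Return (h, DP, p) for a mapping of haplotype label -> observed count.

    h  = n * (1 - sum(X^2)) / (n - 1)   genetic diversity
    DP = 1 - sum(X^2)                   discrimination power
    p  = sum(X^2)                       random match probability
    where X is the frequency of each observed haplotype and n the sample size.
    """
    n = sum(counts.values())
    if n < 2:
        raise ValueError("at least two individuals are required")
    sum_sq = sum((count / n) ** 2 for count in counts.values())
    h = n * (1 - sum_sq) / (n - 1)
    dp = 1 - sum_sq
    p = sum_sq
    return h, dp, p


if __name__ == "__main__":
    # Counts derived from the reported frequencies (n = 100).
    urali_counts = {"(CA)3": 3, "(CA)4": 49, "(CA)5": 14, "(CA)6": 14, "(CA)7": 20}
    h, dp, p = haplotype_statistics(urali_counts)
    print(f"h = {h:.4f}, DP = {dp:.4f}, p = {p:.4f}")
```

With these counts the sketch gives p = 0.3202 and DP = 0.6798, matching the reported values; h evaluates to 0.68667, which rounds to 0.6867 and is consistent with the reported 0.6866 if that figure was truncated rather than rounded.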

Results and discussion

To investigate the HVR III (CA)n dinucleotide repeats at position 514–524, the sequences were categorized by insertions and deletions, which generally arise through the gain or loss of one or more CA/AC dinucleotides. Of the 100 individuals, 86 showed indel changes relative to the (CA)5 dinucleotide repeat cluster of the reference sequence, and four variant (CA)n repeat sequences were detected in addition to (CA)5 (Table 2). The reference repeat, (CA)5, was detected in 14% of Urali Kurumans. (CA)4, which arises from the deletion of one CA, was the most frequent repeat and was detected in 49% of Urali Kurumans. In addition, (CA)6, with one AC insertion, was detected in 14% of Urali Kurumans, and (CA)7, with two AC insertions, in 20%. A rare repeat, (CA)3, which arises from the deletion of two CA dinucleotides, was detected in 3% of Urali Kurumans; previous reports detected (CA)3 in 2% of Cameroonians (Szibor et al. 1997) and 1% of Thais (Thongngam et al. 2016). (CA)4 was also the most frequently detected dinucleotide change in some other world populations, and (CA)6 and (CA)7 have likewise been detected in many populations (Table 3). Statistical estimates for the Urali Kuruman population gave a genetic diversity (h) of 0.6866, a random match probability (p) of two individuals sharing the same haplotype of 0.3202, and a discrimination power (DP) of 0.6798. The (CA)n dinucleotide repeat polymorphisms found in this study and in some world populations are compared in Table 3, and the sequence structures of the (CA)n repeats detected in this study are shown in Fig. 1.
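
The categorization described above can be sketched as counting the uninterrupted CA run in a sequence fragment and comparing it with the five repeats of the reference. This is an illustrative sketch, not the scoring performed in Seqscape: the function names are hypothetical, and the example fragments use made-up `TT` flanks rather than real rCRS context.

```python
import re

REFERENCE_REPEATS = 5  # (CA)5 is the reference repeat according to the text


def count_ca_repeats(fragment):
    """Return the number of CA units in the longest uninterrupted CA run."""
    runs = re.findall(r"(?:CA)+", fragment.upper())
    return max((len(run) // 2 for run in runs), default=0)


def describe_indel(n_repeats, reference=REFERENCE_REPEATS):
    """Describe a repeat count as an insertion or deletion relative to the reference."""
    diff = n_repeats - reference
    if diff == 0:
        return "reference"
    kind = "insertion" if diff > 0 else "deletion"
    return f"{abs(diff)} CA {kind}"


if __name__ == "__main__":
    # Hypothetical fragments: n CA units between invented TT flanks.
    for n in (3, 4, 5, 6, 7):
        fragment = "TT" + "CA" * n + "TT"
        repeats = count_ca_repeats(fragment)
        print(f"(CA){repeats}: {describe_indel(repeats)}")
```

Real sequence context could contain additional C or A bases adjacent to the repeat, so alignment against the reference is still needed before a run length is interpreted as an indel.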

Table 2 Insertions and deletions in the mtDNA HVR III region np 514–524 detected in 100 individuals of the Urali Kuruman tribal population
Table 3 (CA)n dinucleotide repeat in mtDNA HVR III region np 514–524 in different human populations
Fig. 1 Electropherograms of (CA)n dinucleotide repeats at position 514–524 of the mtDNA HVR III region detected in the present study. The five (CA)n dinucleotide repeats, genotyped by Sanger sequencing as deletions and insertions of CA/AC nucleotides at position 514–524 of the mtDNA HVR III region: a (CA)3 dinucleotide repeat, b (CA)4 dinucleotide repeat, c (CA)5 dinucleotide repeat (rCRS), d (CA)6 dinucleotide repeat, and e (CA)7 dinucleotide repeat

Conclusion

The findings demonstrate the degree of insertion–deletion variability of the mtDNA HVR III (CA)n dinucleotide repeat sequences at np 514–524 in 100 individuals of the Urali Kuruman tribe. Furthermore, the study supports the use of HVR III CA indels in human identification; they have equal potential to the HVR I and HVR II regions and can serve as an additional marker in forensic investigations.