Introduction

Advances in molecular biology have made genetic identification markers available for routine use. Among these markers, short tandem repeats (STRs) are used worldwide in paternity testing and to diagnose aneuploidies. STRs are repeating DNA sequences of 1–6 bp; differences in the number of repeats give rise to different alleles and contribute to genetic polymorphism at STR loci. Sets of 15 to 24 STR loci have been grouped into standard panels and are used routinely for paternity testing1.

Penta D is a simple pentanucleotide repeat (AAAGA) located on chromosome 21 (21q22.3), for which alleles ranging from 1.1 to 19 repeats have been observed by length-based genotyping, and alleles 5 to 19 have been sequenced2. Fifty-six Penta D STR alleles are reported in STRbase with 18 tri-allelic patterns3.

TPOX is a simple tetranucleotide repeat (AATG) located in the tenth intron of the human thyroid peroxidase gene on chromosome 2 (2p25.3). TPOX alleles range in size from 4 to 18 repeats, and 22 tri-allelic patterns are reported in STRbase4. The tri-allelic pattern 8-10-11 is the most common4.

In 2004, Clayton et al. distinguished two types of tri-allelic patterns: tri-allelic Type 1 occurs when two peaks have a different height from the third and tri-allelic Type 2 occurs when all three peaks have a similar height5. More recently, Picanço et al. (2015) identified three tri-allelic Type 2 subcategories: Type 2-A (three peaks of the same height), Type 2-B (one peak with a 2:3 height ratio and one with 1:3 height ratio) and Type 2-C (one peak with a 3:3 height ratio)6.

In this study, we determined the allele frequencies of 22 STR loci in unrelated Gabonese subjects. We also report, for the first time in the Gabonese population, a novel tri-allelic pattern at the Penta D locus and tri-allelic patterns at the TPOX locus.

Results

Allele frequencies

We screened 115 unrelated subjects from 39 families for paternity testing at DNA-LAB Gabon in Libreville. The sample included 42 children, 39 presumed fathers and 34 mothers.

Of the 22 STRs tested, eleven were highly polymorphic, while the other loci had fewer than ten alleles (Table 1). The most polymorphic loci were D21S11 (16 alleles) and FGA (17 alleles), while D3S1358 and TH01 were the least polymorphic, with only five alleles each. Deviations from Hardy–Weinberg equilibrium were observed at the TPOX, D3S1358, CSFPO and D7S820 loci.

Table 1 Allele frequencies of (a) the eleven highly polymorphic and (b) the less polymorphic STR loci in unrelated Gabonese subjects.

Comparing 10 STRs, the allelic frequencies observed in our study were not statistically different from those of the Gabonese samples in a previous study published in 2002 (Table 2)11. Furthermore, comparison of the frequencies of 13 STRs between African-Americans from California (USA)15 and the Gabonese in the present study also revealed no significant differences (Table 3).

Table 2 20-year comparisons between allele frequencies of STR loci in Gabonese subjects using Wilcoxon signed rank test with continuity correction.
Table 3 Comparisons between allele frequencies of STR loci in Gabonese subjects of our study and African-American subjects from California using Wilcoxon signed rank test with continuity correction.

Tri-allelic patterns

We found tri-allelic patterns in 8% of recruited families (3/39 families), and 4% of recruited subjects (4/115 subjects). In all tri-allelic cases, presumed fathers were the biological fathers of tested children, and we did not observe any physical abnormalities that could suggest a genetic disorder in any member of the recruited families.

Penta D family case

A phenotypically normal family was screened for parentage. Penta D genotypes were 2.2-10 for the presumed father, 5-8-16 for the mother and 5-10 for the child. This new tri-allelic Penta D genotype, 5-8-16, was observed with a frequency of 3% in tested subjects (1/35 subjects) (Fig. 1). This genotype has never been reported in Sub-Saharan Africa.

Figure 1

Penta D tri-allelic genotype observed in a Gabonese family (depicted by Pharmacia DNA Fragment Manager V1.2 software).

TPOX family cases

Tri-allelic TPOX genotypes were observed with a frequency of 3% (3/102 subjects tested). In the second family, which comprised a man with genotype 9-11, a woman with genotype 8-8-10, and their male child with genotype 8-10-11, we found two different types of tri-allelic TPOX genotypes. The tri-allelic Type 2-B genotype, 8-8-10, was observed in the healthy mother (Fig. 2). Type 2-A was found in the healthy child of this family (8-10-11; Fig. 2) and in the healthy child of the third family (8-9-10; figure not shown). The third family comprised a father with genotype 8-8, a mother with genotype 9-9, and their female child with genotype 8-9-10 (figure not shown).
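The allele transmission in the three families can be traced with a short sketch that sorts each child allele by the parent or parents who carry it. The function name and data layout are hypothetical, the genotypes are those given in the text, and the check is simple allele sharing: it is not the likelihood ratio calculation used for the paternity tests and does not model mutation.

```python
def allele_origins(father, mother, child):
    """Group the child's distinct alleles by which parent carries them."""
    origins = {"paternal": set(), "maternal": set(), "unexplained": set()}
    for allele in set(child):
        in_father = allele in father
        in_mother = allele in mother
        if in_father:
            origins["paternal"].add(allele)
        if in_mother:
            origins["maternal"].add(allele)
        if not (in_father or in_mother):
            origins["unexplained"].add(allele)  # candidate de novo allele
    return origins


# Alleles are kept as strings so that microvariants such as "2.2" stay exact.
families = {
    "Penta D, family 1": (("2.2", "10"), ("5", "8", "16"), ("5", "10")),
    "TPOX, family 2": (("9", "11"), ("8", "8", "10"), ("8", "10", "11")),
    "TPOX, family 3": (("8", "8"), ("9", "9"), ("8", "9", "10")),
}

for label, (father, mother, child) in families.items():
    print(label, allele_origins(father, mother, child))
```

With these genotypes, the child in the second family carries two alleles found in the mother (8 and 10), whereas allele 10 in the child of the third family is found in neither parent.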

Figure 2

TPOX tri-allelic patterns observed in a healthy Gabonese family (depicted by Pharmacia DNA Fragment Manager V1.2 software).

Discussion

We investigated more loci than a previous study of the Gabonese population conducted twenty years ago11. Allele frequencies were similar between the two studies, but we found specific alleles that did not appear in the previous survey. These were allele 14 of D2S1338, alleles 15, 22.2, 24.2, and 29 of FGA, alleles 9, 10, and 17 of D8S1179, allele 19.1 of vWA, alleles 8 and 9 of D18S51, allele 17 of D19S433, and alleles 18, 28.2 and 36 of D21S11. All occur at low frequency in our study. In contrast, alleles 12 and 13 of D3S1358, alleles 18, 19.2, and 21.2 of FGA, allele 10 of TH01, alleles 11, 12.2, and 13.2 of D18S51, alleles 10 and 17.2 of D19S433, and alleles 32.1, 34, 35.1 and 35.2 of D21S11 were absent in our study but appeared in the earlier survey11. The polymorphic loci FGA, D18S51, and D21S11 showed marked differences between the two studies in which alleles were present. Locus D8S1179 was the most polymorphic in our study, with three new alleles compared with the older study; one of them (allele 10) is involved in tri-allelic inheritance in India12.

Power of discrimination analysis suggests that all 22 STRs may be promising markers for paternity testing (PD between 0.973 and 0.993). One of the four loci showing a significant departure from Hardy–Weinberg equilibrium (HWE), the CSFPO locus, also deviated significantly in Rwandans (Hutu) and Angolans13,14. The frequency of homozygotes at the CSFPO locus (9/33 = 27%) indicates that this deviation is due to homozygote excess, as in Rwandans13. The allele frequencies are also similar to those for descendants of African people in California15. These comparisons suggest that these loci are stable and good genetic identification indicators.

We report a tri-allelic Penta D pattern (5-8-16) that has not been described previously. The Penta D allele 5, the only allele transmitted from the mother to the child in family #1, occurs at a frequency of 11% in our study and 4% in Africa16. Of the other Penta D alleles in this family, allele 8 is less frequent in our study population (6%) than in Africa more generally (15.5%)16. Penta D allele 16, at a frequency of 2% in our study, has not yet been detected elsewhere in Africa but exists in other populations, such as the Middle East, at a frequency of 2.5%16. Further population studies of the STR locus Penta D should be conducted in Sub-Saharan Africa to determine the types of changes and their frequencies. In our study, the phenotypically normal mother has a tri-allelic pattern of Penta D. The other STR on chromosome 21 in the panels of STR loci used for paternity testing, D21S11, showed no allelic abnormality. Her tested offspring inherited only two alleles, one from each parent. A tri-allelic pattern of Penta D indicates a genetic abnormality on chromosome 21. Trisomy 21 (also known as Down syndrome) is the most common chromosomal anomaly and corresponds to the presence of an extra chromosome 21, in whole or in part. It can be due to various chromosomal aberrations17: free trisomy, translocations, mosaicism, critical region duplication, and other structural rearrangements of chromosome 21. Mosaicism and partial trisomy 21 are more challenging to diagnose because the karyotype is often normal17, which shows the importance of studies of the Penta D STR.

The tri-allelic TPOX genotypes we observed were tri-allelic Type 2, which is due to a constitutional chromosomal rearrangement, while tri-allelic Type 1 is probably due to a mutation in an early somatic cell5. The tri-allelic Type 2 pattern of TPOX is present at a very low frequency in various human populations, ranging from 0.003 to 0.2%18. The highest frequency of tri-allelic TPOX genotypes observed (2.4% or 165/6827 people) was in indigenous black populations from South Africa19. Although our sample size was small, the observed frequency (3% or 3/102 people investigated) supports observations in African populations19. Moreover, the presence of tri-allelic genotypes of TPOX in Gabon (Central Africa) supports the hypothesis that the TPOX variants may have existed before the expansion of Bantus from Central Africa19,20.

In the second family, a mother with three TPOX alleles transmitted two of them to her son, supporting the hypothesis that the extra allele comes from the X chromosome, as proposed by two studies6,19, or from chromosome 2 with a potential impact of chromosomal rearrangement on the activity of Y-sperm21.

In the third family, the pattern is entirely different. Each parent transmitted an allele to their daughter (allele 8 from the mother and allele 9 from the father), but the daughter shows an extra de novo allele, allele 10. Some authors6,19 have suggested that the additional allele in TPOX is allele 10. However, other authors have shown that it is allele 11 in Chinese and Korean populations21. Allele 11 results from a strand slippage mutation of an extra allele 10 of TPOX originating from Bantu groups in Africa21. Our results show that allele 10 is an additional allele, as observed at the TPOX locus of the daughter in the third family. Furthermore, this allele was found in the other tri-allelic TPOX genotypes in our study: those of a healthy mother and her healthy child (from the second family).

Limitations of the study and future directions

Due to a lack of funding, we could not sequence the extra allele 10 of TPOX found in Family #3. This was the main limitation of this work. Future studies could sequence the de novo allele in this girl or extend the population genetic study to her entire family.

Conclusion

The allele frequencies of the 22 STRs we observed were similar to those in other Black populations. These findings suggest that these STRs are good identification markers, allowing aneuploidies to be detected in the absence of symptoms. The presence on chromosome 21 of a tri-allelic genotype of the Penta D locus with a new allele in our study suggests that we need more in-depth studies of this locus in sub-Saharan Africa. The presence of three subtypes (8-8-10, 8-10-11, and 8-9-10) of the tri-allelic variants of TPOX in our small sample suggests that we need an extended study of genetic polymorphism in Central Africa, where the Bantu peoples originate.

Methods

Data were collected during paternity tests on indigenous Gabonese people. As such analyses are not yet routine in Gabon, we collaborated with a partner laboratory (Labor für DNA-Analytik, Germany) for the complete analysis after DNA extraction. DNA was extracted from buccal swabs and prepared with the NucleoSpin Tissue kit following the manufacturer's protocol (Macherey-Nagel, Freiburg, Germany). The consent form signed by the participants clearly states that the signatory parties agreed to the use of the results for research and publications.

To assess allelic frequencies in this study, we considered only unrelated subjects, who were defined in two ways. For two-parent families, all children were excluded from the group of unrelated subjects. For single-parent families, children without proven parentage with the presumed father were included in this group.

For the purposes of this study, twenty-two STRs were used (D1S1656, TPOX, D2S1338, D3S1358, FGA, D5S818, CSFPO, F13A01, SE33, D7S820, D8S1179, TH01, vWA, D12S391, D13S317, Penta E, D16S539, D18S51, D19S433, Penta D, D21S11, D22S1045). Primers were ordered from TibMolBiol, Berlin (Germany) according to reference sequences for the 22 STR loci available at Ref.7. We labelled one primer of each primer pair in yellow (5'-fluorescein). Alleles were determined based on allelic ladders. To obtain allelic ladders, we amplified the PowerPlex 2.1 ladder from Promega (Walldorf, Germany). Each ladder was checked with commercially available K562 and M2800 DNA (Promega, Walldorf, Germany). Finally, 0.3–0.5 µl of the PCR reaction was loaded on an acrylamide gel. For gel preparation, the Rotiphorese Sequenzier-Gel system from Carl-Roth GmbH, Karlsruhe (Germany) was used (Cat. No. A431). Based on the Labor für DNA-Analytik results, likelihood ratio values were calculated according to Ref.8.

Ethics approval and consent to participate and for publication

This study was conducted according to the Declaration of Helsinki9 and the legislation in force in the Gabonese Republic. The Scientific Council of the Mother and Child University Hospital—Jeanne Ebori Foundation of Libreville, in charge of ethics approval in this medical facility, approved the study. Furthermore, consent was obtained from all participants. Before any sampling in the context of parentage screening, the fathers and mothers gave us their written consent by signing the DNA-LAB-Gabon consent form dedicated to this purpose, preceded by the words "read and approved". This form clearly states in point 4 that the different signatory parties agreed that the results may be used for research and publications. Finally, if the child was a minor, all parentage tests were carried out with the written consent of the father, mother and/or legal guardian. Where applicable, tests were conducted by decision of the court of first instance of the city of residence.

Statistical analysis

Data were compiled using Microsoft Excel 2013, and the database was analysed using R version 4.2.2. We used the Wilcoxon signed rank test with continuity correction to compare allele frequencies of STR loci between Gabonese subjects from 2002 and 2023, and between Gabonese subjects and African-American subjects. Values of p < 0.05 were considered statistically significant. Allele frequencies were calculated from the numbers of each genotype obtained in the sample set of unrelated subjects. Expected heterozygosity (He), power of discrimination (PD), probability of exclusion (PE), and exact tests of Hardy–Weinberg equilibrium were all computed using EasyDNA software (https://saasweb.hku.hk/EasyDNA/)10.
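As an illustration of how allele frequencies follow from genotype counts, the sketch below applies gene counting to a list of bi-allelic genotypes and derives He and PD using their usual textbook definitions (He = 1 − Σp², PD = 1 − Σg², with p the allele frequencies and g the genotype frequencies). These formulas are assumptions: the exact estimators implemented in EasyDNA, including any small-sample correction, are not stated here. The function names and genotypes are hypothetical.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Gene counting: every allele copy in every genotype is counted once."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def expected_heterozygosity(freqs):
    """He = 1 - sum(p_i^2), without small-sample correction."""
    return 1.0 - sum(p * p for p in freqs.values())


def power_of_discrimination(genotypes):
    """PD = 1 - sum(g_j^2), where g_j is the frequency of genotype j."""
    counts = Counter(tuple(sorted(g)) for g in genotypes)
    n = len(genotypes)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


# Hypothetical genotypes at a single locus (not data from this study).
sample = [(8, 8), (8, 9), (9, 11), (8, 11)]
freqs = allele_frequencies(sample)
he = expected_heterozygosity(freqs)
pd_value = power_of_discrimination(sample)
print(freqs, he, pd_value)
```

In this toy sample, allele 8 accounts for four of the eight allele copies, so its frequency is 0.5.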