Background

Africa is considered the origin of modern humans and is one of the most ethnically and genetically diverse regions of the world [1]. African populations were shaped by demographic forces that influence variation on a genome-wide scale, such as ancient migration events and fluctuations in population size, and by evolutionary forces that influence individual loci, such as natural selection and mutation [2]. Malaria, one of the most significant infectious diseases in the world, has had a great impact on human evolution [3]. Selection pressure exerted by this pathogen on the human genome dates back at least 10,000 years. It is generally recognized that malaria was one of the factors that increased the population frequency of atypical haemoglobin variants, such as haemoglobin S (HbS), C (HbC) or E (HbE), and shaped variability in the Duffy blood group system.

The Duffy antigen is located on the surface of red blood cells. It is encoded by the Duffy antigen receptor for chemokines (DARC) gene, located on the long arm of chromosome 1 (1q22-1q23) [4]. The protein encoded by this gene is a glycosylated membrane protein that acts as a non-specific receptor for several chemokines and can be exploited by the human malarial parasites Plasmodium vivax and Plasmodium knowlesi to invade erythrocytes.

Genetic polymorphisms that affect the expression of the Duffy antigen have been identified in humans. The antigen is present in two major allelic forms that differ in the amino acid at position 42 of the coding sequence (G125A), i.e. glycine in FY*A and aspartic acid in FY*B [5-9]. A point mutation at position T-33 C (previously described as T-46 C) in the GATA box is responsible for the FY*AES (ES = erythrocyte silent) and FY*BES (common African allele) alleles [3, 10-16]. These four alleles, FY*A, FY*B, FY*AES and FY*BES, give rise to a total of ten genotypes: FY*A/FY*A, FY*A/FY*B, FY*A/FY*AES, FY*A/FY*BES, FY*B/FY*B, FY*B/FY*AES, FY*B/FY*BES, FY*AES/FY*AES, FY*AES/FY*BES and FY*BES/FY*BES [17, 18]. The last three of these genotypes produce the Fy(a-, b-) phenotype, which confers resistance to malaria. This phenotype is predominant among African people, especially those originating in West Africa, and is rare among non-African populations [11]. The lack of DARC receptors among almost all individuals of West African origin has been suggested to result from natural selection exerted by P. vivax [19]. Moreover, a less common allele, FY*X, associated with weak expression of the Fyb antigen, has also been identified [8, 10, 20].
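The count of ten genotypes follows from taking every unordered pair of the four alleles, homozygotes included: 4 x 5 / 2 = 10. The short Python sketch below is only an illustration and not part of the study's analysis; the function names are hypothetical. It enumerates the pairs and flags those expected to be Fy(a-, b-), assuming, as stated above, that this phenotype arises only when both alleles are erythrocyte silent.

```python
from itertools import combinations_with_replacement

# The four Duffy alleles named in the text.
ALLELES = ["FY*A", "FY*B", "FY*AES", "FY*BES"]
# Erythrocyte-silent alleles (no antigen expressed on red blood cells).
SILENT = {"FY*AES", "FY*BES"}


def enumerate_genotypes(alleles):
    """Return every unordered pair of alleles, homozygotes included."""
    return [f"{a}/{b}" for a, b in combinations_with_replacement(alleles, 2)]


def is_duffy_negative(genotype):
    """Fy(a-, b-) is expected only when both alleles are erythrocyte silent."""
    return all(allele in SILENT for allele in genotype.split("/"))


genotypes = enumerate_genotypes(ALLELES)
duffy_negative = [g for g in genotypes if is_duffy_negative(g)]

print(len(genotypes))
print(duffy_negative)
```

The enumeration order matches the genotype list given in the text, so the three Duffy-negative genotypes appear last.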

Owing to their characteristic distribution, the DARC alleles FY*AES and FY*BES have been used to assess admixture and the ethnic background of populations such as Saudi Arabs [21], Israeli Jews [22, 23], inhabitants of Grande Comore (Njazidja) [24] and some populations of African descent [25]. Characterizing the pattern of genetic variation among ethnically diverse African populations is critical for reconstructing human evolutionary history, clarifying the population history of Africans and African-Americans, and determining the proper design and interpretation of genetic disease association studies [26, 27].

The genetic variation of many populations across Africa remains enigmatic [28]. Some rare and unique African tribes are not well characterized genetically. The current demographic and political situation in Africa makes any anthropological research conducted in this area very valuable. Until recently, the Shagia tribe inhabited the area from Kurti to the Fourth Cataract. This tribe led an agricultural lifestyle and was regarded as morally and culturally governed by the Sudanese Arabs. The village populations were small (40-60 persons per village), and the villages were located at small distances from each other in areas that are difficult to access. Residents were isolated geographically and culturally, which led to genetic isolation. Recently, the natural residence of this tribe changed because of the construction of dams at Merowe, at the level of the Fourth Cataract of the Nile river (Sudan), which required massive displacement and dispersion of the Shagia tribe. For that reason, the present study is unique and cannot be repeated. The second tribe examined, the Manasir, lives approximately 80 km from the described Shagia villages and represents a population with admixture from other populations. Approximately 50,000-78,000 people (including the ethnically related Shagia and Manasir tribes) have been displaced to settlements built in the Nubian Desert. Despite the undoubted benefits of the construction of new hydroelectric plants, such as universal access to electricity, there are far-reaching consequences for the environment and human health (e.g., the spread of diseases such as snail fever, i.e. schistosomiasis).

The purpose of this study was to analyse the genetic structure of the Shagia and Manasir tribes using the variability of single nucleotide polymorphisms (SNPs) in the Duffy gene, namely two SNPs typical of African populations, T-33 C [5, 29] and G125A [14, 30], along with two other SNPs, G298A [31-33] and C5411T.

The distinctiveness of the Shagia tribe is one of the reasons why the prevalence of polymorphisms in the FY (Duffy) gene is of particular interest. The results obtained in this study, based on biological samples collected before the displacement of the tribe, may help to determine the origin of this tribe and enrich the existing knowledge of population movements on the African continent.

Methods

Subjects

Polymorphisms were analysed in 217 individuals (101 females and 116 males; median age 30 years). Genetic material (buccal swabs) was collected from 126 individuals of the Shagia tribe and from 91 representatives of the Manasir tribe who consented to participate in the study. Both tribes inhabited the area of the Fourth Cataract of the Nile at the time of the study.

The authors state that their research conforms to the Helsinki Declaration and to local legislation. Informed consent from participants was a prerequisite. Institutional ethical clearance was obtained from the Ethics Committee of Pomeranian Medical University.

Genotyping

Buccal swabs for subsequent DNA extraction were collected from all individuals who consented to participate in the study. The collected swabs were carefully dried in a separate area to avoid contamination. DNA was extracted with a BuccalAmp DNA Extraction Kit (Epicentre, Madison, USA). Oligonucleotide primers and TaqMan probes for the Duffy polymorphisms (i.e. rs2814778, rs12075, rs13962 and rs863002) were designed and synthesized by Applied Biosystems (ID: C__15769614_10, C__2493442_10, C__2493443_10 and C__7480790_10, respectively). The fluorescence data were analysed with the allelic discrimination module of 7500 Software, v.2.0.2.

Statistical analysis

Statistical analyses were performed with the StatView® program using Pearson's chi-squared test and Fisher's exact test. Haplotype analysis was performed with the haplo.stats package [34] in R, version 2.10.1. Haplotypes and the linkage disequilibrium (LD) block of the polymorphisms were validated using Haploview, version 4.02. P < 0.05 was considered statistically significant.
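For readers unfamiliar with these tests, the following Python sketch shows how a Pearson chi-squared statistic and an odds ratio with a 95% confidence interval can be computed for a 2x2 table of counts. It is a minimal illustration under stated assumptions and not the StatView procedure used in the study: the counts are hypothetical, no continuity correction is applied, the confidence interval uses the Woolf (log-scale) approximation, and Fisher's exact test is not shown.

```python
import math


def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-squared statistic (no continuity correction) and p-value
    for the 2x2 table [[a, b], [c, d]], which has 1 degree of freedom."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of the chi-squared distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with a Woolf (log-scale) confidence interval.
    Assumes all four cells are non-zero."""
    odds_ratio = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# HYPOTHETICAL counts for illustration only (not the study data):
# rows = two groups, columns = two alleles.
table = (40, 60, 25, 75)
chi2, p = pearson_chi2_2x2(*table)
odds_ratio, ci_low, ci_high = odds_ratio_ci(*table)
print(f"chi2 = {chi2:.3f}, p = {p:.4f}")
print(f"OR = {odds_ratio:.2f} [{ci_low:.2f}-{ci_high:.2f}]")
```

A confidence interval that excludes 1 corresponds to a difference that is significant at the chosen level.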

Results

The identification of genes encoding blood group antigens, such as DARC, has been used for many years to assess population admixture and ethnic background. Duffy polymorphisms are characteristic of African populations; therefore, identification of the FY*BES variant in non-African populations is likely to be the result of admixture with the African-American genetic background [6, 11, 35]. The genotype and allele frequency distributions recorded for the four analysed SNPs in the two population groups, the Shagia and Manasir tribes, are presented in Table 1. Genotype distributions were consistent with Hardy-Weinberg expectations for the T-33 C and G125A SNPs only (Table 1).

Table 1 Allele and genotype association results for Duffy in the Shagia and Manasir
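Consistency with Hardy-Weinberg expectations, as reported for Table 1, is commonly checked by comparing observed genotype counts with those expected from the allele frequencies. The Python sketch below illustrates one such check for a biallelic SNP using a chi-squared goodness-of-fit test with 1 degree of freedom. It is an assumption about how such a test can be done, not a description of the software routine used in the study, and the genotype counts are hypothetical.

```python
import math


def hardy_weinberg_test(n_hom1, n_het, n_hom2):
    """Chi-squared goodness-of-fit test of Hardy-Weinberg proportions for a
    biallelic SNP. Assumes both alleles are present in the sample."""
    n = n_hom1 + n_het + n_hom2
    p = (2 * n_hom1 + n_het) / (2 * n)  # frequency of the first allele
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_hom1, n_het, n_hom2)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-squared distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# HYPOTHETICAL genotype counts for illustration only (not the study data).
chi2_fit, p_fit = hardy_weinberg_test(64, 32, 4)    # matches expectations
chi2_dev, p_dev = hardy_weinberg_test(70, 20, 10)   # heterozygote deficit
print(f"fit:       chi2 = {chi2_fit:.4f}, p = {p_fit:.4f}")
print(f"deviation: chi2 = {chi2_dev:.4f}, p = {p_dev:.6f}")
```

With small expected counts an exact test is usually preferred over the chi-squared approximation.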

Pair-wise linkage disequilibrium (LD) between the T-33 C and G125A polymorphisms was calculated using the normalized disequilibrium coefficient (D') and was equal to 0.51 (Corr = 0.31, p = 5e-10).
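As a reminder of how D' and the correlation coefficient are defined, the Python sketch below computes both from one haplotype frequency and the two allele frequencies. It uses the standard textbook definitions and is not the Haploview implementation. The input values are the HapMap ASW reference haplotype frequencies quoted later in this section (CA = 0.81, TA = 0.07, TG = 0.12, so CG is taken as 0), not the study data, which is why the result differs from the D' of 0.51 reported here.

```python
import math


def ld_measures(p_ab, p_a, p_b):
    """Return (|D'|, r) for two biallelic loci, given the frequency of
    haplotype AB and the frequencies of allele A and allele B."""
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r = d / math.sqrt(p_a * (1 - p_a) * p_b * (1 - p_b))
    return d_prime, r


# ASW reference haplotype frequencies quoted in the text (T-33 C, G125A).
hap = {"CA": 0.81, "TA": 0.07, "TG": 0.12, "CG": 0.0}
p_c = hap["CA"] + hap["CG"]   # frequency of -33 C
p_a = hap["CA"] + hap["TA"]   # frequency of 125 A
d_prime, r = ld_measures(hap["CA"], p_c, p_a)
print(f"D' = {d_prime:.2f}, r = {r:.2f}")
```

D' reaches 1 whenever at least one of the four possible haplotypes is absent, whereas r also depends on how similar the allele frequencies at the two loci are.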

The analysis of allelic and genotype frequencies for the T-33 C polymorphism demonstrated a significant dominance of the C allele and the CC genotype (OR = 0.53 [0.32-0.88]) in both tribes. Moreover, the frequency of the C allele was significantly higher in the Shagia than in the Manasir (p = 0.02). The prevalence of the CC genotype in the Shagia and the Manasir was 77% and 62.6%, respectively. These results are in agreement with previous reports showing that the -33 C allele was present in 79% of South African blacks [25] and 78.3% of African-American populations [30]. Allelic and genotype frequencies of the other analysed SNPs, i.e. G125A and C5411T, were similar in both tribes and align with data representative of Africans. The study showed that AA homozygotes are dominant for the G125A polymorphism among the members of the Shagia and Manasir tribes (88.9% and 86.8%, respectively). For the G298A polymorphism, the genotype frequencies differed between the tribes (p = 0.002). It is noteworthy that not a single AA homozygote was found in the examined groups, although the presence of the AA genotype in sub-Saharan Africa has been reported (0.99%). According to HapMap, GA heterozygotes predominate for the G298A polymorphism among European populations. On the other hand, GG homozygotes, which were found in both analysed tribes, are typical of Euro-Brazilians, Afro-Brazilians, Asian-Brazilians and Amerindians (Cayapo/Yanomama) [31].

Haplotype frequencies were determined on the basis of two SNPs: T-33 C and G125A (Table 2). Duffy-negative blood groups were identified in 83% of the Shagia and 77% of the Manasir, which is consistent with epidemiological data showing that 70 to 80% of people in north-eastern Sudan have the Duffy-negative blood group [36]. The frequencies of the obtained haplotypes (Table 2) were compared with SNP genotypic data from HapMap Data Rel 27, and similarity was observed between the examined groups and the population of African ancestry in the Southwest USA (ASW: CA = 0.81; TA = 0.07; TG = 0.12).

Table 2 Two-locus (T-33 C, G125A) haplotype analysis using HAPLO.SCORE, overall p = 0.08

In the present study, determination of the Duffy genotype and phenotype was based on the T-33 C (GATA polymorphism) and G125A SNPs (Table 3) [10-12, 15]. Twenty combinations of genotypes were examined for the Shagia and Manasir tribes (Table 3). There was a statistically significant difference in the frequency of one of the genotype combinations, i.e. CC/AA/GG/CT (FY*BES/FY*BES) (p = 0.001), which occurred in 45.87% of the Shagia tribe but in only 6.59% of the Manasir tribe.

Table 3 Frequency of T-33 C, G125A, G298A and C5411T genotype combinations

For individuals homozygous for the inactive GATA site and with the G125A genotypes FY*B/FY*B or FY*A/FY*B, the deduced genotypes were FY*BES/FY*BES and FY*AES/FY*BES, respectively, and the phenotype was Fy(a-, b-).

The genotype and phenotype were consistent with G125A when the individuals were homozygous for the wild-type GATA site. According to Woolley et al [37] and Yazdanbakhsh et al [10], the mutation in the GATA box is associated with the functional allele FY*B. However, Zimmerman et al [15] suggest that the T-33 C polymorphism affects Fy expression on the FY*A and FY*B alleles in a similar manner. Therefore, when an individual was heterozygous at the GATA site (GATA+/-) and had the genotype FY*A/FY*B (for G125A), both possible genotypes, i.e. FY*AES/FY*B or FY*A/FY*BES (phenotype Fy(a-, b+) or Fy(a+, b-), respectively), can be predicted. Similarly, for FY*A/FY*A homozygotes, the deduced genotype FY*A/FY*AES (phenotype Fy(a+, b-)) was assigned. Moreover, Zimmerman et al [15] imply that the FY*AES allele is a new allele, independent of African origin, as opposed to the FY*BES alleles that predominate in African and African-American populations. The FY*BES allele was also the most frequent allele identified in the Shagia, 51 (+/- 2.5), and in the Manasir, 32.5 (+/- 4).
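The deduction rules described above can be summarized in a short Python sketch. It is a simplified illustration and not the procedure used in the study: the function and variable names are hypothetical, the FY*X allele is ignored, and it assumes, following the text, that G at position 125 marks FY*A, that A marks FY*B, and that the -33 C variant silences whichever allele it sits on. Because unphased genotypes do not reveal which GATA variant lies on which chromosome, both possible phasings are listed.

```python
# (GATA allele, 125 allele) on one chromosome -> Duffy allele name.
ALLELE_NAME = {
    ("T", "G"): "FY*A",
    ("T", "A"): "FY*B",
    ("C", "G"): "FY*AES",
    ("C", "A"): "FY*BES",
}
ORDER = ["FY*A", "FY*B", "FY*AES", "FY*BES"]


def phenotype(alleles):
    """Fy(a+) requires an expressed FY*A allele, Fy(b+) an expressed FY*B."""
    a = "+" if "FY*A" in alleles else "-"
    b = "+" if "FY*B" in alleles else "-"
    return f"Fy(a{a}, b{b})"


def deduce_duffy(gata, g125a):
    """Return the possible (genotype, phenotype) pairs for an unphased
    T-33 C genotype ("TT", "TC", "CC") and G125A genotype ("GG", "GA", "AA")."""
    g1, g2 = gata
    c1, c2 = g125a
    results = []
    # Two ways of pairing the GATA alleles with the coding alleles.
    for phasing in (((g1, c1), (g2, c2)), ((g1, c2), (g2, c1))):
        alleles = sorted((ALLELE_NAME[h] for h in phasing), key=ORDER.index)
        entry = ("/".join(alleles), phenotype(alleles))
        if entry not in results:
            results.append(entry)
    return results


print(deduce_duffy("CC", "AA"))
print(deduce_duffy("CC", "GA"))
print(deduce_duffy("TC", "GA"))
```

Only the double heterozygote (TC with GA) yields two distinct candidate genotypes with different phenotypes; every other combination resolves to a single genotype.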

The most remarkable finding of the present study is the identification of the FY*AES allele in the analysed tribes, which is consistent with the anthropological data and may suggest free flow of genetic material in the analysed populations, i.e. admixture from a non-African population in some periods. The presence of individuals with the FY*A/FY*A genotype in the Shagia tribe, who were not found in the Manasir population (Table 3), indicates more numerous recombinations among the Shagia than among the Manasir people. Interestingly, admixture from a non-African population in this geographically isolated tribe was additionally confirmed by the authors in previous studies [38]. Furthermore, the genetic structure of the Shagia population differs significantly from that of the Manasir population, in which there was a free flow of genetic material, as the Manasir tribe inhabits the region of the Fourth Nile Cataract together with other populations such as the Beniamer, Ababda, Foraui and Kesinger tribes. It is worth noting that the FY*AES allele is not observed in more ethnically diverse populations, for example, in some tribes from Papua New Guinea (i.e. South Wosera) [15].

Indigenous Africans are characterized by high levels of genetic diversity within and between populations [26]. In addition, mtDNA evidence has suggested that southern Africa and the remainder of sub-Saharan Africa were regionally differentiated [39]. It would be interesting to continue the research using more powerful markers, such as the Y chromosome and/or mtDNA. Unfortunately, both the nature of the research conducted in this study and the small amount and unique character of the collected material make this impossible.

Conclusion

The present study demonstrated, to the authors' knowledge for the first time, the Duffy-negative genotype combination FY*BES/FY*AES. Moreover, the results suggest admixture of non-African genetic material in both tribes. In the case of the Manasir population, this is in accordance with the collected ethnographic data, while for the isolated Shagia tribe, which has not been described anthropologically or genetically so far, it is a useful clue for explaining the history of this population. The FY*AES allele, identified in both tribes, is not likely to be directly related to the exposure of the human genome to P. vivax but may rather be a consequence of admixture with a non-African genetic background.