Introduction

The dopamine receptor D4 (DRD4) gene has been of particular interest in recent studies both because it is highly variable in humans and because it seems to affect complex behaviors. Polymorphisms in this gene have been associated with behavioral traits such as novelty seeking and impulsivity (Ebstein et al. 1996; Okuyama et al. 2000), the desire for migration (Chen et al. 1999; Matthews and Butler 2011), and sensation seeking (C. J. Thomson et al. 2013), as well as with modern disorders such as ADHD (Bellgrove et al. 2005) and schizophrenia (Lai et al. 2010).

Certain polymorphisms in the DRD4 gene, such as the 7 repeat allele of a 48 bp VNTR in exon 3 and the C allele of the −521 C/T single nucleotide polymorphism (SNP) in the promoter region (rs1800955), have been proposed to reduce the sensitivity of the reward system (Hattori et al. 2009; Lichter et al. 1993; Munafò et al. 2008; Okuyama et al. 2000), which can be associated with a tendency toward novelty seeking behavior. Novelty seeking (NS) is a personality trait defined by a tendency towards exploratory behavior and intense excitement in response to novel stimuli, impulsive decision making, quick loss of temper, and avoidance of frustration. Research has suggested that in resource-deprived or constantly changing environments this can be a beneficial trait, allowing a person to find novel solutions to survive. On the other hand, in populations with abundant resources and highly structured social environments, people with this trait would exhibit restless behavior, and the same trait would become detrimental to success (Jensen et al. 1997; Williams and Taylor 2006).

A recent study (Eisenberg et al. 2008) showed evidence of this pattern in a modern population. In this study of the Ariaal people of Africa, the 7 repeat allele of the VNTR was found to correlate with better nutrition in nomads but worse nutrition in agriculturalists. Another study comparing indigenous hunter-gatherers and agriculturalists in South America found a positive correlation between foraging subsistence and high frequencies of the 7 repeat allele, suggesting that the subsistence mode has a selective impact on the polymorphism (Tovo-Rodrigues et al. 2010). Both studies were conducted on modern populations. It therefore has to be considered that the observed frequencies might have been biased by past admixture with other populations, or that the modes of subsistence and the grades of social complexity in these populations changed over time, meaning that, for example, populations classified as hunter-gatherers had a history of being horticulturalists, pastoralists, or agriculturalists and only more recently switched to a foraging lifestyle. Here we suggest another research strategy that allows us to test possible selective influences of socioeconomic changes on DRD4 polymorphisms over time: the analysis of ancient DNA from prehistoric South American populations that date to different time periods and are associated with different modes of subsistence and organization through the archaeological record. If NS has been an evolutionarily beneficial adaptation in the human past, and there is a genetic predisposition for this trait, we should find high frequencies of alleles associated with NS in prehistoric hunter-gatherers. If NS is really maladaptive in more complex social environments, we should also observe, as a result of selection, a decrease in the frequency of alleles contributing to this personality trait with increasing social complexity and structure and with the transition to producing economies.

Most studies on the DRD4 gene have focused on the 7 repeat allele of the 48 bp VNTR in exon 3. Since we aim to address our hypothesis using ancient DNA, issues of preservation would prevent us from observing a polymorphism involving so many base pairs. For this reason, this study looks at the −521 C/T SNP (rs1800955) instead. In addition, this SNP is potentially a more reliable marker for examining novelty seeking. Meta-analysis has shown that the C allele of this SNP could account for up to 3 % of the phenotypic variance of the novelty seeking personality trait, while no association with this trait was found for the VNTR (Munafò et al. 2008; Schinka et al. 2002). The promoter region SNP has been reported to be in linkage disequilibrium with the exon 3 polymorphism (Ekelund et al. 2001) and is associated with variation in expression of the D4 receptor, with the T allele associated with a reduction in transcription levels of up to 40 % compared with the C allele (Ronai et al. 2001).

The goal of this study is to determine whether or not the frequency of C alleles in the SNP −521 C/T decreased over time in native South American populations due to selection for the T allele, and if possible to determine which societal changes most likely had the largest impact on allele frequencies.

Materials and Methods

The samples for this study derive from 17 archaeological sites, mostly from the Central Andes, with the exception of the site Arroyo Seco 2 situated in the Argentinian Pampas (Fig. 1). All samples analyzed in this study have been part of previous studies investigating mitochondrial (mt) and nuclear genetic loci or are currently being analyzed in genome-wide sequencing projects (Fehren-Schmitz et al. 2010, 2014, 2015). The sites studied were chosen from the collections of the UCSC Human Paleogenomics Lab (UCSC_HPL) so that they represent a diversity of archaeological periods, associated subsistence strategies, and grades of socioeconomic complexity. The samples from 125 individuals selected from these sites are those with the best DNA preservation observed in previous investigations (endpoint PCR, qPCR, shotgun sequencing) and were therefore the most likely to also contain autosomal DNA. For comparison we collected genotype data from the 1000 Genomes Project Lima (PEL) population (Altshuler et al. 2012).

Fig. 1 Map showing the archaeological sites from which samples derived for this study

In addition to their chronological division, all populations analyzed have been assigned to basic categories with regard to their subsistence (hunting/foraging, horticulture, pastoralism, agriculture, and mixed economies), social organization following the simplified system of Service (1962; bands, tribes, chiefdoms, states), mobility (nomadic, semi-sedentary, sedentary), and altitude (lowlands/coast, highlands above 2500 m.a.s.l.). For a list of all populations/samples studied and their assigned categories refer to Table 1 and Supplementary Table 1. Assignments to categories were based on the available archaeological literature for the sites (e.g. Grieder 1988; Reindel 2009; Pucciarelli et al. 2010). It has to be acknowledged that these categorizations cannot reflect the individual complexity of the archaeological groups included in this study and that transitions between the assigned categories can be more or less fluid, but as a model for the questions addressed in this study the chosen systematics are the only ones feasible.

Table 1 List of sites tested showing archaeological dates and categorizations with regards to social organization, subsistence, and altitude

DNA was extracted from the tooth samples employing the protocol described in Fehren-Schmitz et al. (2014, 2015). At least two independent extracts were made for each sample to allow authentication of the results by comparison. All analyses were carried out in laboratories entirely dedicated to ancient DNA analysis at UCSC (UCSC_HPL) and the Department of Anthropology Goettingen (GoA); no modern DNA-based studies have been performed in these laboratories. All analyses were carried out according to the strict precautions and contamination prevention strategies necessary for ancient DNA analysis, described in detail in Fehren-Schmitz et al. (2010, 2015).

To determine the genotypes of the −521 C/T (rs1800955) single nucleotide polymorphism in the dopamine D4 receptor gene (DRD4) we developed a PCR amplifying an 87 bp fragment spanning the SNP of interest, coupled with a Single Base Extension (SBE) assay to type the SNP directly. Primers for the main PCR and the SBE PCR were designed using the Primer Select software (Lasergene 8.0 package, DNAstar). The minisequencing/SBE primers were designed to end one base adjacent to the polymorphic site of interest, in reverse orientation. An additional poly-A tail was added to the 5' end in order to ensure an effective separation of product lengths during electrophoresis (Nelson et al. 2007). Primer sequences can be found in Supplementary Table 2. The main PCR amplification was carried out in a total volume of 12.5 µl containing 6.25 µl Qiagen Multiplex PCR Master Mix (Qiagen, Hilden, Germany), 0.2 µM of each primer, 2–5 µl of DNA template, and DNase-free RT-PCR grade H2O (Ambion, Austin, TX) to complete the final reaction volume. PCR amplification took place in a C1000 Touch Thermal Cycler (Biorad, Hercules, CA) under the following conditions: initialization at 95 °C for 5 min; 40 cycles at 94 °C for 1 min, 65 °C for 1 min, and 72 °C for 1 min; final elongation at 60 °C for 30 min. PCR success and PCR product quantity were checked by gel electrophoresis on 2.5 % agarose gels.

To remove excess primers and dNTPs from the previous reaction, 7.5 µl of the main PCR product were first incubated with 2.5 units (U) rAPid Alkaline Phosphatase (1 U/µl, Roche, Mannheim, Germany) and 1 U Exo I (20 U/µl, New England Biolabs) in a thermocycler for 1 h at 37 °C, followed by 15 min at 75 °C for heat inactivation. Afterwards, 1 µl of the purified PCR product was used for the SBE reaction in a final reaction volume of 5 µl consisting of the template, 2.5 µl SNaPshot Ready Reaction Mix (Applied Biosystems, Carlsbad, CA, USA), 0.1 µM SBE primer, and 20 µM (NH4)2SO4 (Merck, Mannheim, Germany). The ammonium sulfate was added to suppress nonspecific peaks in the primer extension reaction (Doi et al. 2004). Subsequently, the reaction mixture was treated with 2.5 U of rAPid Alkaline Phosphatase (Roche) at 37 °C for 1 h and 75 °C for 15 min to remove the unincorporated fluorescent ddNTPs remaining in the mixture. The primer extension reaction products were separated and detected by capillary electrophoresis using POP4 polymer on an ABI PRISM 310 Genetic Analyzer (Applied Biosystems). The data were analyzed using GeneScan Analysis Software Version 3.1.2 (Applied Biosystems), and DRD4 −521 C/T genotypes were determined by confirming the base substitution at the SNP site. Supplementary Table 1 shows the base substitutions for the SNP for each sample successfully tested.

To determine the mitochondrial haplogroups of individuals not tested before, we employed a multiplex SBE assay that types 26 SNPs in the mitochondrial genome in parallel to determine common Native American haplogroups. Protocols and conditions for the AmericaPlex26 were followed as described in Coutinho et al. (2014). Analysis of mtDNA was performed to ensure the Native American ancestry of the individuals.

Each PCR was conducted at least two times from each of the two independent DNA extracts. Additionally, 70 samples were also tested in a second ancient DNA lab at the University of Goettingen (GoA) employing the same protocols as described above. If results were inconsistent and there was a risk of false allele determination due to allelic dropout, experiments were either repeated until a consensus was found or the sample was discarded. Negative extraction controls and negative PCR controls were employed in this study.

Allele frequencies for the DRD4 SNP were obtained directly by gene counting. Hardy–Weinberg equilibrium was assessed by the exact test of Guo and Thompson (1992), using the ARLEQUIN v. 3.5 software.
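The two steps described above can be illustrated with a short sketch. It computes the T allele frequency by gene counting and an exact Hardy–Weinberg P value by enumerating all possible heterozygote counts for the observed allele counts. This full enumeration is a substitute for illustration only; the Guo and Thompson test as implemented in ARLEQUIN uses a Markov chain procedure. The function names and the genotype counts in the example are hypothetical and are not taken from the study data.

```python
from fractions import Fraction
from math import factorial


def allele_freq_t(n_cc, n_ct, n_tt):
    """T allele frequency by gene counting: each TT carries two T, each CT one."""
    n = n_cc + n_ct + n_tt
    return (2 * n_tt + n_ct) / (2 * n)


def _config_weight(n_aa, n_ab, n_bb):
    """Weight proportional to the HWE probability of one genotype configuration."""
    n = n_aa + n_ab + n_bb
    return Fraction(
        2 ** n_ab * factorial(n),
        factorial(n_aa) * factorial(n_ab) * factorial(n_bb),
    )


def hwe_exact_p(n_cc, n_ct, n_tt):
    """Exact HWE P value: total probability of all heterozygote counts that are
    no more probable than the observed one, given the observed allele counts."""
    n_c = 2 * n_cc + n_ct  # number of C alleles
    n_t = 2 * n_tt + n_ct  # number of T alleles
    observed = _config_weight(n_cc, n_ct, n_tt)
    total = Fraction(0)
    tail = Fraction(0)
    # Heterozygote counts must share the parity of the allele counts.
    for het in range(n_c % 2, min(n_c, n_t) + 1, 2):
        w = _config_weight((n_c - het) // 2, het, (n_t - het) // 2)
        total += w
        if w <= observed:
            tail += w
    return float(tail / total)


if __name__ == "__main__":
    # Hypothetical genotype counts (CC, CT, TT), for illustration only.
    counts = (6, 14, 12)
    print("T allele frequency:", allele_freq_t(*counts))
    print("HWE exact P value:", hwe_exact_p(*counts))
```

Exact rational arithmetic is used so that the comparison of configuration probabilities is not affected by rounding; for sample sizes in the range of this study the enumeration is fast.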

To evaluate whether the changes in the DRD4 rs1800955 T allele frequency observed in the time transect can be explained by genetic drift, or whether natural selection needs to be considered, we used a forward simulation approach based on Wilde et al. (2014) that also allows the estimation of the selection coefficient (s). To reflect uncertainty in the ancient allele frequencies due to low sample size, in each forward simulation we first drew the allele frequency estimate for the oldest population (Early/Middle Archaic) from a random Beta(n_p + 1, n_q + 1) distribution, where n_p and n_q are the observed counts of the ancestral and derived allele, respectively. We then obtained 10,000 MCMC samples by binomial sampling across generations for each combination of priors. We used a forward simulation for drift with selection using equation 3.5 from Maynard Smith (1998). The number of generations simulated was defined as the number of generations between the oldest population (Early/Middle Archaic) and the youngest population (Late Horizon/Modern Lima) tested (318 generations, assuming a generation time of 25 years). We tested 110 different combinations of priors. The variable priors tested were the effective population size (Ne) at the time of the ancient sample (Ne = 100–100,000) and the selection coefficient (s = 0.00–0.02). The upper limits of both priors were defined by the empirical observations made during the simulations. As there is no evidence for a dominant or recessive mode of action for the SNP, a codominant model of inheritance was applied (Thomson et al. 2014). Subsequently, the simulated distribution of allele frequencies at generation 0 was compared with the observed frequency using the equation 1 − 2 × |0.5 − P|, where P is the proportion of simulated modern allele frequencies that are greater than that observed, yielding a two-tailed empirical P value for the observed allele frequency changes for all prior combinations tested (Voight et al. 2005; Wilde et al. 2014).
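As a rough illustration of this procedure, the following sketch simulates drift with selection for one combination of priors and computes the two-tailed empirical P value 1 − 2 × |0.5 − P|. Several details are assumptions made for the sketch rather than taken from the study: the selection step uses a common textbook parametrization of codominant selection (relative fitnesses 1 + s, 1 + s/2, and 1 for TT, CT, and CC), which may differ from equation 3.5 of Maynard Smith (1998); the tracked allele is T; the binomial draw is a plain sum of Bernoulli trials; and all function names, allele counts, and the observed frequency in the example are hypothetical.

```python
import random


def select_step(p, s):
    """Expected T allele frequency after one generation of codominant selection.
    Assumed relative fitnesses: TT = 1 + s, CT = 1 + s/2, CC = 1."""
    return p * (1 + s * (1 + p) / 2) / (1 + s * p)


def binomial(n, p, rng):
    """Binomial draw as a sum of n Bernoulli trials (simple, not fast)."""
    return sum(rng.random() < p for _ in range(n))


def simulate_final_freqs(n_t, n_c, ne, s, generations, reps, rng):
    """Simulate `reps` trajectories and return the final T allele frequencies."""
    finals = []
    for _ in range(reps):
        # Starting frequency drawn from Beta(n_t + 1, n_c + 1) to reflect the
        # sampling uncertainty in the oldest population.
        p = rng.betavariate(n_t + 1, n_c + 1)
        for _ in range(generations):
            p = select_step(p, s)                     # selection
            p = binomial(2 * ne, p, rng) / (2 * ne)   # drift among 2Ne gene copies
        finals.append(p)
    return finals


def empirical_p(simulated, observed):
    """Two-tailed empirical P value: 1 - 2 * |0.5 - P|, where P is the
    proportion of simulated frequencies greater than the observed one."""
    prop_greater = sum(f > observed for f in simulated) / len(simulated)
    return 1 - 2 * abs(0.5 - prop_greater)


if __name__ == "__main__":
    rng = random.Random(1)
    # Hypothetical allele counts and observed modern frequency, small Ne and
    # few replicates so the example runs quickly.
    finals = simulate_final_freqs(n_t=8, n_c=12, ne=50, s=0.003,
                                  generations=318, reps=50, rng=rng)
    print("empirical P:", empirical_p(finals, observed=0.6))
```

With 318 generations at 25 years each, the simulated interval corresponds to 7,950 years. A full analysis along the lines described above would loop this over the grid of Ne and s values and use far more replicates; a vectorized binomial sampler would then be preferable to the simple one shown here.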

In an attempt to go beyond the detection of a potential selective impact on the variability of DRD4 −521, we also tested the association between the allele and genotype frequencies of the SNP and the ways of living of the populations, following the categorizations described in Supplementary Table 1. Allele distributions were compared using χ² tests or Fisher's exact test, depending on the sample size of the combinations tested. All tests were two-tailed, and the significance level was set at 0.05. The analyses were performed with the SPSS software, version 16 (SPSS, USA). Population genetic structure was investigated using analysis of molecular variance (AMOVA; Excoffier et al. 1992). Subsistence, socioeconomic complexity, mobility, and geography were considered as grouping approaches for subdivision using the ARLEQUIN v. 3.5 program (Excoffier and Lischer 2010).
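The allele-count comparisons described above can be sketched with the standard library alone. The example computes the uncorrected χ² statistic for a 2 × 2 table of allele counts, converts a χ² value to a P value for 1 or 2 degrees of freedom using closed-form expressions, and computes a two-sided Fisher's exact test. This is not the SPSS procedure used in the study, the function names are hypothetical, and the counts in the example are invented for illustration.

```python
import math


def chi2_2x2(table):
    """Pearson chi-square statistic for a 2 x 2 table (no continuity correction)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def chi2_pvalue(x2, df):
    """Upper-tail P value of a chi-square statistic for df = 1 or df = 2."""
    if df == 1:
        return math.erfc(math.sqrt(x2 / 2))
    if df == 2:
        return math.exp(-x2 / 2)
    raise ValueError("only df = 1 and df = 2 are handled in this sketch")


def fisher_exact_2x2(table):
    """Two-sided Fisher's exact test: total probability of all tables with the
    same margins that are no more probable than the observed table."""
    (a, b), (c, d) = table
    r1, r2, c1 = a + b, c + d, a + c

    def weight(x):
        # Hypergeometric numerator, kept as an integer for exact comparison.
        return math.comb(r1, x) * math.comb(r2, c1 - x)

    lo, hi = max(0, c1 - r2), min(r1, c1)
    observed = weight(a)
    tail = sum(weight(x) for x in range(lo, hi + 1) if weight(x) <= observed)
    return tail / math.comb(r1 + r2, c1)


if __name__ == "__main__":
    # Hypothetical allele counts: rows = two population groups, columns = C, T.
    counts = [[30, 50], [40, 38]]
    x2 = chi2_2x2(counts)
    print("chi-square:", x2, "P:", chi2_pvalue(x2, df=1))
    print("Fisher exact P:", fisher_exact_2x2(counts))
```

Fisher's exact test is the usual choice when expected cell counts are small, which is why the choice between the two tests depends on the sample size of the groups being compared.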

Results

A total of 79 of the 125 ancient samples tested were successfully genotyped for the DRD4 −521 C/T alleles. Only samples that reproducibly yielded the same genotype in at least four independent PCRs from two independent extracts were included in the downstream analyses, to exclude wrong genotype calls due to allelic dropout. The successfully analyzed samples are distributed across all archaeological sites and periods tested, with the later sites from the Andean highlands showing significantly better DNA preservation. The individual genotyping results are found in Supplementary Table 1. All individuals tested with the 26 mitochondrial SNP AmericaPlex assay belong to one of the four Native American mt-haplogroups A, B, C, and D (cf. Supplementary Table 1).

The successfully genotyped samples were grouped into categories by altitude, mobility, subsistence pattern, chronology, and social organization for analysis. Table 2 shows the genotype and allele frequencies for all the categories in the different group comparisons.

Table 2 Genotypic and allelic frequencies of the DRD4 −521 C/T polymorphism observed in the ancient South American samples grouped by chronology, altitude, mobility, subsistence, and social organization (see Supplementary Table 1 for individual genotypes)

We observed a slight increase in the DRD4 −521 T allele frequency from the oldest population, associated with a foraging lifestyle, to the youngest pre-Columbian populations (cf. Table 2). It is worth mentioning that the DRD4 SNP genotype and allele frequencies of the late pre-Columbian populations resemble those of the modern Peruvian population from Lima. Chi-square tests show that the difference in allele frequencies between the oldest and youngest group narrowly misses significance (df = 1, χ² = 2.24, p = 0.088), while the stepwise comparison of the four chronological groups (Early, Intermediate, Late, Modern) is marginally significant (df = 3, χ² = 7.82, p = 0.045). Since the accuracy of these standard tests probably suffers from the small sample size, we further tested whether the observed allele frequency changes can be explained by neutral genetic drift alone using a binomial forward simulation approach. While neutrality (selection coefficient s = 0) for the DRD4 SNP was not completely rejected assuming very low population sizes (Ne = 100–1000), scenarios with light selection found higher support under a codominant model (Fig. 2, Supplementary Table 3). The selection coefficient that best explained the observed frequency changes of the DRD4 −521 T allele between the Early/Middle Archaic populations and the Late Horizon/Modern population was s = 0.003 (Fig. 2).

Fig. 2 Heatmap illustrating the two-tailed empirical P values for the similarity between the observed allele frequencies in the Late Horizon/Modern Central Andean population and the distribution of simulated frequencies, given the priors (x = Ne; y = s) modeled between the oldest and youngest populations

To further explore whether the observed selection acting on the DRD4 −521 T allele can be explained by its association with reduced susceptibility to NS, and by the proposed negative association of the trait with increasing social complexity and an agricultural lifestyle, we compared the allele and genotype frequencies between the social and geographical categories.

For altitude, populations below 2500 m.a.s.l. were classed as low altitude and those above as high altitude. Higher frequencies of the C allele were observed at low altitude. A chi-square test showed that the difference in allele frequency between high and low altitude was not significant (df = 1, χ² = 2.39, p = 0.1221). Allele frequency comparison of the populations grouped by subsistence mode (mixed foraging/horticulture vs. agriculture) revealed no significant difference (df = 1, χ² = 0.9, p = 0.3428), nor did the grouping by mobility category (p = 0.09243). Social organization was differentiated into state and non-state (band, tribe, or chiefdom) individuals. Using a chi-square test, the difference in allele frequencies was found to be statistically significant (df = 1, χ² = 4.65, p = 0.0311), while the difference in genotype frequencies narrowly missed significance (df = 2, χ² = 5.02, p = 0.0813). The additional tests for genetic structure using AMOVA, however, found no significant population structure for this division (P = 0.115; Table 3), even though it exhibited the highest level of molecular variation among groups of all divisions tested (5.18 %). None of the other divisions tested by AMOVA exhibited significant amounts of among-group genetic variation (Table 3). Also, none of the groups showed significant deviation from Hardy–Weinberg equilibrium.

Table 3 AMOVA for the partitions of variance considering the six types of classification

Discussion

This study is the first to test, from a diachronic perspective using ancient DNA analysis, the hypothesis that polymorphisms associated with the novelty seeking personality trait are under selection due to changes in subsistence modes or social complexity. Using a forward simulation approach that tests genetic drift plus selection and accounts for bias due to low sample size, we find that the DRD4 −521 T allele has been under light selection (s = 0.003) in western South American populations, assuming a codominant model for the SNP. To explore the possible impact of other modes of inheritance on our results, we also tested dominant and recessive models for some prior combinations and found that neutrality is rejected for both, while the estimated selection coefficient differs slightly.

However, to infer natural selection based on temporal differences in allele frequency it is necessary to assume population continuity. Previous ancient DNA studies, including studies of the samples analyzed here, found clear evidence of population continuity in western South America, or the Central Andes, in the monitored time frame based on mitochondrial, autosomal, and Y-chromosomal markers (Kemp et al. 2009; Carnese et al. 2010; Fehren-Schmitz et al. 2010; Fehren-Schmitz et al. 2011; Fehren-Schmitz et al. 2014; Fehren-Schmitz et al. 2015). Studies of uniparentally inherited and genome-wide genetic markers link these ancient populations to the modern population of the Central Andes (Bisso-Machado et al. 2012; Lewis et al. 2007; Reich et al. 2012; Wang et al. 2007). These studies also suggest that geographic structure was established very early in South America, followed by very limited gene flow into the Central Andes until more recent admixture events with other continental populations occurred after European contact (Reich et al. 2012). To exclude a possible bias due to the inclusion of a geographically distant population in our population sets, we excluded the Arroyo Seco samples from the Early/Middle Archaic population and re-ran the simulations, which confirmed that selection acted on the DRD4 −521 T allele.

The 1000 Genomes Project and the ALFRED database demonstrate relatively low variability for the DRD4 SNP in the global population. According to the 1000 Genomes Project data, the populations from the Americas exhibit the highest mean frequency of the T allele (0.63) (Altshuler et al. 2012). At the level of individual populations, ALFRED reveals a high amount of diversity between Native American groups, with T allele frequencies ranging from ~0.62 to 0.95 (Osier et al. 2002). The Karitiana, an indigenous Amazonian group practicing a foraging lifestyle mixed with horticulture and having little contact with agricultural groups, exhibit one of the highest DRD4 −521 T allele frequencies. At first sight this might contradict the outcome of our study, as we would expect higher frequencies of the novelty seeking associated C allele in the Karitiana. However, Native South American populations show a high amount of genetic diversity not shared between geographic regions (Wang et al. 2007). The previously mentioned processes of early geographic structuring followed by isolation and drift, and the observed high inter-population differentiation between the Central Andes and other regions of South America, are suitable to explain this observation, especially when considering that the effective population size of the initial population wave dispersing into South America was very low (Hey 2005; Reich et al. 2012; Tamm et al. 2007; Wang et al. 2007). Additionally, it has to be considered that the population size of the Karitiana is very low and gene flow is very limited, increasing the impact of genetic drift on allele frequencies (Zietkiewicz et al. 1997).
The post-European population size decline had a massive impact on Amazonian groups, suggesting a bottleneck effect that might have influenced the patterns of genetic diversity we observe in contemporary indigenous populations even though they had very limited contact with the outside world in the post-Columbian era (Livi-Bacci 2006; O’Fallon and Fehren-Schmitz 2011). All these observations support our hypothesis that the relatively low DRD4 −521 T frequency observed in the early Central Andean populations might be explained by founder effects, and that subsequent selection led to an increase in the allele's frequency. This would also mean that selection for this marker was limited to specific social and environmental factors found in the research area.

The data we collected on altitude gave interesting insight into other potential sources of selective pressure. One study found an association between the T allele of the −521 C/T SNP and increased susceptibility to pre-eclampsia, an illness that would likely be fatal more often at high altitudes, where considerable cardiac stress already exists (Korobochka et al. 2006). If this association reduced fitness at altitude, we would expect lower T allele frequencies in the highlands. Although the difference was not significant, the much higher proportion of the T allele at higher altitudes (cf. Table 2) instead suggests that the SNP is not associated with differential fitness at high altitudes.

In contrast to the study by Tovo-Rodrigues et al. (2010), we find no significant genetic differences between hunter-gatherers and agriculturalists. Considering the exploration-exploitation trade-off, we would expect this comparison to be the most likely to show significant differences, since increased explorative behavior due to NS should be beneficial in settings with less resource security and greater susceptibility to environmental changes (Humphries et al. 2012; Yamaguchi et al. 2015). It has to be stressed that Tovo-Rodrigues et al. did not analyze the same genetic marker but looked at the 48 bp VNTR in exon 3 of the DRD4 gene. Thus, a direct comparison of the results is not possible. The low sample size of our study might also have biased the results of the comparison through a lack of sensitivity. Associations between climate and economy have recently been found for the Val158Met polymorphism in the COMT gene, suggesting that future studies trying to test this hypothesis should expand the number of genetic markers studied (Piffer 2013).

We find a marginally significantly lower frequency of the C allele, which is thought to contribute to the novelty seeking phenotype, in samples deriving from state-level societies (Wari, Inca, Modern) of the Central Andes than in all earlier populations, but no significant genetic substructure could be found. The assumed contribution of the tested SNP to the NS phenotype was determined to be about 3 % (Munafò et al. 2008). A recent study also found evidence for differential susceptibility towards behavioral phenotypes in individuals exhibiting specific DRD4 genotypes, highlighting the variability in individuals’ sensitivity, or responsiveness, to environmental influences, whether positive or negative (Sweitzer et al. 2013). These points, and the fact that selection acts on phenotypes and not genotypes, suggest that the selective impact detectable for the single SNP we analyzed, and the chance of finding significant differences between our rather general classifications, are very low. Nevertheless, we found evidence for selection acting on the DRD4 −521 T allele, and the comparisons we made suggest that differences in social structure and political organization might be the driving factors. This would make it necessary to assume that NS or risk-taking behavior is actually beneficial in human groups exhibiting lower grades of hierarchical structure and specialization. Studies of human and non-human primates have reported correlations between dopamine receptor availability (RA) and social affiliation/status. They demonstrate that increased RA is found in individuals of higher social dominance while lower RA is observed for individuals of lower social status or desire for social dominance (Cervenka et al. 2010; Martinez et al. 2010; Yamaguchi et al. 2015).
This to some extent agrees with the hypothesis that exploratory behavior and intense excitement in response to novel stimuli, impulsive decision making, quick loss of temper, and avoidance of frustration, all associated with the NS phenotype, are less beneficial in highly structured social environments with specialized social/economic roles when the individual does not belong to a socially privileged group or hold political power (Jensen et al. 1997; Williams and Taylor 2006). In contrast, individuals living in an egalitarian society, as is often attributed to foragers (Service 1962), should exhibit neither positive nor negative effects based on dopamine RA in association with social status, while increased exploratory behavior might constitute a beneficial trait for securing resources, as mentioned before. This discussion is rather speculative, as our study has the power to detect selection but cannot directly relate that observation to a specific phenotype. Other factors, such as the function of gender in the societal context, would have to be considered too. Taken together, our results neither verify nor reject the general hypothesis that it is the NS phenotype that is under selection due to specific socioeconomic factors and modes of subsistence in Central Andean populations. Other environmental and genetic factors that we could not test due to the limited sample size and geographical diversity in our study could be causal. In order to really understand the possible impact of socioeconomic and subsistence change on genes contributing to complex behavioral phenotypes, genome-wide studies of prehistoric and historic human populations involving a larger and more diverse sample are needed.