Cronos, a popular personification of time during the Late Roman Empire, was sometimes represented with a four-eyed head: two eyes in the front and two in the back, two looking to the future and two looking to the past (Cirlot, 1992). Why does it make sense to symbolize the past and future as being in front of and behind a person? Moreover, why do we intuitively assume that the front eyes look to the future and the back eyes look to the past?

Conceptual metaphor theory (CMT; Lakoff & Johnson, 1980) proposes that to understand abstract concepts, we borrow structure from other concepts that we have more direct experience with, and therefore understand better. The idea that such conceptual metaphors ground our cognition has become a central part of the theoretical apparatus of embodied approaches to the mind (Barsalou, 2008, 2010), sparking a research boom in linguistics (Grady, 2010) and cognitive and social psychology (Landau, Meier, & Keefer, 2010; Williams, Huang, & Bargh, 2009).

Understanding time is strongly related to our experience of space. As we move forward, we reach our destination in front of us at a later time, and leave behind our original location at a prior moment. These correlations in experience motivate a conceptual metaphor that turns time itself into a line and the passing of time into the motion of ego from one point on that line, the past, located behind the person, to another point in the future, located in front. In a figure-ground reversal of this conceptual mapping, we can also think of future events as frontally approaching ego and receding into the past behind ego (Boroditsky, 2000; Clark, 1973; Lakoff & Johnson, 1980). The universality of these experiences and their intuitive relevance to time led Lakoff and Johnson (1980, 1999) to suggest that the linear metaphor of time is a cognitive universal. Indeed, a majority of languages in the world use spatial terms in ways that are consistent with this metaphor (Haspelmath, 1997; Radden, 2004), as when an English speaker says “in the weeks ahead of us.”

Research, however, has made clear that time has a more complex relation with space in human thought. Firstly, experiences of motion may lead to alternative images of time. For example, in some absolute reference-frame languages, speakers map time along geographical axes: in Pormpuraawan languages, time flows from East to West (Boroditsky & Gaby, 2010), and for Yupno speakers, time flows upriver (Núñez et al., 2012). Moreover, all studied languages have the possibility of adopting a different perspective, one that does not depend on ego: If a sequence of temporal moments is conceived as if they were the wagons of a train, the initial (earlier) events are placed in the front, followed by the subsequent (later) events behind (Moore, 2006; Núñez, Motz, & Teuscher, 2006). In some languages, such as Chinese, nondeictic expressions of this kind occur often (Yu, 2012). Secondly, some languages map the past in front and the future behind because they give special importance to the acquisition of knowledge through vision. For example, in Aymara (Núñez & Sweetser, 2006) and Vietnamese (Sullivan & Bui, 2016), the past is in front and the future behind because past events can be “seen” whereas future events are uncertain and therefore cannot be seen clearly. Finally, there are languages and cultures that do not seem to use linear representations of time at all, such as the Yucatec Maya (Le Guen & Balam, 2012) or the Amondawa (Sinha, Sinha, Zinken, & Sampaio, 2011). Understanding time poses a profound challenge to the human mind and different cultures map temporal concepts in different ways onto different concrete experiences to deal with this problem (see Núñez & Cooperrider, 2013, for a review).

An even greater challenge for CMT was the finding that several conceptual mappings can coexist within the mind of an individual (Santiago, Román, & Ouellet, 2011). As mentioned above, Chinese speakers sometimes use a past-front/future-behind mapping, but more often they use either a vertical past-up/future-down mapping or a future-front/past-behind mapping (Boroditsky, Fuhrman, & McCormick, 2011; Gu, Zheng, & Swerts, 2019; Yu, 2012). Vietnamese speakers show both a past-front/future-behind mapping and a future-front/past-behind mapping in their language and gesture (Sullivan & Bui, 2016). Analyses of gesture and experimental tasks have shown that speakers of English can also use a past-front mapping when representing serially ordered sequences (Walker, Bergen, & Núñez, 2017). Moreover, literate speakers of all languages also map time along a line that runs in the same direction in which they read and write their language (Bergen & Chan Lau, 2012; Ouellet, Santiago, Israeli, & Gabay, 2010; Santiago, Lupiáñez, Pérez, & Funes, 2007; Tversky, Kugelmass, & Winter, 1991). Thus, having more than one, and often several, conceptual metaphors for time is not an exception. Santiago et al. (2011) proposed that the selection of the active mapping at any given time depends on a combination of attentional factors, task requirements, long-term entrenchment of habits, and coherence interactions within working memory.

De la Fuente, Santiago, Román, Dumitrache, and Casasanto (2014) discovered a past-in-front spatial mapping of time in an unexpected population: Moroccans. In Darija, the local Arabic dialect in Morocco, deictic (ego-centered) linguistic expressions map the future in front and the past behind. However, when Moroccans were presented with the diagram shown in Fig. 1 and asked to place a future and a past event in one of the two boxes, either in front or behind the character, Moroccans preferred to place the past event in the front box. A control group of Spanish speakers (another language that uses only future-front/past-behind metaphors in deictic expressions) showed the expected preference to place the future event in the front box.

Fig. 1. Figure used in the temporal diagram task.

Why would Moroccans prefer to place the past in front? As the explanation could be neither in their sensorimotor experiences nor in language, de la Fuente et al. (2014) suggested a cultural cause. They proposed the temporal focus hypothesis (TFH): When something is attended to, we usually orient eyes and body toward the object, which then comes to be placed in front of us. If the past is attended to more than the future, the past will tend to be conceptualized as in front. Cultures vary in their temporal values: the relative importance given to the past (tradition) versus the future (progress). Research based on the World Values Survey has found that the degree of traditionality is one of the two fundamental dimensions that explain differences among cultures (Inglehart & Baker, 2000). Kluckhohn and Strodtbeck (1961) isolated five basic types of problems to be solved by each society. One of them was whether the temporal focus should be on the present, the past, or the future. Through extended practice, cultures may instill attentional habits in their members, and these may affect how they respond in the temporal diagram task.

As predicted by the TFH, de la Fuente et al. (2014) showed that Moroccans were relatively more focused on the past than were Spaniards, and more frequently placed the past in front. They also tested a group of Spaniards that was expected to have a greater focus on the past: Spanish elders. Older Spaniards showed a level of past focus that was intermediate between young Spaniards and Moroccans. Correspondingly, they placed the past in front more often than the young Spaniards and less often than the Moroccans. The predicted differences held across cultural and subcultural groups as well as at the individual level: Participants who gave higher relative importance to past versus future values also tended to place the past in front.

The goal of the present study was to test the generality of the TFH. Does the temporal focus of individuals in cultures and subcultures that vary in many other respects reliably predict their preferred temporal spatialization? Can all cultures and subcultures be placed on a single line relating time spatialization and temporal focus in spite of stark differences in language, religion, history, and economic development? To answer this question, we first assessed temporal focus and time spatialization in seven cultural groups (N = 978) as part of a currently active research project across five countries: Spain, USA, Morocco, Turkey, and the Serb, Croat, and Bosniak parts of Bosnia-Herzegovina (B&H). Across these groups there are both differences and similarities in language, religion, history, and economic development. For example, both Moroccans and Turks share a common religion, but they differ in language, history, economics, and so on. Serbs, Croats, and Bosniaks in B&H share language, history, and socioeconomic status, but have very different cultural identities, strongly linked to religion: Serbs are Orthodox Christians, Croats are Catholic, and Bosniaks are Muslim (Sells, 2003). We widened this initial sample by adding the three groups assessed in the original report by de la Fuente et al. (2014; N = 220): young Spaniards and Moroccans, and older Spaniards.

A first set of analyses focused on the group level. We aggregated temporal focus and used it to predict the proportion of future-in-front responses in each group. The data showed a very clear linear relationship. The predictive ability of this initial group-level model was then tested by assessing how well it fitted a new set of 10 (sub)cultural groups from East Asia. The first eight Asian groups had been independently collected and already published by Dr. Heng Li and his collaborators. Li and Cao (2017; N = 563) compared pairwise six subcultural groups from China: students of history versus computer science; residents in a traditional neighborhood versus residents in modern apartments; and visitors to traditional art versus modern art exhibitions. These six groups shared culture, history, language, ethnicity, and religion, but varied in their interests, architectural context, and age. Li, Bui, and Cao (2018; N = 182) contrasted Vietnamese participants living in the North (Hanoi) versus living in the South (Ho Chi Minh City) of Vietnam. For historical reasons, Southern Vietnamese were expected to be more future focused, whereas Northern Vietnamese were expected to be more past focused. The two groups were matched in language, ethnicity, religion, age, and many other factors. A ninth group, consisting mostly of Chinese university students, was independently collected by Dr. Yan Gu before he joined our team (Gu et al., 2019; N = 58). Finally, as part of ongoing efforts to widen our cross-cultural database, we collected an additional Chinese group (N = 96).

East Asian participants pose a strong challenge to the regression model obtained from the initial sample, as they differ profoundly from the initial groups in linguistic, social, cultural, and other important cognitive dimensions, such as individualism-collectivism and analytic-holistic style (Nisbett, Peng, Choi, & Norenzayan, 2001; Varnum, Grossmann, Kitayama, & Nisbett, 2010). Moreover, there is evidence that both Chinese (Li, 2018; Yu, 2012) and Vietnamese people (Sullivan & Bui, 2016) show past-in-front mappings in language and gesture, while this has only been hinted at in the gestures, but not the language, of one group of the initial set (Moroccans; de la Fuente et al., 2014).

After testing the predictive power of the initial group-level regression model on these 10 new (sub)cultural groups, we used the whole data set (N = 2,097) to fit a logistic mixed model with random intercepts and slopes over groups. This model allowed us to assess how the linear relation observed at the group level arises from the individual choice of placing the future in front or behind. It also allowed a precise estimation of the percentage of variance in individual responses in the time spatialization task that can be accounted for by the individuals’ temporal focus, as well as the variance between and within groups that remains unexplained.

Finally, during the preparation of this paper, a new study on the TFH was published (Bylund, Gygax, Samuel, & Athanasopoulos, 2020). Two conditions in this study used comparable methods to the other conditions included in the present analyses: British and South African university students (N = 140). Both can be considered Western cultures, although South African culture is more traditionally oriented than British culture (Bylund et al., 2020), so we did not expect to observe strong differences with the (sub)cultural groups included in our initial model. To provide a final test of the generalizability of the TFH, we assessed how well they fit the predictions of the model. Thus, all published data sets collected using the methods of de la Fuente et al. (2014) were included in the present study.

Methods

All analyses were carried out using R (R Core Team, 2018). The present study is part of a project aimed at assessing time conceptualization across a wide range of cultures using various tasks. Here, we focus only on the question of whether it is possible to find a single linear relation between cultural temporal focus and time spatialization that fits all (sub)cultural groups despite their wide differences in cultural dimensions and socioeconomic development. Findings regarding related questions will be reported separately.

Participants and data analysis

The data came in three waves of data collection. In the first wave (obtained in 2015), we collected data in Granada, Spain (N = 96), Pittsburgh, PA, USA (N = 64), and Banja Luka, in the Serb part of B&H (N = 96). In the second wave (2016), we collected additional data in Granada, Spain (N = 96), Pittsburgh, PA, USA (N = 96), and Banja Luka, B&H (N = 93), and added new samples from Istanbul, Turkey (N = 96), Mostar, in the Croat part of B&H (N = 100), and Tuzla, in the Bosniak part of B&H (N = 99). We also added two samples from Morocco, one from Tetouan (N = 96) and another from Tangier (N = 46). There were 978 participants in total: 256 in the first wave (169 females and 85 males, two nonresponses) and 722 in the second (451 females, 268 males, one other, two nonresponses). All participants were university students, mostly in their early 20s (Mage = 21.6 years). All participants in the initial data set provided written informed consent for entry into the study according to the declaration of Helsinki principles. Names or any personal identification details were not collected. The study was approved by the Ethics Committees of the University of Granada, Koç University, and Duquesne University.

We defined our cultural groups according to the country of testing (Spain, Turkey, USA, and Morocco), with the exception of Bosnia-Herzegovina, where we distinguished three cultural groups: Banja Luka (Serbs), Mostar (Croats), and Tuzla (Bosniaks). We are aware that not all participants at each site necessarily belong to the majority cultural group. However, after carefully and qualitatively considering their demographic information about cultural identity, native language, country of birth, parents’ country of birth, and religion, we believe that most of the participants of each sample were members of the local majority culture. Thus, the cultural groups were relatively homogeneous by the operational definition of culture employed in the current study (raw data and scripts for descriptive analysis are available as Supplementary Material). We could have filtered out some participants to increase the homogeneity of the groups. However, the participants in prior published data sets were not filtered in this way, and we preferred to keep our data as comparable as possible to prior data. In fact, our approach runs against the present hypotheses by increasing the amount of random noise in the data (see also Inglehart & Baker, 2000, for data supporting the representativeness of country as a unit of cross-cultural analysis).

The sample size of each group was established at 96 before the beginning of data collection. This number resulted from doubling the minimum number (48) necessary for a full run of the counterbalancing of all the tasks that the participants were going to perform during the session (which included several tasks not described here, some of which had several versions). This number was greater than the sample sizes collected in the only study using the present tasks published at the time of starting the second wave of data collection (de la Fuente et al., 2014). The actual number of participants that could be tested at each site and wave varied from the standard, usually because fewer, but on some occasions slightly more, participants volunteered for the study. To this initial data set, we added the data from Experiment 4 in de la Fuente et al.’s (2014) study (which are publicly available at http://osf.io/uh3in). In this study, there were two groups of university students: 55 Spanish (Mage = 20.2 years) and 93 Moroccan (Mage = 28.6 years) from Granada and Tetouan, respectively. There were also 72 Spanish elders (Mage = 73.6 years). All in all, in this first phase of the study there were 1,198 participants.

We wanted to submit the regression model derived from this phase to the strongest possible test by using it to predict new samples from East Asian cultures, taking only a group-level approach in this first set of analyses. That is, we aggregated the individual temporal focus values to obtain a group-level index, and regressed the proportion of future-in-front responses in each group on the group-level indexes. If the group-level-only model, based on only 10 data points, is able to successfully fit the new set of (sub)cultural groups, this will provide strong evidence for the predictive capability of the linear model derived from the initial sample.

The first set of new groups came from Experiments 1, 2, and 3 in Li and Cao (2017). In their Experiment 1, there were 71 highly motivated Chinese graduate students of history or archaeology (Mage = 21.9 years) and 68 graduate students of computer science or electronic engineering (Mage = 22.5 years). In Experiment 2, Chinese participants who had resided for 10–15 years in either the traditional neighborhood of Hutong (N = 102, Mage = 31.8 years) or modern apartment buildings in Beijing (N = 107, mean age not reported) were interviewed at their homes. In Experiment 3, the participants were visitors to the Ancient China Bronze Art Exhibition in the National Museum of China (N = 112, Mage = 30.9 years) or visitors to the Modern Painting Exhibition in the Hive Center for Contemporary Art in Beijing (N = 103, Mage = 29.3 years), all of whom had spent at least half an hour at the exhibition and reported being highly interested in it. The second set of new groups came from Li et al. (2018), who tested 90 participants in Ho Chi Minh City, South Vietnam (Mage = 25.9 years), and 92 participants in Hanoi, North Vietnam (Mage = 23.8 years). All participants self-identified as ethnically Kinh and atheist, and the two groups were matched in education level and socioeconomic background. An additional Chinese group (N = 58) came from Experiment 3 of Gu et al. (2019). This experiment was designed to assess the effect of using spatial terms in the linguistic expressions that described a 3-D version of the temporal diagram task to the participants. We included only the data from valid participants in the control condition. This condition did not use any spatial terms and was the only condition directly comparable to all the other groups in the present study. Gu et al. (2019) reported a mean age of 29.99 years and mostly university-level education for the total sample of 206 participants in this experiment. Last, we collected a Chinese group in 2019 at Xuzhou (N = 96, Mage = 19.3 years, 96% atheists) as part of a currently ongoing third wave of data collection (this is the only group of the third wave that has completed data collection so far).

After these initial group-level analyses, we used the overall data set (including all 2,097 participants in 20 groups) to fit a mixed (multilevel) model. The TFH proposes that the balance of attention devoted to past and future by the individual is the relevant factor for predicting his or her spatialization of time. That is, the TFH proposes that the relation between temporal focus and time spatialization is a phenomenon that arises at the individual level, and not at the group (contextual) level. Moreover, a group-level-only regression model does not provide a realistic estimation of the percentage of variability that is accounted for by the predictors, as the model is computed on a reduced (aggregated) data set. To estimate variance components, we need to work from the individual responses. As those responses were binomial (a single response per participant, either future in front or future behind; see task description below), we computed a generalized mixed model assuming a binomial distribution with a logit link. The model predicts the probability of success (defined as future-in-front) as a function of the set of predictors, which in this case included only the temporal focus of the individual. As random factors we included both group intercepts and the slopes of temporal focus within each group. We will therefore decompose the total variability in the portions explainable by the fixed factor (individual temporal focus) and the random factors (intercepts across groups and slopes of the temporal focus effect within groups).
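For reference, the model described in this paragraph can be written out as below. The notation is ours and is only a sketch: $y_{ij}$ is the response of participant $i$ in group $j$ (1 = future in front), $\mathrm{TF}_{ij}$ is that participant's temporal focus, and the random intercepts and slopes are assumed to be jointly normal with an unstructured covariance matrix (an assumption on our part, not something the text states explicitly).

```latex
\begin{aligned}
y_{ij} &\sim \mathrm{Bernoulli}(p_{ij}),\\
\operatorname{logit}(p_{ij}) &= (\beta_0 + u_{0j}) + (\beta_1 + u_{1j})\,\mathrm{TF}_{ij},\\
(u_{0j},\, u_{1j})^{\top} &\sim \mathcal{N}(\mathbf{0},\, \boldsymbol{\Sigma}).
\end{aligned}
```

Here $\beta_0$ and $\beta_1$ are the fixed intercept and the fixed effect of temporal focus, while $u_{0j}$ and $u_{1j}$ are the group-specific deviations in intercept and slope.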

Finally, we included the two cultural groups in Bylund et al. (2020) that were comparable to the other groups in the present study: British (N = 70, Mage = 22.2 years) and South African (N = 70, Mage = 20.8 years) university students (both tested in English). We assessed how well these two groups were fitted by the initial model and recalculated the logistic mixed model and the overall percentage of variance accounted for by temporal focus using the complete data set (N = 2,237).
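As a bookkeeping aid, the sample sizes reported in this section can be tallied to confirm the totals at each stage. The sketch below is in Python for illustration; the variable names are ours, and every number is taken from the text above.

```python
# Sample sizes per site or group, as reported in the text.
wave1 = {"Granada": 96, "Pittsburgh": 64, "Banja Luka": 96}
wave2 = {
    "Granada": 96, "Pittsburgh": 96, "Banja Luka": 93, "Istanbul": 96,
    "Mostar": 100, "Tuzla": 99, "Tetouan": 96, "Tangier": 46,
}
de_la_fuente_2014 = {"young Spaniards": 55, "Moroccans": 93, "older Spaniards": 72}
li_cao_2017 = {
    "history": 71, "computer science": 68, "Hutong": 102,
    "modern apartments": 107, "bronze art": 112, "modern painting": 103,
}
li_bui_cao_2018 = {"Ho Chi Minh City": 90, "Hanoi": 92}
gu_2019 = {"control condition": 58}
wave3 = {"Xuzhou": 96}
bylund_2020 = {"British": 70, "South African": 70}


def total(*samples):
    """Sum the sample sizes of one or more data sets."""
    return sum(sum(s.values()) for s in samples)


initial = total(wave1, wave2)                                        # initial data set
first_phase = initial + total(de_la_fuente_2014)                     # initial group-level model
overall = first_phase + total(li_cao_2017, li_bui_cao_2018, gu_2019, wave3)
complete = overall + total(bylund_2020)                              # after adding Bylund et al.
```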

Materials

In what follows we describe the methods of the initial data set, which is published here for the first time. The methods of the selected conditions from de la Fuente et al. (2014), Li and Cao (2017), Li et al. (2018), Gu et al. (2019), and Bylund et al. (2020) are described in those papers, and they were fully comparable to the ones described here. The tasks were translated into the language of each sample. Back-translations confirmed translation equivalence between different language versions.

Temporal diagram task

The temporal diagram task was used to evaluate the location of the future and the past across the front-back spatial axis. It was conceived by Casasanto (2009) and adapted to the domain of time by de la Fuente et al. (2014). In this task, a simple schematic drawing is shown to the participant (see Fig. 1), while it is explained that “yesterday” the character depicted in the drawing went to “visit a friend who likes animals” and that “tomorrow he will go to visit another friend who likes plants.” Participants are then asked to place the initial letter of the word “animal” in the box that best represents past events and the initial letter of the word for “plant” in the box that best represents future events. Four versions of the task were created to counterbalance the order of mention and combination of animals and plants, and future and past. The task thus consists of a single binomial trial. Placing the future event in the front box was coded as 1, in the back box as 0.
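The counterbalancing and response coding can be illustrated with a short sketch. It is written in Python for illustration and reflects our reading of the design, namely that the four versions cross the order of mention with the pairing of animals and plants to past and future; the names `ORDERS`, `PAIRINGS`, and `code_response` are hypothetical.

```python
from itertools import product

# Assumed factors behind the four task versions (our interpretation of the text).
ORDERS = ("past_first", "future_first")          # which event is mentioned first
PAIRINGS = (
    {"past": "animal", "future": "plant"},       # e.g., yesterday: animals; tomorrow: plants
    {"past": "plant", "future": "animal"},
)

# Crossing the two factors yields the four counterbalanced versions.
versions = [{"order": o, "pairing": p} for o, p in product(ORDERS, PAIRINGS)]


def code_response(box_of_future_event):
    """Code the single trial: 1 if the future event is in the front box, 0 if in the back box."""
    if box_of_future_event not in ("front", "back"):
        raise ValueError("box must be 'front' or 'back'")
    return 1 if box_of_future_event == "front" else 0
```

Because each participant contributes exactly one coded trial, the group-level outcome is simply the mean of these 0/1 codes, that is, the proportion of future-in-front responses.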

Temporal Focus Questionnaire

The Temporal Focus Questionnaire created by de la Fuente et al. (2014) measures cultural temporal values: how much the participant agrees with past-related values (e.g., “Young people need to preserve the values of their parents and grandparents”) and future-related values (e.g., “Young people’s values and beliefs must be different from those of their elders”). We slightly adapted de la Fuente et al.’s Temporal Focus (TF) Questionnaire by dropping one item to have the same number of items in the past and future categories. The scale contains 20 items: 10 items refer to past-related values, and 10 items refer to future-related values. No item refers to a value that is explicitly religious in nature. Each item is followed by a Likert scale ranging from 1 (total disagreement) to 5 (total agreement). In the first wave, the items were presented in random order. In the second wave, they were presented in strict alternating order, as in de la Fuente et al. (2014). The third wave also used alternating order, except for two items that exchanged places due to experimenter error. The American version of the questionnaire in the first wave and the Turkish version in the second wave used 9-point scales. The responses to these versions were converted to the range 1–5. Past and future focus indexes were computed by averaging the ratings given to all the items in each category. Following de la Fuente et al. (2014), an overall TF Index was computed (TF Index = [mean of future focused items − mean of past focused items] / [mean of future focused items + mean of past focused items]). For each participant, the TF Index expressed the balance between agreement with past-related and future-related values on a scale from −1 (strong past focus) to +1 (strong future focus). In the initial data set, the TF Questionnaire was found to have a Cronbach’s alpha of 0.85 in the past scale and 0.63 in the future scale. Both values are within the range of acceptable values, taking into account that they come from a substantially large sample, although the future scale’s alpha is at the lower end of that range. Using a Vietnamese translation of the TF Questionnaire, Li et al. (2018) observed an alpha of 0.81–0.83 for the past scale and 0.81–0.82 for the future scale in their two groups of participants (de la Fuente et al., 2014; Gu et al., 2019; Li & Cao, 2017; and Bylund et al., 2020, did not report the observed alpha in their studies).
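The scoring of the questionnaire can be sketched as follows. The code is a Python illustration, not the scoring script used in the study; the function names are ours, and the conversion of 9-point ratings is assumed to be a linear rescaling, which the text does not specify.

```python
from statistics import mean


def rescale(rating, old_min=1, old_max=9, new_min=1, new_max=5):
    """Map a rating from the 9-point scale onto the 1-5 range (linear mapping assumed)."""
    return new_min + (rating - old_min) * (new_max - new_min) / (old_max - old_min)


def tf_index(past_ratings, future_ratings):
    """TF Index = (mean future - mean past) / (mean future + mean past)."""
    past = mean(past_ratings)
    future = mean(future_ratings)
    return (future - past) / (future + past)


# Hypothetical participant who agrees more with past-related than future-related items.
example = tf_index(past_ratings=[4] * 10, future_ratings=[2] * 10)  # negative: past focus
```

Note that with ratings bounded between 1 and 5, the most extreme attainable values of the index are ±2/3 (for example, (5 − 1) / (5 + 1)), which lie within the nominal −1 to +1 range.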

Procedure

The tasks were completed in the facilities of the corresponding universities for each sample, using pen and paper. Participants received a leaflet with a battery of the different tasks and questionnaires. The leaflet started with the instructions and the consent form. The participants then filled a demographic questionnaire, followed by one of the four different versions of the temporal diagram task. This was followed by several additional tasks (e.g., temporal distance, temporal depth, time discounting, religiosity) from different studies (to be reported elsewhere). The penultimate test was always the TF Questionnaire. The instructions emphasized that participants were not to turn the page until the exercise on that page had been completed, or to look ahead or back to other pages. This warning was repeated at the bottom of each page.

Results

Group-level analyses

The TF Index had an approximately normal distribution overall, but it departed from normality in several (sub)cultural groups. Therefore, we took medians as indexes of central tendency within each group. We submitted the proportion of future-in-front responses in each group to a regression analysis using the median of the TF Index per group as predictor (see Table S1 in the Supplementary Material). The first analysis included the cultural groups of the initial data set (Spain, Morocco, Turkey, USA, Banja Luka, Mostar, and Tuzla) plus the (sub)cultural groups of Experiment 4 from de la Fuente et al. (2014; young Spaniard, old Spaniard, Moroccan). Visual inspection of the medians suggested the presence of a strong linear relation, which was supported statistically (β = 1.25, 95% CI [0.93, 1.57], R² = .90, F(1, 8) = 80.69, p < .001). Figure 2 (top) shows the best fit line, 95% confidence intervals, and 95% prediction intervals. We then added the 10 (sub)cultural groups from Li and Cao (2017), Li et al. (2018), Gu et al. (2019), and the Chinese group from our third wave of data collection to the chart. The model showed an impressive predictive capacity: As shown in Fig. 2 (bottom), only two groups (Chinese History students and the Wave 3 Chinese group) fell outside the 95% prediction interval of the model (another one, Chinese visitors to the Modern Painting Exhibition, fell right on the limit; see also Fig. S2 in the Supplementary Material).
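The difference between the two kinds of interval can be made explicit with the standard formulas for simple linear regression. This is a textbook sketch in our own notation rather than the exact computation used for the figure: $x_0$ is a group's median TF Index, $\hat{y}_0$ the fitted proportion, $n$ the number of groups in the fit, and $s$ the residual standard error.

```latex
\begin{aligned}
\text{Confidence interval:}\quad & \hat{y}_0 \pm t_{1-\alpha/2,\;n-2}\; s\,\sqrt{\frac{1}{n} + \frac{(x_0-\bar{x})^2}{S_{xx}}},\\[4pt]
\text{Prediction interval:}\quad & \hat{y}_0 \pm t_{1-\alpha/2,\;n-2}\; s\,\sqrt{1 + \frac{1}{n} + \frac{(x_0-\bar{x})^2}{S_{xx}}},\\[4pt]
\text{where}\quad & S_{xx} = \sum_{i=1}^{n}(x_i-\bar{x})^2, \qquad s^2 = \frac{1}{n-2}\sum_{i=1}^{n}(y_i-\hat{y}_i)^2 .
\end{aligned}
```

The extra 1 under the square root is what makes the prediction interval wider: it accounts for the scatter of a new observation around the line, not only for uncertainty about the line itself. With $n = 10$ groups, the $t$ quantile has 8 degrees of freedom, matching the $F(1, 8)$ reported above.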

Fig. 2. Linear relation between median TF Index and the proportion of future-in-front responses. Top panel: Data from the 10 (sub)cultural groups of the initial data set and of de la Fuente et al. (2014). The solid line shows the best fitting linear model. The gray area shows the 95% confidence interval (the area where the mean prediction should fall 95% of the time). The dashed lines delimit the 95% prediction interval (the area where 95% of individual predictions should fall). Bottom panel: The same plot with the addition of the 10 (sub)cultural groups of Li and Cao (2017), Li et al. (2018), Gu et al. (2019), and the third wave Chinese group.

Individual-level analyses

The group-level analysis shows that the proportion of future-in-front responses in 10 Eastern (sub)cultural groups can be successfully predicted from their median TF by a linear model computed from 10 Western and Middle Eastern (sub)cultural groups. However, this analysis does not allow a realistic estimation of the percentage of variance that is accounted for by the model, as this is computed over group aggregates and not individual responses. To obtain such an estimate, a mixed (multilevel) model analysis is in order. Mixed models take into account variation between groups by assuming that group means are sampled randomly from a population of groups. They can also take into account variation within groups by assuming that the relation between predictors and response can adopt different slopes in different groups. As mixed models need to compute a variance parameter among groups, a minimum number of groups is needed. There is debate about what is a reasonable lower limit for the number of groups, with current recommendations going from five or six (Bolker, 2020) to 40–50 (Sommet & Morselli, 2017). The complete data set, with 20 groups, should provide reasonably stable estimates.

Analyses were carried out in R using the lme4 package (Bates, Mächler, Bolker, & Walker, 2015) by fitting generalized linear mixed models using the binomial family and a logit link. Following advice in Sommet and Morselli (2017), we started by grand-mean centering and scaling the TF predictor (see Fig. S1 in the Supplementary Material for a directly comparable group-level analysis). We then fitted a null model (a model containing only the overall intercept; df = 1, AIC = 2,828.7) and compared its goodness of fit with a model including random intercepts per group (df = 2, AIC = 2,688.9). The goodness of fit improved significantly, χ2(1) = 141.84, p < .001. A third model added random slopes of temporal focus within each group to the random term, which provided an additional increase in goodness of fit, df = 4, AIC = 2,492.3; χ2(1) = 200.53, p < .001. Finally, we compared this model to a model without any random term that included only TF as a fixed effect (df = 2, AIC = 2,621.2). The model with a random term provided a better fit, χ2(1) = 132.83, p < .001. Therefore, we kept both intercepts and slopes in the random term of the model and finally compared such a model with a model that added individual TF as a fixed effect (df = 5, AIC = 2,484.4). This improved fit significantly, χ2(1) = 9.92, p = .002. The final model revealed a substantial effect of TF on the probability of a future-in-front response (β = 0.79, OR = 2.19, 95% CI [1.40, 3.53]). Figure 3 shows probabilities estimated by the model and confidence intervals using only the individual TF as predictor and compares them to the observed group-level TF medians.
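To make the size of the fixed effect concrete, the sketch below converts the reported coefficient into an odds ratio and into predicted probabilities. It is a Python illustration only (the models themselves were fitted in R with lme4). Because the fixed intercept is not reported in the text, `HYPOTHETICAL_INTERCEPT` is an assumed placeholder, and the probabilities are therefore illustrative rather than model estimates.

```python
import math


def odds_ratio(beta):
    """Odds ratio for a one-unit (here, one-SD) increase in the scaled predictor."""
    return math.exp(beta)


def inv_logit(eta):
    """Convert a log-odds value into a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


BETA_TF = 0.79                 # fixed effect of scaled TF, as reported (rounded)
HYPOTHETICAL_INTERCEPT = 0.0   # assumption: the fixed intercept is not given in the text

or_per_sd = odds_ratio(BETA_TF)

# Reported 95% CI of the odds ratio, expressed on the log-odds scale.
ci_log_odds = (math.log(1.40), math.log(3.53))

# Illustrative probabilities of a future-in-front response under the assumed intercept.
p_at_mean = inv_logit(HYPOTHETICAL_INTERCEPT)
p_one_sd_above = inv_logit(HYPOTHETICAL_INTERCEPT + BETA_TF)
```

Exponentiating the rounded coefficient gives an odds ratio of about 2.20; the small difference from the reported 2.19 presumably reflects rounding of β. In words, a one-standard-deviation increase in TF roughly doubles the odds of placing the future in front.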

Fig. 3

Estimated probability of a future-in-front response for each participant in the whole data set as predicted by the individual TF only (without the contribution of the random term including random intercepts per group and random slopes of TF over groups). Observed TF medians per group are added. Unconditional confidence intervals were obtained using the ciTools package (Haman & Avery, 2019), under a parametric approach.

To quantify the proportion of variance in time spatialization accounted for by TF, we followed the delta approach developed by Nakagawa (Nakagawa, Johnson, & Schielzeth, 2017; Nakagawa & Schielzeth, 2013) for generalized mixed models and implemented in the MuMIn package (Bartoń, 2019). The full model (including both the fixed and random factors) explained 27% of total variance in individual binomial responses. The fixed effect of TF alone explained 11%, and therefore the random term (including intercepts and slopes) explained the remaining 16%.
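
For reference, the two quantities involved can be sketched as follows. The symbols are ours and serve only as an illustration of the general approach: $\sigma^2_f$ is the variance of the fixed-effect predictions, $\sigma^2_r$ the variance attributable to the random term, and $\sigma^2_\varepsilon$ the observation-level variance, which the delta method approximates for a binomial model with a logit link.

```latex
R^2_{\text{marginal}} = \frac{\sigma^2_f}{\sigma^2_f + \sigma^2_r + \sigma^2_\varepsilon},
\qquad
R^2_{\text{conditional}} = \frac{\sigma^2_f + \sigma^2_r}{\sigma^2_f + \sigma^2_r + \sigma^2_\varepsilon}
```

Under this reading, the reported values correspond to $R^2_{\text{marginal}} = .11$ and $R^2_{\text{conditional}} = .27$, and the share of the random term is their difference, $.27 - .11 = .16$.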

Addition of data from Bylund et al. (2020)

As a final test of the generalizability of the model derived from the initial data set plus de la Fuente et al.’s (2014) data, we included data from the British and South African groups in Experiment 1 of Bylund et al. (2020). Figure 4 adds these two groups to the contents of the bottom panel of Fig. 2 (see also Table S1 in the Supplementary Material for the full set of group-level data). As seen in Fig. 4, the British data are well within the prediction interval of the model, but the South African group falls clearly outside those limits and is very different from any other (sub)cultural group.

Fig. 4

Bottom panel of Fig. 2 with the addition of the British and South African groups in Bylund et al. (2020, Experiment 1).

We then added Bylund et al.’s (2020) data to the data set, scaled and centered the TF Index, and recalculated the individual-level mixed-model analysis using all available data. TF still had a clear and significant effect on the probability of producing a future-in-front response in the temporal diagram task (β = 0.69, OR = 2.00, 95% CI [1.31, 3.11]). The proportion of total variance explained by TF decreased to 8.4%, and the variance due to the random term increased to 17%.
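
Because the models use a logit link, each coefficient β is a change in log-odds per standard deviation of the scaled TF predictor, and the odds ratio is its exponential. The short Python sketch below illustrates the conversion using the rounded coefficients reported in the text; the function name is ours, and small discrepancies with the reported odds ratios are expected because the published β values are rounded to two decimals.

```python
import math


def odds_ratio(beta):
    """Convert a log-odds coefficient into an odds ratio."""
    return math.exp(beta)


or_initial = odds_ratio(0.79)   # about 2.20 (reported OR = 2.19)
or_extended = odds_ratio(0.69)  # about 1.99 (reported OR = 2.00)

print(f"initial data set: OR = {or_initial:.2f}")
print(f"with Bylund et al. (2020): OR = {or_extended:.2f}")
```

In both data sets, an increase of one standard deviation in individual TF roughly doubles the odds of a future-in-front response.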

General discussion

In the present study, we modeled the relation between temporal focus and time spatialization in a wide sample of cultural and subcultural groups that includes all of the studies that conducted replications of de la Fuente et al.’s (2014) study, using close adaptations of the original methods. Temporal focus as defined here is the balance of importance given to past (tradition) versus future (progress) values in a particular group, and it is measured through the Temporal Focus Questionnaire developed by de la Fuente et al. (2014). Time spatialization refers here specifically to the location of the past and the future along the front-back axis (in front or behind the person), measured with the temporal diagram task also proposed by de la Fuente et al. (2014). The selected groups varied widely in many ways (e.g., language, age, religion, socioeconomic development, ethnicity). Does a single line describe the relation between temporal focus and time spatialization across cultural and subcultural groups in spite of their stark differences? The temporal focus hypothesis (de la Fuente et al., 2014) proposes that such a relation arises at the level of the individual, and therefore should generalize across contextual factors.

The first analysis was computed over group-aggregated indexes of temporal focus in 10 Western and Middle Eastern (sub)cultural groups (N = 1,199) and then used to predict 10 independently collected East Asian groups (N = 899). All new groups but two fell within the 95% prediction interval of the initial model (see Fig. 2, bottom), a surprisingly high success rate given the small number of groups that led to the initial model and the extent of the differences both among them and with the groups used for testing. We then used the whole data set to fit a generalized linear mixed model at the individual level. This model showed that the linear relation observed at the group level arises from the individual-level relation between temporal focus and the probability of choosing to place the future in front. Specifically, individual temporal focus explained 11% of the total variance in the choice of location for future and past. Random variation in intercepts and slopes of temporal focus over groups explained an additional 16% of total variance. In a final analysis, we included data from a study published during the preparation of this article that assessed two additional Western cultures: Great Britain and South Africa. British data were again within the prediction interval of the initial model, but South African data were not. After including these two groups in the individual-level analysis, temporal focus remained a significant predictor of future-in-front responses, although the variance it accounted for decreased to 8.4% and random term variance increased to 17%.

The random term in the model includes variance that reflects systematic differences across and within groups. The fact that this term made a significant contribution to the overall mixed model suggests that there remains an important degree of heterogeneity both over group means (intercepts) and slopes that calls for the identification of moderating factors. Candidate moderators are any of the myriad variables that differ across cultural and subcultural groups (age, language, religion, etc.). It is possible that many of those variables do not have strong effects, and only their combined influence explains a significant part of random term variance. However, the search might reveal moderators of strong influence and wide applicability across cultural groups.

Do the current data offer any hint of such wide moderators? In other words, what is the expected range of applicability of the temporal focus hypothesis across cultures? One advantage of the present model is that it provides clear 95% prediction intervals against which to test whether a new (sub)cultural group challenges the model or not. Among the 22 relevant conditions published so far, only the South African group stands out as a clear exception to the expectations of the model. (Although the group of Chinese history students and the Wave 3 Chinese students also fall outside of the 95% interval predicted by the initial model, the other six Chinese groups do not. This makes us think that this deviation may be due to within-group noise.) Available data, therefore, suggest that a frequent use of past-in-front mappings in language and/or gesture, as occurs in Chinese and Vietnamese (Gu et al., 2019; Li, 2018; Sullivan & Bui, 2016), does not push cultural groups out of the boundaries of the group-level model. If Sullivan and Bui (2016) are correct in linking the Vietnamese pattern to a greater importance given to the acquisition of knowledge by sight, we would expect that the current model also encompasses Aymara groups (Núñez & Sweetser, 2006). We can also be confident that many other potential moderators such as age, language, religion and religiosity, socioeconomic level, individualism-collectivism, and analytic-holistic processing do not push a group outside of the boundaries of the group-level model.

Why does the South African group stand so far from the model’s prediction limits? Bylund et al. (2020) suggest that because Afrikaner culture is “associated with the apartheid regime,” it “may carry implicit negative connotations that preclude any inclination to place it in front” (p. 180). Other groups in the current study may also be said to come from a recent, troubled past, namely, the groups from B&H. While both the B&H and South African groups have a similar past temporal focus, cultural identities in the Balkans (Serbs, Bosniaks, and Croats) became reinforced after the Yugoslavian war. However, young Afrikaner South Africans may see themselves as actively developing ways to express a coherent cultural identity. Post-apartheid Afrikaners are still learning how to balance respect for Afrikaner culture, language, and traditions with feelings of shame over apartheid, while their future role in South African society remains significantly charged with both fear and optimism (Cloete, 1992; Fairbanks, 2017). We suggest this population’s very recent, fraught and complex political trajectory may affect the balance of attention paid to the future versus the past, whether it is interpreted as a reluctance to “face the past,” a desire to “put the past behind them,” or an emphatic “view toward the future.”

All in all, present findings show that individual temporal focus is a chief factor explaining the positioning of many cultural and subcultural groups along the line that relates temporal focus with time spatialization. The temporal focus hypothesis (TFH) proposes that the underlying mechanism is related to Núñez and Sweetser’s (2006) account of the past-in-front mapping observed in the Aymara. In the Aymara, the past is in front because it can be seen. Under the TFH, the past can be in front because it is attended to. Attention triggers eye, head, and body movements that serve to place the attended object in front of us, thus affording further exploration. Attention brings the past to the front so that we can see it.

But the past is not a physical object. How is it possible to place it in front of us? Santiago, Ouellet, Román, and Valenzuela (2012; Santiago et al., 2011), leaning heavily on Johnson-Laird’s (1983) mental model theory, proposed a theory of the mechanisms that achieve this feat. Attentional mechanisms do not work directly on external reality, but on the contents of internal models of the situation. Using perceptual data, these models can reflect faithfully the external environment, but they can also be flexibly manipulated to represent alternative situations. All kinds of concepts, including abstract ones, when subjected to scrutiny in working memory, are represented by means of concrete elements of mental models. Imagine that a person is asked, “If we exchange the places of Mars and Venus, which one would be closer to the Sun?” To solve this problem, she may create a mental model containing one big dot on the left, standing for the Sun, and several smaller dots at different distances to the right, standing for the planets of the solar system. She can then exchange the positions of the dots corresponding to Mars and Venus to find the solution. This example highlights several important points. First, mental models are always contemplated from a cognitive vantage point, a deictic origin, the “mind’s eye.” Second, the relevant elements of the mental model, those that are attended to, are brought to occupy the position in front of the mind’s eye. Third, the model can contain both objects (Earth, Venus) and structural dimensions on which those objects are located (distance to the Sun). Fourth, the deictic origin can also be placed on a particular point of a structural dimension. If instead of the solar system we think of the events in a week, we can construe the model as if contemplating the whole time line (either horizontally or vertically) in front of us, without occupying any specific position on it (a nondeictic model). 
The events on the line are then located in their sequential order, from earlier to later. But we can also place the ego on the time line, at a point usually taken to mark the present, which lets us distinguish past from future events. In this mental model we contemplate only one side of the line, as we cannot be on the line looking simultaneously in both directions. Which side is in front of us depends on which side is being attended to (see Santiago et al., 2011, for a more detailed description of the theory).

Operations of mental model construction are affected by mental habits. These can be acquired in many ways (Casasanto, 2014). Language can instill habits of thought, for example, using a left–right continuum to represent political parties (van Elk, van Schie, & Bekkering, 2010) or thinking of pitch in terms of thickness (Dolscheid, Shayan, Majid, & Casasanto, 2013). They can also be established through systematic sensorimotor experiences, such as placing good things on the side of the dominant hand and bad things on the side of the nondominant hand (Casasanto, 2009). Interaction with cultural artifacts, such as written pages and books, calendars, and charts, can induce a tendency to represent time and numbers as flowing horizontally (Dehaene, Bossini, & Giraux, 1993; Ouellet et al., 2010). Cultural values can also instill habits of mental model construction. By means of conventions, rules, norms, role models, and explicit instruction, cultures train their members on what is more and what is less important. Mental habits develop as to what should receive more attention. Thus, when we represent the ego as placed on the time line, these habits affect which pole (past or future) tends to occupy the front position in the model, forcing the other pole to be behind ego. This way of representing time is probably of a static nature in most cases, with past events sitting in front of us at different distances, as seems to be the case for Aymara (Núñez & Sweetser, 2006) and has been argued for many ancient languages (Graham, 2018), but it could also be animated with motion, with past events receding in front of us and the future approaching from behind (as has been argued for Vietnamese by Sullivan & Bui, 2016, and for Toba by Klein, 1987). 
In any case, a default past-in-front conceptualization is also compatible with the use of alternative conceptualizations of time in different moments, as required by attentional and task demands, among other factors (Santiago et al., 2011). Cronos may be looking at the past sometimes with his front eyes and sometimes with his rear eyes.

To conclude, the present study has revealed that the balance between temporal values that place importance on the past (tradition) and values that favor the future (progress) is a central factor in giving shape to the way that people around the world think of time in spatial terms. It has also suggested that this relation may be moderated by other factors, opening up a research program aimed at identifying them.