1 Introduction

Numerous studies in economics and psychology report a relationship between experimentally elicited time preferences and economically relevant field behavior. For instance, patient adults perform better in their jobs (Burks et al. 2009), have less credit card debt (Meier and Sprenger 2010), pursue healthier lifestyles (Bickel et al. 1999; Kirby and Petry 2004; Chabris et al. 2008; Weller et al. 2008) and exhibit higher cognitive skills (Dohmen et al. 2010). Positive effects of patience on important behavior have been documented not only for adults, but also for children and adolescents. Sutter et al. (2013) show for 10- to 18-year-olds that being patient is positively related to attitudes toward saving, conduct at school and health-related behavior. Bartling et al. (2010) find that patient 5- to 6-year-old children exhibit higher verbal cognitive abilities and more pronounced pro-social behavior. Most importantly, Golsteyn et al. (2014) find that children’s time preferences measured at age 13 predict economically and socially relevant behavior in adulthood. In their study, impatient children attain lower levels of education, show weaker performance in compulsory and secondary school, have lower earnings in middle age, spend more days unemployed and face a higher risk of obesity as well as teenage motherhood.

Most economic studies on time preferences employ choice lists to elicit them (see Frederick et al. 2002, for a review). In such experiments, subjects are asked to make several choices in a series of binary tasks. In each task, the participant has to choose between a smaller, but earlier, payoff and a larger payoff that is delayed in time. Either the earlier payoff is held constant and the delayed payoff increases across choice tasks, or the delayed payoff is fixed and the earlier payoff decreases from roughly the amount of the delayed payoff down to a fairly small amount. Subjects are expected to switch from the earlier to the delayed payment option at some point as the latter becomes more and more attractive. The switching point serves as a proxy for patience: the sooner a subject turns down the earlier payment in favor of the later one, the more patient he or she is. Typically, after choices have been made, one task is randomly selected for payment.
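The choice-list logic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `build_choice_list` and `switching_point` are hypothetical, the early payoff is held constant, and the payoffs and the subject's answers are made-up example values.

```python
from typing import List, Optional, Tuple


def build_choice_list(early: int, delayed_options: List[int]) -> List[Tuple[int, int]]:
    """Return (early, delayed) payoff pairs, ordered by increasing delayed payoff."""
    return [(early, delayed) for delayed in sorted(delayed_options)]


def switching_point(choices: List[bool]) -> Optional[int]:
    """Index of the first task in which the delayed payoff is chosen.

    choices[i] is True if the subject waits in task i, with tasks ordered by
    increasing delayed payoff. Returns None if the subject never waits.
    """
    for i, waits in enumerate(choices):
        if waits:
            return i
    return None


# Illustrative list: a fixed early payoff against three larger delayed payoffs.
tasks = build_choice_list(early=2, delayed_options=[3, 4, 5])
choices = [False, True, True]  # hypothetical subject: waits from the second task on
idx = switching_point(choices)
print(tasks[idx])  # (2, 4)
```

A lower switching index means the subject accepts a smaller premium for waiting, which is the sense in which an earlier switch indicates more patience.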

Measuring time preferences with choice lists has a few major advantages and disadvantages. One important advantage of choice lists is that they can identify inconsistent choice patterns which cannot be rationalized with standard theory. A choice pattern is classified as inconsistent if the subject waits for a payoff of x, but not for a payoff of y when y > x. Inconsistent choices are not uncommon in experiments with choice lists: up to twenty percent of adult subjects choose inconsistently (see, for instance, Coller and Williams 1999; Bettinger and Slonim 2007). While we are not aware of any study which systematically investigates time preferences and how individual characteristics influence the likelihood of inconsistent choices, we provide evidence that (1) inconsistent choice patterns do not arise due to misunderstanding and (2) standard control variables can explain time preferences of subjects making consistent choices but not of subjects making inconsistent choices (see Table 3). Since individuals with inconsistent choice patterns differ systematically from consistent ones, identifying inconsistent individuals is an important virtue of choice list tasks. Furthermore, choice lists can provide a fairly precise estimate of discount rates for subjects who make consistent choices.
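The consistency criterion stated above (waiting for x but not for some y > x) amounts to a monotonicity check on the choices once tasks are ordered by increasing delayed payoff. The following sketch illustrates this; the function name `is_consistent` and the boolean encoding of choices are assumptions made for the example, not taken from the study.

```python
from itertools import product
from typing import Sequence


def is_consistent(choices: Sequence[bool]) -> bool:
    """Check a choice pattern for consistency.

    choices[i] is True if the subject waits in task i; tasks are ordered by
    increasing delayed payoff. A pattern is inconsistent if the subject waits
    for some payoff x but refuses to wait for a larger payoff y > x.
    """
    seen_wait = False
    for waits in choices:
        if seen_wait and not waits:
            return False  # waited for a smaller payoff, refused a larger one
        seen_wait = seen_wait or waits
    return True


# Enumerate all patterns for a list with three binary tasks.
patterns = list(product([False, True], repeat=3))
consistent = [p for p in patterns if is_consistent(p)]
print(len(patterns), len(consistent))  # 8 4
```

With three tasks, four of the eight possible patterns are consistent: never waiting, or waiting from the third, second or first task onward.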

Despite the widespread application of choice lists to elicit time preferences, they have at least two important disadvantages. First, long lists of choices, of which only one is paid out, may reduce the cognitive effort that decision makers invest in completing each task on the list. Second, having to make multiple choices adds complexity that can entail greater problems of comprehension. It is vital for measuring time preferences with choice lists that subjects understand not only each single binary choice, but also (1) that only one choice problem is payoff relevant and (2) the procedure by which the relevant decision is determined. Therefore, more effort has to be made to explain the task, and this becomes more difficult the lower the cognitive development of the participants. Ensuring comprehension is particularly challenging in the growing field of experiments with non-student subject pools, such as low-educated persons, indigenous groups or children (see, for instance, Eckel et al. 2013; Angerer et al. 2015).

Given the importance of patience, or the ability to delay gratification, already in childhood and adolescence, we examine two different methods of measuring intertemporal preferences in children and compare them against each other. In particular, we present choice data of 561 children, aged 7–11 years, from two experimental tasks to elicit time preferences: a standard choice list (CL) and a simpler time-investment-exercise (TIE) which only requires one decision. Employing a within-subject design and eliciting intertemporal preferences at an interval of 6 months, we show that both measures exhibit quasi-identical aggregate patterns and correlate significantly within subjects.

In a recent paper, Burks et al. (2012) compare four measures of time preferences with respect to their capability to predict economically important outcomes (smoking, BMI, credit scores and job-related outcomes). They find that all approaches have some predictive power and that the (β, σ)-model (Frederick et al. 2002), a formulation derived from choice list exercises, performs best. In this paper, we extend Burks et al.’s (2012) insights toward an assessment of CL and TIE with respect to their connections to variables which are widely used to explain time preferences in the literature: age, gender, risk-taking propensity and IQ. We find that patience measured with both methods can be explained by virtually the same independent variables and thus argue that CL and TIE yield robust proxies for time preferences in children. The only exception is gender, with females behaving more impatiently in CL but not in TIE. Closer inspection of our data reveals that risk-taking propensity is significantly related to patience as measured by CL and TIE in boys, but not in girls, while IQ is significantly correlated with patience across both tasks and genders, except for girls in the CL task.

Our TIE method is closely related to what Andreoni and Sprenger (2012) have introduced as the convex time budget method. In their experiment, subjects had to allocate a fixed amount of money between an earlier and a later date. They let each subject make 45 convex time budget decisions which varied with respect to the interest rate for delaying the money to the later date, the date of the earlier payment and the delay length between the earlier and the later date. From the choice data, the authors estimate discount rates, utility function curvature as well as present bias. We abstained from these parameter variations and let children only make a single decision with the TIE method to keep the exposition as simple as possible. Comparing it with the traditional CL method, we are then interested in the factors explaining choices in both methods.

Furthermore, our data allow us to investigate the behavioral traits of a relatively large number of subjects who choose inconsistently in the traditional choice list design (N = 127). While their decisions in TIE correlate with the relative number of patient choices in CL, we find that the explanatory variables which predict our consistent subjects’ behavior very well prove insignificant for participants with inconsistent choice patterns. It is interesting to note that this lack of predictive power persists across tasks, but does not contaminate the results in the whole sample. As TIE picks up roughly the same behavioral traits as CL, the single choice using TIE seems like a simple and easily comprehensible alternative to traditional choice lists, at least when doing experiments with children.

2 Experimental design

We present data from 561 children who participated in an experiment that measured their time preferences in two stages. These children were attending the Italian-speaking primary schools in the city of Meran in South Tyrol, Italy, and were aged 7–11 years. In the first stage, we used the choice list (CL) design. Each child had to make choices in three binary decision tasks. In each task, the child could either get 2 tokens at the end of the session or a larger number of tokens (either 3, 4, or 5) in 4 weeks’ time. Tokens could be exchanged for small presents (e.g., stickers, wristbands, sweets, pencils). The three tasks were first explained, either in ascending order (starting with 2 tokens now versus 3 tokens in 4 weeks) or in descending order (starting with 2 now versus 5 in 4 weeks), and then children had to indicate their choice for each task. One choice was randomly implemented for payment.

In the second stage, run 6 months after the first one, we employed the TIE. Children were endowed with 5 tokens and had to decide how many tokens to consume immediately (by exchanging them for small presents) and how many tokens to invest in the future. Each invested token was doubled and paid out 4 weeks after the experiment (as in CL).
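The payoff rule of the TIE can be written down directly from this description: with an endowment of 5 tokens and a doubling of each invested token, every possible decision maps to a pair of immediate and delayed token amounts. The sketch below is only illustrative; the function name `tie_payoffs` and the constant names are hypothetical.

```python
from typing import Tuple

ENDOWMENT = 5   # tokens each child receives in TIE
MULTIPLIER = 2  # each invested token is doubled


def tie_payoffs(invested: int) -> Tuple[int, int]:
    """Return (tokens now, tokens in 4 weeks) for a given investment."""
    if not 0 <= invested <= ENDOWMENT:
        raise ValueError("investment must be between 0 and the endowment")
    return ENDOWMENT - invested, MULTIPLIER * invested


# List every feasible decision and the resulting payoffs.
for invested in range(ENDOWMENT + 1):
    now, later = tie_payoffs(invested)
    print(invested, now, later)
```

Investing nothing yields 5 tokens today, while investing everything yields 10 tokens in 4 weeks; the number of invested tokens is the single patience measure obtained from this task.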

In both stages, if a child had chosen a delayed payment, the classroom teachers delivered the presents to the child in a sealed envelope exactly 4 weeks after the experiment. With this approach, we avoided transaction costs for our subjects and minimized uncertainty about the delayed payments.

Note that, in contrast to Burks et al. (2012), our experimental design does not allow us to distinguish between simple discounting (σ) and present bias (β) (Laibson 1997) but rather incorporates both traits in our single measures. While discriminating between σ and β is certainly important in research on time preferences, the combined measure should predict behavior about as well as the two single measures do separately (which is usually sufficient if a researcher wants to use impatience merely as a control variable). In fact, Burks et al. (2012) find that simple exponential discounting (i.e., ignoring present bias) successfully predicts three out of six outcomes (smoking, credit score and absenteeism) while four outcomes can be predicted when also considering present bias (job training drop-outs in addition to the previously mentioned outcomes). Thus, omitting this distinction is a reasonable trade-off in experiments with children, especially as our most important endeavor was to simplify and shorten the tasks as much as possible.

For logistical reasons, we collected all the CL decisions in the first stage and all the TIE decisions in the second stage. Therefore, the order might play a role. Since TIE was administered half a year after CL, however, it is reasonable to assume that children could not perfectly recall their decisions in the former while participating in the latter. Thus, we consider it highly unlikely that order effects drive our main results. As a further check, we can compare the number of invested tokens of 35 (additional) children who only participated in TIE, but not in CL, with the TIE decisions of participants of both stages. We find no statistical difference between the two groups (Mann–Whitney U test; p > 0.1; see also the Ordinal Probit Regression in Table A1 in the Online Resource). Thus, it seems that prior exposure to CL does not influence behavior in TIE. Note, however, that this is a low-powered test: our power analysis reveals that, at the 5-percent significance level with 80% power, we can only detect order effects of 0.76 tokens invested in TIE (or 0.48 standard deviations of the outcome variable) or more. Consequently, we cannot exclude the possibility that order effects below this cutoff exist.
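To give a sense of where a detectable effect of roughly half a standard deviation comes from, the following sketch computes the minimal detectable effect for a comparison of two group means using a normal approximation. This is an assumption made for illustration: the text does not state which procedure the power analysis used, and the function name `minimal_detectable_effect` is hypothetical. Group sizes are the 561 participants of both stages and the 35 additional children.

```python
from math import sqrt
from statistics import NormalDist


def minimal_detectable_effect(n1: int, n2: int,
                              alpha: float = 0.05,
                              power: float = 0.80) -> float:
    """Smallest standardized mean difference detectable with the given power.

    Uses the normal approximation for a two-sided, two-sample comparison of
    means (an assumption; the original power analysis may differ).
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value of the two-sided test
    z_power = z.inv_cdf(power)          # quantile matching the desired power
    return (z_alpha + z_power) * sqrt(1 / n1 + 1 / n2)


mde_sd = minimal_detectable_effect(n1=561, n2=35)
print(round(mde_sd, 2))
```

Under this approximation the result is close to half a standard deviation, in the neighborhood of the figure reported above. The small comparison group of 35 children dominates the term under the square root, which is why the test has low power despite the large main sample.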

3 Results

Table 1 presents the age and gender composition of the 561 subjects who completed both stages of the experiment. Over all subjects, the average number of patient choices in CL is 0.99 (out of three), and 1.63 out of five tokens are invested in the future in TIE, on average.

Table 1 Number of subjects participating in both experiments, by age and gender

Figure 1 shows the average share of patient choices in both experiments for all subjects (N = 561). The average numbers in CL (TIE) range from 0.74 to 1.26 (1.14–2.15) for 7/8- to 10/11-year-olds. We find that, irrespective of the measurement method, patience increases significantly with age (p < 0.01, Cuzick’s Wilcoxon-type tests for trend). For subjects who decided consistently in CL (N = 434), a similar picture emerges. However, when considering only subjects who made inconsistent choices in CL (N = 127), we find no significant age trend for either measure (see Figures A1 and A2 in the Online Resource).

Fig. 1 Frequency of patience in both experiments, all subjects (N = 561 overall)

The fact that both CL and TIE display a similar age trend is consistent with the conjecture that both measures pick up the same behavioral trait. To examine this issue in more detail, we now turn to the analysis of within-subject behavior.

To examine the relationship of the two time preference measures within subjects, we ran a series of Spearman’s rank correlations between both measures of relative patience. Table 2 shows that for both the full sample and the subsample with consistent choices the two measures of patience are correlated (ρ > 0.2 with p < 0.02 for each age group except the 7/8-year-olds). Subjects with inconsistent choices exhibit a similar pattern: the Spearman rank correlation coefficient for this subsample is 0.150 (p < 0.1). This finding casts doubt on the conjecture that inconsistencies arise due to misunderstandings and suggests that the relative sum of patient decisions in a CL task could be used for “rescuing” inconsistent choice patterns. These positive and significant correlations support the interpretation that both measures reflect the same underlying behavioral trait, even for subjects with inconsistent choices in CL.
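For readers who want to reproduce this kind of within-subject comparison, the sketch below computes Spearman’s rank correlation from scratch: both measures are converted to ranks (ties receive their average rank, which matters here because both measures take only a few distinct values), and the Pearson correlation of the ranks is returned. The function names and the small data set are hypothetical and serve only as an illustration; p values are not computed.

```python
from math import sqrt
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole block of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of the ranks; undefined if either input is constant."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


# Hypothetical data: patient choices in CL (0-3) and tokens invested in TIE (0-5).
cl_patient = [0, 1, 1, 2, 3, 0]
tie_invested = [0, 1, 2, 2, 5, 1]
print(round(spearman_rho(cl_patient, tie_invested), 3))
```

Because the coefficient depends only on ranks, it is unaffected by whether the two measures are expressed as counts or as shares of the maximum.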

Table 2 Spearman’s rank correlations between both measures (p values in parentheses)

Inspired by Burks et al. (2012), we ran a series of ordered probit regressions with either the number of patient choices in CL or the number of invested tokens in TIE as the dependent variable to investigate whether both measures are mainly explained by the same independent variables. Table 3 shows that well-established results on the determinants of patience can be reproduced with both of our measures for our whole sample (see specifications [1] and [2]) and the consistent subsample (see specifications [3] and [4]). In accordance with prior findings (Anderhub et al. 2001; Bettinger and Slonim 2007; Bartling et al. 2010), age, the propensity to take risks and IQ have a significant effect on our subjects’ degree of patience. In contrast, we find a negative and significant female effect in CL, but not in TIE. As the evidence on gender differences in time preferences seems inconclusive, it might be that gender effects depend upon the measurement method employed and thus should be treated with caution. Apart from that, our analysis reveals that major explanatory variables for time preferences in children perform well for both measurement methods.

Table 3 Ordered probit regressions with different measures as dependent variable

In specifications [5] and [6], we only consider subjects with inconsistent choice patterns in CL and find that a rather different picture emerges: none of the aforementioned variables can explain time preferences of these subjects, while there is a marginally significant and positive female effect in TIE, but not in CL. The fact that our control variables perform equally (badly) when predicting the behavior of these subjects in both tasks reveals a remarkable similarity of both measures. It is reassuring, however, that, while we cannot establish well-known determinants of patience as predictors within our inconsistent sample, including these subjects in the whole sample leaves all effects of the consistent subsample unchanged in sign and significance.

To examine whether there are any significant differences in the determinants of patience for the two measures, we ran additional regressions with two observations per subject and with the interactions of method and control variables as independent variables (see Online Resource Table A4). These regressions test whether the estimated coefficients in Table 3 between columns 1 and 2 (the whole sample), between columns 3 and 4 (the subsample of subjects making consistent choices), and between columns 5 and 6 (the subsample of subjects making inconsistent choices) are statistically different from each other. Table A4 shows that neither the method indicator variable (TIE) nor the interaction terms (TIE*age, TIE*female, TIE*risk-taking propensity or TIE*relative IQ) are significant (p > 0.1 in all cases), using either the whole population or the subsample of subjects making consistent choices. For these subsamples, the post-estimation Wald tests show that the levels of patience generally do not differ significantly between TIE and CL within each age group among both genders (the only weakly significant exceptions are the youngest two female age cohorts in column 2 where 0.05 < p < 0.1). Thus, we cannot reject the null hypotheses that the determinants of patience for CL and TIE are the same for the variables we collected. For the subjects making inconsistent choices in CL, however, we find significant differences between the level of patience in TIE and CL in most comparisons (see the post-estimation Wald tests beneath specification [3]).
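The data layout behind such a pooled regression can be sketched as follows: each subject contributes two rows, one per method, and every control variable is interacted with a TIE indicator. The sketch only prepares the data and does not estimate the model. All field names are hypothetical, and expressing both outcomes as shares of their maximum (three choices, five tokens) is an assumption made here to put them on a common scale.

```python
from typing import Dict, List

# Hypothetical field names for the control variables.
CONTROLS = ["age", "female", "risk_taking", "relative_iq"]


def stack_with_interactions(subjects: List[Dict[str, float]]) -> List[Dict[str, float]]:
    """Return two rows per subject (CL and TIE) with method-by-control interactions."""
    rows = []
    for s in subjects:
        # Assumption: both outcomes are rescaled to shares of their maximum.
        outcomes = ((0, s["cl_patient"] / 3), (1, s["tie_invested"] / 5))
        for tie, patience in outcomes:
            row = {"id": s["id"], "tie": tie, "patience": patience}
            for control in CONTROLS:
                row[control] = s[control]
                # Interaction term: zero for CL rows, equal to the control for TIE rows.
                row["tie_x_" + control] = tie * s[control]
            rows.append(row)
    return rows


# One made-up subject for illustration.
subjects = [{"id": 1, "age": 9, "female": 1, "risk_taking": 2,
             "relative_iq": 0.5, "cl_patient": 1, "tie_invested": 2}]
long_rows = stack_with_interactions(subjects)
print(len(long_rows))  # 2
```

In a regression on the stacked rows, a significant interaction coefficient would indicate that the corresponding control variable relates differently to patience in TIE than in CL. Since each subject appears twice, standard errors would typically need to account for the within-subject dependence.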

4 Conclusion

In this paper, we have compared two methods that can be used to measure time preferences. The methods have been applied to let children make intertemporal choices. One of the methods we used is a standard choice list task where children have to make multiple choices, of which one is randomly selected for payment. The other method requires a single choice of how much of an endowment to invest in the future. We have found that both methods yield very similar results, on the aggregate and on the individual level, and that behavior under both methods is basically explained by the same factors. This is particularly true for male participants.

We see one straightforward extension of this research: Both the CL and the TIE can be easily augmented to discriminate between discounting (σ) and present bias (β). A “horse race” between the predictive power of σ and β as measured by CL and TIE would, therefore, be very useful to investigate whether TIE would also be a convenient device for measuring these dimensions.