How to measure time preferences in children: a comparison of two methods
We measure time preferences in a sample of 561 children aged 7–11 years. Using a within-subject design, we compare the behavior of our subjects using two distinct experimental measures of time preferences: a standard choice list with multiple decisions and a single choice time-investment-exercise requiring one decision only. We find that both measures yield very similar aggregate results, correlate significantly within subjects and can be explained by basically the same explanatory variables. Advantages and disadvantages of both measures are discussed. Our findings are relevant for the design of experiments to measure time preferences.
Keywords: Time preferences, Experiment, Method, Children
JEL Classification: C91, D03, B41
Numerous studies in economics and psychology report a relationship between experimentally elicited time preferences and economically relevant field behavior. For instance, patient adults perform better in their job (Burks et al. 2009), have less credit card debt (Meier and Sprenger 2010), pursue healthier lifestyles (Bickel et al. 1999; Kirby and Petry 2004; Chabris et al. 2008; Weller et al. 2008) and exhibit higher cognitive skills (Dohmen et al. 2010). Positive effects of patience on important behavior have not only been documented for adults, but also for children and adolescents. Sutter et al. (2013) show for 10- to 18-year-olds that being patient is positively related to their attitudes toward saving, conduct at school and health-related behavior. Bartling et al. (2010) find that patient 5- to 6-year-old children exhibit higher verbal cognitive abilities and more pronounced pro-social behavior. Most importantly, Golsteyn et al. (2014) find that children’s time preferences measured at age 13 predict economically and socially relevant behavior in adulthood. In their study, impatient children attain lower levels of education, show weaker performance in compulsory and secondary school, have lower earnings at middle age, spend more days unemployed and have a higher risk of obesity as well as teenage motherhood.
Most economic studies on time preferences employ choice lists to elicit time preferences (see Frederick et al. 2002, for a review). In such experiments, subjects are asked to make several choices in a series of binary tasks. In each task, the participant has to choose between a smaller, but earlier, payoff and a larger payoff that is delayed in time. Either the earlier payoff is held constant and the delayed payoff increases across choice tasks, or the delayed payoff is fixed and the earlier payoff is decreasing from roughly the amount of the delayed payoff down to a fairly small amount. Subjects are expected to switch from the immediate to the delayed payment option at some point as the latter becomes more and more attractive. The switching point serves as a proxy for patience: the sooner a subject turns down the earlier payment in favor of the later one, the more patient he or she is. Typically, after choices have been made, one task is randomly selected for payment.
Measuring time preferences with choice lists has a few major advantages and disadvantages. One important advantage of choice lists is that they can identify inconsistent choice patterns which cannot be rationalized with standard theory. A choice pattern is classified as inconsistent if the subject waits for a payoff of x, but not for a payoff of y when y > x. Inconsistent choices are not uncommon in experiments with choice lists: up to twenty percent of adult subjects choose inconsistently (see, for instance, Coller and Williams 1999; Bettinger and Slonim 2007).1 While we are not aware of any study which systematically investigates time preferences and how individual characteristics influence the likelihood of inconsistent choices, we provide evidence that (1) inconsistent choice patterns do not arise due to misunderstanding and (2) standard control variables can explain time preferences of subjects making consistent choices but not of subjects making inconsistent choices (see Table 3). Since individuals with inconsistent choice patterns systematically differ from consistent ones, identifying inconsistent individuals is an important virtue of choice list tasks. Furthermore, choice lists can provide a fairly precise estimate of discount rates for subjects who make consistent choices.
Despite the widespread application of choice lists to elicit time preferences, they bear at least two important disadvantages. First, long lists of choices—of which only one is paid out—may reduce the amount of cognitive effort that decision makers invest into completing each task on the list. Second, having to make multiple choices adds complexity that can entail greater problems of comprehension. It is vital for measuring time preferences with choice lists that subjects not only understand each single binary choice, but also that they understand (1) that only one choice problem is payoff relevant and (2) the procedure with which the relevant decision is determined. Therefore, more effort has to be made to explain the task and this becomes more difficult the lower the cognitive development of the participants. Ensuring comprehension is particularly challenging in the growing field of experiments with non-student subject pools, such as low-educated persons, indigenous groups or children (see, for instance, Eckel et al. 2013; Angerer et al. 2015).
Given the importance of patience, that is, the ability to delay gratification, already in childhood and adolescence, we examine and compare two different methods to measure intertemporal preferences in children. In particular, we present choice data of 561 children, aged 7–11 years, from two experimental tasks to elicit time preferences: a standard choice list (CL) and a simpler time-investment-exercise (TIE) which only requires one decision. Employing a within-subject design and eliciting intertemporal preferences at an interval of 6 months, we show that both measures exhibit quasi-identical aggregate patterns and correlate significantly within subjects.
In a recent paper, Burks et al. (2012) compare four measures of time preferences2 with respect to their capability to predict economically important outcomes (smoking, BMI, credit scores and job related outcomes). They find that all approaches have some predictive power and that the (β, σ)-model (Frederick et al. 2002), a formulation derived from choice list exercises, performs best. In this paper, we extend Burks et al.’s (2012) insights toward an assessment of CL and TIE with respect to their connections to variables which are widely used to explain time preferences in the literature: age, gender, risk-taking propensity and IQ. We find that patience measured with both methods can be explained by virtually the same independent variables and thus argue that CL and TIE yield robust proxies for time preferences in children. The only exception is gender, with females behaving more impatiently in CL but not in TIE. Closer inspection of our data reveals that risk-taking propensity is significantly related to patience as measured by CL and TIE in boys, but not in girls while IQ is significantly correlated with patience across both tasks and gender, except for girls in the CL task.
Our TIE method is closely related to what Andreoni and Sprenger (2012) have introduced as the convex time budget method. In their experiment, subjects had to allocate a fixed amount of money between an earlier and a later date. They let each subject make 45 convex time budget decisions which varied with respect to the interest rate for delaying the money to the later date, the date of the earlier payment and the delay length between the earlier and the later date. From the choice data, the authors estimate discount rates, utility function curvature as well as present bias. We abstained from these parameter variations and let children only make a single decision with the TIE method to keep the exposition as simple as possible. Comparing it with the traditional CL method, we are then interested in the factors explaining choices in both methods.
Furthermore, our data allow us to investigate the behavioral traits of a relatively large number of subjects who choose inconsistently in the traditional choice list design (N = 127). While their decisions in TIE correlate with the relative number of patient choices in CL, we find that the explanatory variables which predict our consistent subjects’ behavior very well prove insignificant for participants with inconsistent choice patterns. It is interesting to note that this lack of predictive power persists across tasks, but does not contaminate the results in the whole sample. As TIE picks up roughly the same behavioral traits as CL, the single choice using TIE seems like a simple and easily comprehensible3 alternative to traditional choice lists, at least when doing experiments with children.
2 Experimental design
We present data from 561 children who participated in an experiment that measured their time preferences in two stages. These children were attending the Italian-speaking primary schools in the city of Meran in South Tyrol, Italy, and were aged 7–11 years.4 In the first stage, we used the choice list (CL) design. Each child had to make choices in three binary decision tasks. In each task, the child could either get 2 tokens at the end of the session or a larger number of tokens (either 3, 4, or 5) in 4 weeks’ time. Tokens could be exchanged for small presents (e.g., stickers, wristbands, sweets, pencils). The three tasks were first explained, either in ascending order (starting with 2 tokens now versus 3 tokens in 4 weeks) or descending order (starting with 2 now versus 5 in 4 weeks), and then children had to indicate their choice for each task. One choice was randomly implemented for payment.
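The CL design above, together with the definition of an inconsistent choice pattern given in the introduction, can be illustrated with a minimal sketch. The function names and the encoding of choices as booleans (True means the child waits) are assumptions made for illustration; they are not taken from the study materials.

```python
# Delayed payoffs offered against 2 tokens now, in ascending order (from the text).
DELAYED_PAYOFFS = (3, 4, 5)


def patient_choices(choices):
    """Count the tasks in which the child chose the delayed payoff.

    `choices` is a sequence of booleans ordered like DELAYED_PAYOFFS;
    True means the child waited (hypothetical encoding).
    """
    return sum(1 for waited in choices if waited)


def is_consistent(choices):
    """Return True if the pattern can be rationalized by a single switching point.

    A pattern is inconsistent if the child waits for a payoff of x
    but not for a payoff of y when y > x.
    """
    # With tasks in ascending payoff order, waiting must never be
    # followed by not waiting.
    return all(later >= earlier for earlier, later in zip(choices, choices[1:]))
```

Under this encoding, four of the eight possible patterns in a three-task list are consistent (never wait, wait only for 5, wait for 4 and 5, always wait); the remaining four would be classified as inconsistent.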
In the second stage, run 6 months after the first one, we employed the TIE. Children were endowed with 5 tokens and had to decide how many tokens to consume immediately (by exchanging them into small presents) and how many tokens to invest into the future. Each invested token was doubled and paid out 4 weeks after the experiment (as in CL).
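The payoff structure of the TIE can be written out as a short sketch. The endowment of 5 tokens and the doubling of invested tokens come from the text; the function name and the restriction to whole tokens are assumptions made for illustration.

```python
ENDOWMENT = 5       # tokens given to each child (from the text)
RETURN_FACTOR = 2   # each invested token is doubled (from the text)


def tie_payoffs(invested):
    """Return (tokens now, tokens in 4 weeks) for a given investment.

    Assumes only whole tokens between 0 and the endowment can be invested.
    """
    if not isinstance(invested, int) or not 0 <= invested <= ENDOWMENT:
        raise ValueError("invested must be a whole number between 0 and 5")
    return ENDOWMENT - invested, RETURN_FACTOR * invested
```

A child who invests nothing receives 5 tokens immediately, while a child who invests everything receives 10 tokens after 4 weeks; the number of invested tokens serves as the measure of patience.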
In both stages, whenever a child had chosen a delayed payment, the classroom teachers delivered the presents to the child in a sealed envelope exactly 4 weeks after the experiment. With this approach, we avoided transaction costs for our subjects and minimized uncertainty5 about the delayed payments.
Note that, in contrast to Burks et al. (2012), our experimental design does not allow us to distinguish between simple discounting (σ) and present bias (β) (Laibson 1997) but rather incorporates both traits in our single measures. While discriminating between σ and β is certainly important in research on time preferences, the combined measure should predict behavior about as well as the two separate measures (which is usually sufficient if a researcher wants to use impatience merely as a control variable). In fact, Burks et al. (2012) find that simple exponential discounting (i.e., ignoring present bias) successfully predicts three out of six outcomes (smoking, credit score and absenteeism) while four outcomes can be predicted when also considering present bias (job training drop-outs in addition to the previously mentioned outcomes). Thus, omitting this distinction is a reasonable trade-off in experiments with children, especially as our most important endeavor was to simplify and shorten the tasks as much as possible.
For logistical reasons, we collected all the CL decisions in the first stage and all the TIE decisions in the second stage. Therefore, the order might play a role. Since TIE was administered half a year after CL, however, it is reasonable to assume that children could not perfectly recall their decisions in the former while participating in the latter. Thus, we consider it highly unlikely that order effects drive our main results. As a further check, we can compare the number of invested tokens of 35 (additional) children who only participated in TIE, but not in CL, with the TIE decisions of participants of both stages.6 We find no statistical difference between the two groups (Mann–Whitney U test; p > 0.1; see also the Ordinal Probit Regression in Table A1 in the Online Resource). Thus, it seems that prior exposure to CL does not influence behavior in TIE. Note, however, that this is a low-powered test: Our power analysis reveals that we can only detect order effects of 0.76 tokens invested in TIE (or 0.48 standard deviations of the outcome variable) or more at the 5% significance level with 80% power. Consequently, we cannot exclude the possibility that order effects below this cutoff exist.
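The order of magnitude of the minimum detectable effect can be checked with a rough calculation. The sketch below uses a textbook normal approximation for a two-sided, two-sample comparison of means; this is an assumption made for illustration and not necessarily the power analysis the authors ran for the Mann–Whitney U test. The group sizes (35 and 561), the 5% significance level and the 80% power are taken from the text.

```python
from math import sqrt
from statistics import NormalDist


def minimum_detectable_effect(n1, n2, alpha=0.05, power=0.80):
    """Smallest standardized mean difference detectable with the given power.

    Normal approximation for a two-sided two-sample test (illustrative assumption).
    """
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)
    z_power = standard_normal.inv_cdf(power)
    return (z_alpha + z_power) * sqrt(1 / n1 + 1 / n2)


# Standardized effect for 35 TIE-only children versus 561 children in both stages.
mde_in_sd = minimum_detectable_effect(35, 561)

# Standard deviation of invested tokens implied by the two reported figures:
# 0.76 tokens correspond to 0.48 standard deviations.
implied_sd = 0.76 / 0.48
```

Under these assumptions the approximation yields a detectable effect of roughly half a standard deviation, in the neighbourhood of the reported 0.48, and the two reported figures imply a standard deviation of about 1.6 tokens for the TIE outcome.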
Table 1 Number of subjects participating in both experiments, by age (in years) and gender
The fact that both CL and TIE display a similar age trend is consistent with the conjecture that both measures pick up the same behavioral trait. To examine this issue in more detail, we now turn to the analysis of within-subject behavior.
Table 2 Spearman’s rank correlations between both measures, by age (in years), for all subjects (N = 561) and for subjects with consistent choices (N = 434); p values in parentheses
Table 3 Ordered probit regressions with different measures as dependent variable. Columns (1), (3) and (5): No. of patient choices (CL task); columns (2), (4) and (6): No. of tokens invested (TIE). Regressors include age (in years).
In specifications (5) and (6), we only consider subjects with inconsistent choice patterns in CL and find that a rather different picture emerges: None of the aforementioned variables can explain the time preferences of these subjects, while there is a marginally significant and positive female effect in TIE, but not in CL.10, 11 The fact that our control variables perform equally (badly) when predicting the behavior of these subjects in both tasks reveals a remarkable similarity of both measures. It is reassuring, however, that, while we cannot establish well-known determinants of patience as predictors within our inconsistent sample, including these subjects in the whole sample leaves all effects found in the consistent subsample unchanged in sign and significance.
To examine whether there are any significant differences in the determinants of patience for the two measures, we ran additional regressions with two observations per subject and with the interactions of method and control variables as independent variables (see Online Resource Table A4). These regressions test whether the estimated coefficients in Table 3 between columns 1 and 2 (the whole sample), between columns 3 and 4 (the subsample of subjects making consistent choices), and between columns 5 and 6 (the subsample of subjects making inconsistent choices) are statistically different from each other. Table A4 shows that neither the method indicator variable (TIE) nor the interaction terms (TIE*age, TIE*female, TIE*risk-taking propensity or TIE*relative IQ) are significant (p > 0.1 in all cases), using either the whole population or the subsample of subjects making consistent choices. For these samples, the post-estimation Wald tests show that the levels of patience generally do not differ significantly between TIE and CL within each age group among both genders (the only weakly significant exceptions are the youngest two female age cohorts in column 2 where 0.05 < p < 0.1). Thus, we cannot reject the null hypothesis that the determinants of patience for CL and TIE are the same for the variables we collected. For the subjects making inconsistent choices in CL, however, we find significant differences between the level of patience in TIE and CL in most comparisons (see the post-estimation Wald tests beneath the corresponding specification in Table A4).
In this paper, we have compared two methods that can be used to measure time preferences, applying both to intertemporal choices made by children. One of the methods we used is a standard choice list task where children have to make multiple choices, of which one is randomly selected for payment. The other method requires a single choice of how much of an endowment to invest into the future. We have found that both methods yield very similar results, on the aggregate and on the individual level, and that behavior under both methods is basically explained by the same factors. This is particularly true for male participants.
We see one straightforward extension of this research: Both the CL and the TIE can be easily augmented to discriminate between discounting (σ) and present bias (β). A “horse race” between the predictive power of σ and β as measured by CL and TIE would, therefore, be very useful to investigate whether TIE would also be a convenient device for measuring these dimensions.
While several suggestions on how to handle such cases have been made, all of them have their limitations: Some scholars exclude subjects with inconsistent choice patterns from the analysis (see, for instance, Harrison et al. 2002) and others force subjects to make consistent choices (see, for instance, Dohmen et al. 2010). While the former approach neglects potentially important behavioral traits (inconsistent choices do not necessarily coincide with a lack of understanding, see our results section), the latter restricts a subject’s choice set by design. Another way to treat multiple switchers is to use their first consistent switching point and ignore subsequent inconsistencies (see, for instance, Meier and Sprenger 2010), an approach that ignores parts of the data. Yet other solutions are to use the midpoint of the interval over which the subject is indifferent (see, for instance, Lammers and van Wijnbergen 2007), or to simply use the relative number of patient choices in a choice list as a proxy for patience. Finally, the approach of Burks et al. (2009) assigns inconsistent subjects the level of patience of the “nearest consistent individual” by counting the number of patient choices and translating this number into a consistent choice pattern.
The four measures are (1) the simple discount factor (σ), (2) an indicator of present bias (β), (3) a new proxy for impulsivity and (4) a measure elicited through a battery of survey questions.
Among all children participating in the two experiments (see Table 1), only 7 children had problems understanding the CL task, while only 3 were excluded from the analysis due to comprehension problems in the TIE task. We report only data for the 561 children who understood both tasks.
The data used here are a subset of the 1415 children studied in a companion paper (Sutter et al. 2015), in which we investigate how the language children speak affects intertemporal choices.
Uncertainty was further reduced by the fact that the choice list experiment was the third experiment conducted with these children. Therefore, children already had experience with economic experiments (on social preferences; see Lergetporer et al. 2014, or Angerer et al. 2015) and with receiving delayed payoffs (from previous experiments that were not related to intertemporal choice, though).
These subjects were absent when CL was conducted. While we have no information on the reason for their absenteeism, it is most likely that they were ill.
While these correlations are not particularly high, sign and significance are consistent with the assumption that both methods measure the same latent variable. Furthermore, low correlations between different measures of patience have also been found for adults: Burks et al. (2012) report rank correlations between their new proxy for impulsivity and β, σ and survey-based measures for patience of ρ = 0.066, ρ = 0.071 and ρ = 0.042, respectively.
However, we find that inconsistent choice patterns are not significantly related to the age of our participants, their IQ or any other of our standard control variables (see Table A2 in the Appendix).
The regression results in columns (5) and (6) are largely insignificant. Note that this could in principle also be due to the smaller number of observations compared to the other regressions. To check for this possibility, we draw 1000 random samples (for the CL and TIE tasks separately) of size N = 126 from all participants who are consistent in the CL task and rerun our regressions. The results reveal that in 90% of our random samples the age effect is significant at least at the 10% level in CL, while in the TIE task this is true for 97% of the samples. Thus, it is very unlikely that the insignificant age effect for the inconsistent sample is due to the lower number of observations. The results for the other coefficients are more mixed. The gender effect is significant at least at the 10% level in 37% of cases in CL and in 22% of cases in TIE. Risk-taking propensity is significant at least at the 10% level in 56% of all cases for CL and 46% of all cases for TIE. Finally, relative IQ is significant at least at the 10% level in 59% of all cases for CL and 65% of all cases in TIE.
To detect differences between consistent and inconsistent subjects in the decisions in CL and TIE, Table A3 in the Appendix analyzes differences between the two subsamples by introducing the interactions of controls and inconsistent choice pattern. The results suggest that for CL there exists a significant difference between consistent and inconsistent choices for all but the two oldest age cohorts among male participants. Also for TIE there is a difference between consistent and inconsistent choices among girls in the two youngest age cohorts and among boys in the two oldest age cohorts (see Wald tests for each combination of age group and gender beneath Table A3).
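The subsampling check described in the footnote on the regressions for inconsistent subjects (1000 random draws of 126 consistent subjects, each followed by a rerun of the regression) can be sketched as a small harness. The function name, the `is_significant` callback and the fixed seed are hypothetical; the callback stands in for rerunning the ordered probit regression and testing one coefficient at the 10% level, which is not reproduced here.

```python
import random


def share_significant(subjects, is_significant, sample_size=126, draws=1000, seed=0):
    """Share of random subsamples in which an effect is significant.

    `subjects` is a list of records for consistent subjects and
    `is_significant` is a user-supplied function that takes a subsample
    and returns True or False (a placeholder for the actual regression).
    """
    rng = random.Random(seed)  # fixed seed so the check is reproducible
    hits = 0
    for _ in range(draws):
        subsample = rng.sample(subjects, sample_size)  # draw without replacement
        if is_significant(subsample):
            hits += 1
    return hits / draws
```

A share close to one, as reported for the age effect, indicates that a sample of this size is usually large enough to detect the effect among consistent subjects, so the smaller sample alone is an unlikely explanation for its absence among inconsistent subjects.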
We are particularly grateful to the editor, Robert Slonim, an anonymous referee, as well as Achim Zeileis, Thomas Stöckl and seminar participants at the Ph.D-seminar at the University of Linz and the eeecon Workshop at the University of Innsbruck for very helpful comments. We thank Rudolf Meraner from the South Tyrolean State Board of Education (Pädagogisches Institut für die deutsche Sprachgruppe in Südtirol), the schools’ headmasters (Maria Angela Madera, Vally Valbonesi), and the parents of the involved children for making this study possible, and the children for participation. Financial support from the Government of the autonomous province South Tyrol through Grant 315/40.3 is gratefully acknowledged.
- Angerer, S., Glätzle-Rützler, D., Lergetporer, P., & Sutter, M. (2015). Donations, risk attitudes and time preferences: a study on altruism in primary school children. Journal of Economic Behavior and Organization, 115, 67–74.
- Lammers, J., & van Wijnbergen, S. (2007). HIV/AIDS, risk aversion and intertemporal choice. Tinbergen Institute Discussion Paper 07-098/1.
- Meier, S., & Sprenger, C. (2010). Present-biased preferences and credit card borrowing. American Economic Journal: Applied Economics, 2, 193–210.
- Sutter, M., Angerer, S., Glätzle-Rützler, D., & Lergetporer, P. (2015). The effect of language on economic behavior: experimental evidence from children’s intertemporal choices. IZA Discussion Paper 9383.