Magnitude effect in intertemporal allocation tasks

Intertemporal choices are affected by both discount rate and utility curvature. We investigate how the two aspects of time preference are affected by the size of the total budget using an intertemporal allocation task. At the aggregate level as well as at the individual level, we find magnitude effects both on the discount rate and on intertemporal substitutability (i.e., utility curvature). Individuals are more patient when dealing with larger budgets and also regard larger budgets to be more fungible. The latter effect suggests that the degree of asset integration is increasing in the stake.


Introduction
The prediction of the standard consumption-savings model, that people always discount income at the market interest rate, has been found to be inconsistent with empirical results. 1 One important anomaly, dating back to Thaler (1981), is the magnitude effect: when comparing a smaller-sooner reward with a larger-later reward, people favor the later reward more often as the amounts of the two rewards are scaled up. Studies on the magnitude effect help us better understand how people make intertemporal choices differently for small and large rewards, for instance, how people allocate inheritance or lottery income over time.
While several experiments have found a magnitude effect, no study has explored the underlying mechanisms. 2 Intertemporal choices are affected by both the discount rate and the curvature of the atemporal utility function, where the latter determines intertemporal substitutability. Accordingly, the magnitude effect could have two potential causes: people are more patient for larger rewards, and/or people find larger rewards more substitutable across time. The two mechanisms are indistinguishable in a single-reward task where a decision-maker can only receive a reward on a single date (either sooner or later), but they have different implications and lead to different behavioral patterns in more general situations.
One such situation is the intertemporal allocation task, where a decision-maker allocates a fixed budget between two dates, given a (usually positive) return to the share allocated to the later date. Choices in intertemporal allocation tasks can be characterized by two attributes: the average shares allocated to a particular date given a set of return rates, and the responsiveness of these shares to changes in the rate of return. The former attribute is determined by the discount rate and the latter one by the intertemporal substitutability. Thus, studying the magnitude effect in intertemporal allocation tasks can help us identify the mechanism underlying the magnitude effect.
In this paper, we perform a lab experiment to investigate how choices in intertemporal allocation tasks change with the size of the total budget, and in particular, whether the stakes impact intertemporal preferences through patience (the discount rate) or intertemporal substitutability (the atemporal utility function).
Some theories provide an explanation for the magnitude effect in single-reward tasks. Benhabib et al. (2010) posit that a fixed cost of waiting makes people impatient about small outcomes, but that this cost matters less when outcomes are large. Noor (2011) establishes a magnitude-dependent discounting model where the discount rate of a dated outcome is decreasing in the size of the outcome. Fudenberg and Levine (2006) predict that people exert costly self-control when stakes are high but indulge themselves when stakes are low, which generates a magnitude effect. Holden and Quiggin (2017) assume that people take into account more background consumption when experimental rewards are larger, which also explains the magnitude effect in single-reward tasks. When those theories (with proper extension) are applied to intertemporal allocation tasks, Benhabib et al. (2010) and Noor (2011) predict a magnitude effect on the discount rate, while Fudenberg and Levine (2006) and Holden and Quiggin (2017) predict a magnitude effect on the utility curvature. Our experiment helps to further distinguish the explanatory power of these different models.
We employ the Convex Time Budget (CTB) method introduced by Andreoni and Sprenger (2012). It allows subjects to form a portfolio of a sooner reward and a later reward given a budget constraint. The possibility for subjects to make interior choices (and not only corner choices as in single-reward tasks) enables researchers to simultaneously identify the discount rate and the intertemporal substitutability. In particular, the identification of the intertemporal substitutability comes from subjects' allocation of rewards as a response to changes in the interest rate, which is exactly the definition of the elasticity of intertemporal substitution, and hence the method provides a robust measure against misspecification. 3 The design of our experiment has three main features. First, all subjects receive equal amounts of participation fees on the sooner date and the later date regardless of their choices, and the payment conditions are constant across time. Thus, the transaction costs and the trustworthiness of the payments are equalized across periods. Second, we implement two treatments. In one treatment subjects allocate between today and 4 weeks later, while in the other treatment subjects allocate between 4 weeks later and 8 weeks later. This allows us to assess whether the magnitude effect is affected by the inclusion of a front-end delay. Finally, by assuming a simple yet popular model, the CTB method allows us to identify the discount rate and the atemporal utility function simultaneously. As a result, we can disentangle the channels of the magnitude effect.
We find evidence of the magnitude effect in intertemporal allocation tasks: the budget share allocated to the later date is increasing in the total budget. The size of the magnitude effect is found to be decreasing in the stake. The pattern is not affected by whether or not a front-end delay is present. At the aggregate level as well as at the individual level, we find magnitude effects both on the discount rate and on intertemporal substitutability. Both channels have considerable impacts on predicted choices. We find that the latter effect is not the same as the magnitude effect on risk attitudes found in previous studies, and hence it might be problematic to correct for the curvature of utility functions by risk attitudes. Instead, the magnitude effect on intertemporal substitutability is consistent with theories proposing that people integrate experimental rewards with more background wealth as the size of rewards gets larger.
The remainder of this paper is structured as follows. We introduce our experimental design in Sect. 2. In Sect. 3 we formulate our hypotheses. In Sect. 4, we investigate non-parametrically the magnitude effect and its relation with the front-end delay. We explore the channels by parametric estimation both at the aggregate level and at the individual level in Sect. 5. In Sect. 6, we discuss the interpretations of our findings. We conclude in Sect. 7.

The Convex Time Budget method, parameters, and implementation
The basis of our experimental design is the Convex Time Budget method introduced by Andreoni and Sprenger (2012). The method consists of a set of intertemporal allocation tasks. In each decision, subjects are asked to allocate $N$ tokens to two dates: $t$ days from today, and $(t + \tau)$ days from today. Each token allocated to $t$ is worth $P_t$ euros, while each token allocated to $(t + \tau)$ is worth $P_{t+\tau}$ euros. If a subject allocates $n_t$ tokens to the sooner date and $n_{t+\tau}$ to the later date, the sooner reward is $z_t = P_t \cdot n_t$ euros and the later reward is $z_{t+\tau} = P_{t+\tau} \cdot n_{t+\tau}$ euros.
Choices are subject to the budget constraint, $n_t + n_{t+\tau} \le N$, and the non-negativity constraints, $0 \le n_t, n_{t+\tau} \le N$. Subjects are told that they can allocate any number of tokens they like to either of the two dates. Examples of both corner choices and interior choices are given to remove any hesitation in making either type of choice.
Decisions with the same total budget, $N$, are grouped in one decision form, which is displayed on one page. There are seven decisions in each decision form. The return to each token allocated to the later date is fixed as $P_{t+\tau}$ = €0.20, while the return to each token allocated to the sooner date is varied and takes the values $P_t$ = €0.20, €0.19, €0.18, €0.17, €0.16, €0.15, and €0.14. Hence, those returns imply seven gross interest rates, $R = P_{t+\tau}/P_t$ = 1, 1.05, 1.11, 1.18, 1.25, 1.33, and 1.43, respectively, over a period of $\tau$ days. The constraints can be rewritten as

$$R z_t + z_{t+\tau} \le m, \qquad z_t \ge 0, \qquad z_{t+\tau} \ge 0,$$

where $m$ is the total budget and $m = P_{t+\tau} \cdot N$.
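The mapping from token prices to gross interest rates and euro budgets can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function and constant names are hypothetical, and the token budgets are the Part I values reported in the Procedures section.

```python
# Token values in euros, taken from the design described above.
LATER_PRICE = 0.20
SOONER_PRICES = [0.20, 0.19, 0.18, 0.17, 0.16, 0.15, 0.14]
TOKEN_BUDGETS = [100, 200, 300, 400, 800]  # N in Part I


def gross_interest_rate(sooner_price, later_price=LATER_PRICE):
    """Gross return R earned by moving one token to the later date."""
    return later_price / sooner_price


def budget_in_euros(n_tokens, later_price=LATER_PRICE):
    """Total budget m, valued at the later-date token price."""
    return later_price * n_tokens


def rewards(n_sooner, n_later, sooner_price, later_price=LATER_PRICE):
    """Sooner and later rewards (z_t, z_later) in euros for a token split."""
    return sooner_price * n_sooner, later_price * n_later


rates = [round(gross_interest_rate(p), 2) for p in SOONER_PRICES]
budgets = [round(budget_in_euros(n), 2) for n in TOKEN_BUDGETS]
print(rates)
print(budgets)
```

For any allocation that uses all tokens, `R * z_t + z_later` equals `m`, which is the rewritten budget constraint.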
We implement the CTB method by a zTree program (Fischbacher, 2007). Figure 1 shows the interface of a typical decision form. Each decision takes a row. Decisions can be made by scrolling the bars. Once an adjustment is made for one decision, the amounts of the sooner reward and of the later reward in that decision are automatically calculated and displayed.
To avoid any possible effects of initial values, the amounts of rewards are initially blank. Decisions cannot be submitted until all the scroll bars have been adjusted at least once.

Procedures
There are two parts to our experiment. Part I consists of five decision forms, with N = 100, 200, 300, 400, and 800, respectively. The order is randomly drawn for each subject. Subjects can move to a specific decision form by clicking the button with the corresponding number. One can go to any decision form at any time, regardless of whether the current decision form is completed. Decisions are automatically stored when one switches to another decision form. This makes it easy for subjects to compare their choices across total budgets, should they want to do so. Decisions can only be submitted when all 35 decisions are completed.
We randomly assign subjects to one of two treatment groups. In the Present Group, the sooner date is today while the later date is 4 weeks from today, i.e., $t = 0$ and $\tau = 28$. In the Delayed Group, the sooner date is 4 weeks from today while the later date is 8 weeks from today, i.e., $t = 28$ and $\tau = 28$. Comparing the two groups enables us to check if there exists a present bias on average and, more importantly, if there exists a magnitude effect when no rewards are available in the present. 4
We also test the intertemporal independence of preferences over monetary rewards. 5 One alternative hypothesis is that a subject in the Delayed Group may allocate less to the sooner date if she has allocated a large amount of money to an even sooner date since the desire for extra consumption has already been partly satisfied. A similar hypothesis applies to the Present Group: a subject in the Present Group may allocate less to the later date if she has already allocated a large amount of money to an even later date since the guilt for not saving has been partially released.
Fig. 1 The interface of a typical decision form in Part I
4 Identifying present bias would be possible under weaker assumptions by varying the presence of a front-end delay within subjects. We chose a between-subject design on this dimension to keep the experiment relatively short and to focus on within-subject variations of the interest rate and the budget size, which are central to our analysis.
5 The property that preferences over alternatives are not affected by a common increase in outcome in the same period when uncertainty is not involved is often called "time separability" in the early literature on time preference. However, when uncertainty is considered, the term time separability also has the meaning of correlation neutrality (as opposed to correlation aversion or seeking) and hence its use may cause confusion. We thank an anonymous referee for pointing this out.
If preferences are intertemporally dependent, the use of a model with a time-separable preference is likely to be problematic. Thus, we want to test the hypothesis of intertemporal independence before we perform a parametric estimation with a time-separable model.
We use Part II to test intertemporal independence. It is composed of an extended CTB decision form with seven decisions. Subjects are asked to allocate 400 tokens to three dates: today, 4 weeks from today, and 8 weeks from today. One additional restriction is imposed, depending on which group one is in. A subject in the Present Group can allocate either 0 or 200 tokens to 8 weeks from today; she cannot choose other numbers. But she is still free to allocate any number of tokens between today and 4 weeks from today. Similarly, a subject in the Delayed Group can allocate either 200 or 400 tokens to today. She is still free to allocate any number of the remaining tokens (if any remain) between 4 weeks from today and 8 weeks from today. The restrictions and the returns to one token allocated are shown in Table 1.
The additional date (8 weeks from today for the Present Group or today for the Delayed Group) is accompanied by a very high return for the Present Group and a very low return for the Delayed Group so that subjects are induced to allocate 200 tokens to this additional date. If they do so, the remaining task is equivalent to the one with a total budget of 200 tokens in Part I. This characteristic makes the two decision forms comparable.
We do not directly give a fixed reward on the additional date. This is because a fixed reward might be mentally isolated from the allocation task due to narrow bracketing, and hence the test of intertemporal independence in the allocation task may be invalid.
At the end of the experiment, subjects were asked to finish a questionnaire. As in previous studies with the CTB method, we asked about subjects' expenditures in a typical week. The average response was €55.22 per week or €7.89 per day.

Experimental payments
The payments are composed of two parts. First, all subjects receive a €5 participation fee on each of the two dates scheduled in Part I. Second, each subject has a 10% chance to receive earnings from decisions. Before the experiment starts, each subject is randomly given a lottery number, ranging from 0 to 9. After all subjects in a session finish the questionnaire, the experimenter invites one of the subjects to roll a ten-sided die in front of all subjects in the session. Subjects who have a lottery number that equals the die roll get the earnings from decisions. One decision is randomly selected from the 42 decisions in the two parts as the decision that counts. If the decision that counts is from Part I, the allocation in that decision will be realized as the earnings from decisions. If the decision that counts is from Part II, the allocation will be realized and the subject will also receive a €5 participation fee on the additional date in Part II; hence a subject will receive three participation fees if a decision in Part II is realized. All the rules above were stated in the instructions, and the instructions were always read aloud before either part of the experiment.
The earnings were paid by bank transfer to subjects' checking accounts. We ordered the transfers soon after the experiment and sent reminder emails with information about the incoming amounts on the day of the experiment and on all the payment dates. Given the reliability of the banking service, subjects can expect to receive all delayed payments exactly on the appropriate payment dates, while some of the present payments might be received one day after the experimental day due to inter-bank processing.
We believe the payment tool we used was as good as cash in terms of liquidity. Checking accounts are used in private transactions such as paying rent. Checking accounts are also linked to debit cards. In the Netherlands, debit cards are widely used for daily transactions in almost all kinds of stores including supermarkets, university restaurants, and bookstores without any transaction fees. The questionnaire included questions about subjects' use of debit cards. The responses show that bank transfers give high liquidity to the rewards so that no isolation effect should be expected due to the payment method. 6

Transaction costs and credibility of payments
For our experiment, it is extremely important to equalize the transaction costs and the trustworthiness of the payments across periods, because a difference in the transaction costs over the two periods can be a confounding factor of the magnitude effect.
Several measures were employed to equalize the transaction costs across periods and to increase the credibility of the payments. The transaction costs include the costs to collect rewards, to confirm that the rewards have been received with correct amounts, and to remember the earnings so that they can be consumed on the expected dates.
First, we sent reminder emails with information about the incoming amounts on the day of the experiment and all the payment dates. Subjects knew this from the instructions, so they did not need to worry about forgetting the earnings on the payment dates, a situation in which the expected marginal utility of the delayed rewards might be lowered.
Second, as Andreoni and Sprenger (2012) did, we handed out our business cards and told the subjects to contact us immediately in case they did not receive a payment on time. This increased the credibility of payments and also served as a reminder of the payments.
Third, we asked subjects to fill in a payment reminder card with the amounts of their rewards on the corresponding dates just after their earnings were displayed. This served as a second reminder in case they forgot to check their emails.
In sum, the fact that one receives a participation fee on each payment date and the fact that all payments are received by bank transfer help equalize the transaction costs of receiving payments on all dates. At the same time, the business cards, the payment reminder cards, and the reminder emails reduced the risk of forgetting the rewards. The business cards also lowered the perceived default risks. Even though some risk might still be perceived by some subjects, it should be equal across periods since the payment tools and all auxiliary measures were the same.

Sample
Our experiment was conducted at the CentERlab, Tilburg University. 7 In total, 203 students of the university participated in one of the 11 sessions, 94 in the Present Group and 109 in the Delayed Group. Each subject made 42 decisions. One session took one hour and ten minutes on average. In total, 22 subjects got the earnings from decisions, which averaged €69.16. The overall average earning was €17.49.

Hypotheses
Most previous studies define the magnitude effect in single-reward tasks. Denote a reward $z_t$ on a sooner date $t$ by $(z_t, t)$. In a single-reward task, a subject chooses between a sooner reward $(z_t, t)$ and a later reward $(z_{t+\tau}, t+\tau)$, where $\tau > 0$ is the delay. A subject displays a (positive) magnitude effect if for all $z_t > 0$, $z_{t+\tau} > 0$ and $\lambda > 1$,

$$(z_t, t) \sim (z_{t+\tau}, t+\tau) \implies (\lambda z_t, t) \prec (\lambda z_{t+\tau}, t+\tau).$$

In words, the later reward is more favorable if the amounts are scaled up.
We adapt the definition of the magnitude effect to the intertemporal allocation task. A subject makes a choice $z^*(R, m)$ out of a linear budget set $\{(z_t, z_{t+\tau}) : R z_t + z_{t+\tau} = m\}$, where $z^* = (z^*_t, z^*_{t+\tau})$. She displays a (positive) magnitude effect if for all $m > 0$ and $\lambda > 1$,

$$\frac{z^*_{t+\tau}(R, \lambda m)}{\lambda m} \ge \frac{z^*_{t+\tau}(R, m)}{m},$$

and the inequality is strict whenever $z^*_t(R, m) > 0$. In words, people put a larger share of the budget on the later date as the total budget is increased. She may only fail to do so in case the sooner reward is already zero. The adapted definition is consistent with the original one. Options with a larger fraction of later reward become more favorable if all options in the menu are scaled up by the same factor.
Hypothesis 1 (magnitude effect on budget share): $z^*_{t+\tau}/m$ is increasing in $m$.
We are also interested in whether the magnitude effect is affected by the presence of a front-end delay. Benhabib et al. (2010) suggest that a fixed cost of delaying rewards can account for the magnitude effect in single-reward tasks since the fixed cost matters less in case the rewards are scaled up. However, it is not clear if this cost is incurred only when a present reward is delayed or if it applies equally to delaying a future reward. We thus test whether the magnitude effect is smaller or even non-existent if the sooner reward is also in the future.
Hypothesis 2 (a front-end delay leads to a smaller magnitude effect): $z^*_{t+\tau}/m$ changes less with $m$ in the Delayed Group than in the Present Group.
The two hypotheses above can be tested without assuming a specific model. Conditional on finding a positive magnitude effect, we wish to explore the channels of the magnitude effect. If intertemporal independence is supported by our results (which it will be, as we show below), we will estimate the parameters of preferences, with the assumption that subjects maximize a time-separable utility function with power atemporal utility functions and quasi-hyperbolic discounting, i.e., subjects maximize

$$U(z_t, z_{t+\tau}) = \frac{1}{\alpha}(z_t + \omega)^{\alpha} + \beta^{\mathbb{1}_{t=0}} \delta^{\tau} \frac{1}{\alpha}(z_{t+\tau} + \omega)^{\alpha}, \tag{1}$$

where $\beta$ is the present bias parameter, $\delta$ is the daily discount factor, and $\alpha$ is the exponent parameter. $z_t$ and $z_{t+\tau}$ are the sooner reward and the later reward, respectively. $\omega > 0$ is the background consumption mentally integrated with the experimental reward when the decision is made. 8
When the power utility function is assumed, the elasticity of intertemporal substitution, $e_c = \frac{d \ln (c_{t+\tau}/c_t)}{d \ln R}$, is equal to $\frac{1}{1-\alpha}$ ($c_t$ and $c_{t+\tau}$ are the consumption on the sooner date and on the later date, respectively). Thus, the exponent parameter, $\alpha$, is a positive transformation of $e_c$. If $\alpha \to 1$, the atemporal utility function becomes linear, and the elasticity goes to infinity. In that case, subjects just go for the largest present value, and hence rewards are perfectly substitutable between dates. In case $\alpha \to -\infty$, the atemporal utility function is Leontief, and the elasticity goes to zero. In that case, subjects always divide the total budget into two equal amounts. In general, the larger the value of $\alpha$, the more substitutable the subject considers the two rewards to be. Therefore, $\alpha$ is a measure of intertemporal substitutability.
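To see how the exponent parameter governs the responsiveness of choices to the interest rate, the allocation implied by the model can be computed directly. The sketch below is an illustration under stated assumptions, not the estimation code of the paper: the function names are hypothetical, the closed form comes from equating marginal utilities along the budget line $R z_t + z_{t+\tau} = m$, and the latent solution is clipped to the two corners because negative rewards are not allowed.

```python
def optimal_sooner_reward(R, m, alpha, delta, omega,
                          tau=28, beta=1.0, present=False):
    """Sooner reward z_t that maximizes the power-utility objective.

    Assumes alpha < 1 (concave utility) and omega > 0. The value is
    clipped to [0, m / R], the feasible range on the budget line.
    """
    if alpha >= 1:
        raise ValueError("alpha must be below 1 for an interior tangency")
    # Discount applied to the later date; beta only enters if t = 0.
    discount = (beta if present else 1.0) * delta ** tau
    # Tangency: (z_t + omega) / (z_later + omega) = (discount * R)^(1/(alpha-1))
    k = (discount * R) ** (1.0 / (alpha - 1.0))
    latent = (k * (m + omega) - omega) / (1.0 + k * R)
    return min(max(latent, 0.0), m / R)


def later_share(R, m, alpha, delta, omega,
                tau=28, beta=1.0, present=False):
    """Share of the budget m that ends up on the later date."""
    z_sooner = optimal_sooner_reward(R, m, alpha, delta, omega,
                                     tau, beta, present)
    return (m - R * z_sooner) / m


# Illustrative values only: no discounting, omega = 7.89, m = 20, R = 1.25.
print(later_share(1.25, 20.0, alpha=0.9, delta=1.0, omega=7.89))
print(later_share(1.25, 20.0, alpha=-5.0, delta=1.0, omega=7.89))
```

With these illustrative values, a subject with $\alpha = 0.9$ moves the whole budget to the later date, while a subject with $\alpha = -5$ stays close to an even split, which is the sense in which a larger $\alpha$ means higher substitutability.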
It brings several advantages to assume such a model. First, the parameters in this model have important economic meanings. The discount factor determines the average choice across interest rates and hence measures the patience of the subject; if a subject is more patient, she will allocate more tokens to the later date for all interest rates. The intertemporal substitutability of consumption between different points in time relates to the dispersion of the choices across interest rates since it measures how sensitive the choices are to the interest rate. These behavioral measures are hard to estimate without assuming a model. Due to the non-negativity constraint, choices are censored at the corners if the preference parameters are extreme. As a result, directly measuring the average choice (as a measure of patience) and the dispersion of choices (as a measure of intertemporal substitutability) leads to biases. In contrast, the model we assume is tractable and easy to estimate. Moreover, the model is widely used in both theoretical and empirical applications. 9 Given the model above, we test the following two hypotheses.
Hypothesis 3 (magnitude effect on the discount factor): $\delta$ is increasing in $m$.
Hypothesis 4 (magnitude effect on intertemporal substitutability): $\alpha$ is increasing in $m$.
8 The atemporal utility function here satisfies the subproportionality defined by Loewenstein and Prelec (1992) if and only if $\omega > 0$. Thus, the magnitude effect on the monetary discount rate is guaranteed as long as the background consumption is positive.
9 To address the concern about misspecification, in Online Appendix D, we check the robustness of our results by estimating the background consumption parameter. The estimate can be a mixture of real background consumption and an inherent shift of the utility function. Depending on the sign of the shift, the elasticity of intertemporal substitution could be increasing or decreasing in consumption, or be constant. This kind of flexibility is especially important when the stake is varied in the experiment. The results are the same.

Overall effects

Magnitude effect on budget share
In our data, 28% of the choices are interior, and 62% of our subjects make at least one interior choice. This is very close to the 30% and 63%, respectively, in Andreoni and Sprenger (2012). The relationships between the budget shares and the interest rates are also similar. A discussion about the rationality of subjects in the CTB task is provided in Appendix A.
In Fig. 2 we plot the mean budget share allocated to the sooner date against the gross interest rate, R , of each CTB decision in Part I. We plot separate points for the five total budgets ( m = €20, €40, €60, €80, €160). The budget share allocated to the sooner date declines with the total budget.
The difference seems to be larger when the interest rate is smaller but still positive. This is mainly due to censoring. When the interest rate is zero ( R = 1 ) or the highest ( R = 1.43 ), most choices are at the corners for both smaller and larger total budgets.
To judge whether there is a significant magnitude effect, we perform Hotelling's T-squared tests on the mean differences in budget shares between total budgets, taking seven choices with the same total budget as a vector (see Table 2). 10 The null hypothesis is that the means of choices are the same across total budgets, taking into account the within-subject correlation. This class of tests makes sense because individual heterogeneity may have made different subjects reveal magnitude effects on tasks with different interest rates (e.g., Subject 1 on Interest Rate 1 while Subject 2 on Interest Rate 2) so that the magnitude effects on all choices would be jointly significant, but the effect on choices with any single interest rate might be insignificant. The results show that the magnitude effect is significant between the total budgets of €20 and €40 and between any two non-adjacent total budgets. These results support Hypothesis 1, which states that a larger share of the budget is allocated to the later date when the size of the budget increases. 11
The results also show that the differences are insignificant between adjacent total budgets larger than €20. Since the allocation is monotonic in the total budget and the differences are significant between non-adjacent total budgets, the insignificance suggests that the magnitude effect is the greatest when comparing between the smallest total budgets (€20 and €40) and becomes smaller for larger stakes. The pattern is consistent with the fact that Andersen et al. (2013) found a "statistically significant" but "not economically significant" magnitude effect when they elicited time preferences using very high stakes. 12
Table 2 Hotelling's T-squared tests on the mean differences in the budget shares between two total budgets for all gross interest rates
                    €40          €60          €80          €160
€20   F-statistic   3.3184***    5.1162***    6.3670***    6.0477***
Notes: 203 sets of observations for each total budget. The degrees of freedom of the F-statistics are (7, 196). ***, **, and * indicate significance at the 1% level, 5% level, and 10% level, respectively. The results are robust against multiple hypotheses testing problems since performing a Holm-Bonferroni correction on any family of independent hypotheses does not change the significance at the 1% level.
10 Hotelling's T-squared test is asymptotically nonparametric, so it can be applied to a large sample in nonnormal cases. We also perform a multivariate signed-rank test (Oja and Randles 2004) and the results are basically the same: the magnitude effects are significant between the stakes of €20 and €40 and between any two non-adjacent stakes at the 5% level for the entire sample and significant at the 10% level for either group. The Holm-Bonferroni correction is already performed.
11 In Online Appendix B, we present the results of group-wise tests. The results are the same.
12 Andersen et al. (2013) compare stakes of 1,500 Danish kroner and 3,000 Danish kroner, which are roughly equivalent to €200 and €400, respectively.
In Online Appendix B, we show group-wise results of the same tests. The results are the same.
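For readers who want to see how a paired Hotelling's T-squared statistic of this kind is formed, a minimal sketch follows. It is not the authors' code and uses no data from the experiment: the function names are hypothetical, each input row is assumed to hold one subject's differences in budget shares between two total budgets (one column per interest rate), and only the statistic, its F transformation, and the degrees of freedom are computed. A p value would additionally require an F distribution function, for instance `scipy.stats.f.sf`.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(aug[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def f_degrees_of_freedom(n, p):
    """Degrees of freedom of the F statistic for n subjects, p dimensions."""
    return (p, n - p)


def paired_hotelling(diffs):
    """Return (T2, F, df) for H0: the mean difference vector is zero.

    diffs: list of rows, one per subject, one column per interest rate.
    """
    n, p = len(diffs), len(diffs[0])
    if n <= p:
        raise ValueError("need more subjects than dimensions")
    mean = [sum(row[j] for row in diffs) / n for j in range(p)]
    # Sample covariance matrix of the within-subject differences.
    cov = [[sum((row[j] - mean[j]) * (row[k] - mean[k]) for row in diffs)
            / (n - 1) for k in range(p)] for j in range(p)]
    weights = solve_linear_system(cov, mean)  # S^{-1} times the mean vector
    t2 = n * sum(m_j * w_j for m_j, w_j in zip(mean, weights))
    f_stat = (n - p) / (p * (n - 1)) * t2
    return t2, f_stat, f_degrees_of_freedom(n, p)
```

With 203 subjects and seven interest rates, `f_degrees_of_freedom(203, 7)` gives (7, 196), which matches the degrees of freedom reported for Table 2.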

Conditional on the presence of an immediate reward?
We test if the magnitude effect is affected by the existence of a front-end delay. First, Table B.1 shows the results of Hotelling's T-squared tests on the magnitude effects for the Present Group and the Delayed Group, respectively. We find significant magnitude effects in both groups. This implies that the presence of an immediate reward is not a necessary condition for the magnitude effect.
Second, we plot separate graphs for the two groups in Fig. 3. Subjects in the Delayed Group seem to be slightly more patient than those in the Present Group. However, when we perform Hotelling's T-squared test on all the 35 decisions in Part I between groups, the null hypothesis that the two groups have the same mean responses is not rejected. The p value is 0.2424, with degrees of freedom (35, 167). Thus, we find no evidence that the magnitude effect is affected by the existence of a front-end delay, and we reject Hypothesis 2. 13 This finding has an implication for the modeling of the magnitude effect: no matter what generates the magnitude effect, it applies equally to situations with an immediate reward and those without. For instance, if it is a fixed cost of delaying rewards that generates the magnitude effect, as proposed by Benhabib et al. (2010), the cost applies equally to delaying an immediate reward and to delaying a future reward.

Intertemporal independence
Our results show that Part II is a valid test of intertemporal independence since most subjects chose 200 tokens for the additional date in Part II. Only nine out of 203 subjects allocated a number other than 200 to the additional date, which involved 41 (2.9%) out of 1415 decisions.
After removing those decisions, we compare the choices with the total budget of €40 between Part I and Part II, separately for each group. Table 3 shows that Hotelling's T-squared tests fail to reject the null hypothesis that responses to the two parts have the same means. Those results support intertemporal independence, which will be assumed in the next section. 14

Channels
To disentangle the magnitude effect into two channels, we perform parametric estimations both at the aggregate level and at the individual level. We then test if the preference parameters change with the size of the total budget.
Table 3 Hotelling's T-squared tests on the mean differences in the budget shares in the decisions with the total budget of €40 between Part I and Part II
Notes: Subjects who chose a number other than 200 tokens for the additional date in Part II, such that their choices were not comparable between the two parts, have been removed from the sample. ***, **, and * indicate significance at the 1% level, 5% level, and 10% level, respectively.


Estimation strategy
In our main specification, we assume a time-separable utility function with power atemporal utility functions as in Eq. (1). We set $\omega$ (background consumption) equal to the average response to the question about one's typical daily expenditure, €7.89, as Andreoni and Sprenger (2012) did in two of their specifications. 15 Given the intertemporal utility function, solving the optimization problem yields the tangency condition

$$\left(\frac{z_t + \omega}{z_{t+\tau} + \omega}\right)^{\alpha - 1} = \beta^{\mathbb{1}_{t=0}} \delta^{\tau} R.$$

Taking logs gives a linear equation

$$\ln\left(\frac{z_t + \omega}{z_{t+\tau} + \omega}\right) = \frac{\ln \beta}{\alpha - 1} \mathbb{1}_{t=0} + \frac{\ln \delta}{\alpha - 1} \tau + \frac{1}{\alpha - 1} \ln R,$$

where $\mathbb{1}_{t=0}$ is the indicator for the Present Group.
The parameters to be estimated are the present bias parameter, $\beta$, the discount factor, $\delta$, and the power curvature parameter, $\alpha$. The present bias parameter is identified by the differences in allocation between the Present Group and the Delayed Group. If there is a present bias, subjects in the Present Group will allocate more tokens to the sooner date than those in the Delayed Group. The discount factor is identified by one's average choice across different experimental interest rates. A more patient subject will allocate more tokens to the later date in all decisions. The curvature parameter is identified by the dispersion of one's choices across interest rates. Those who consider rewards highly substitutable over time are likely to make corner choices in all decisions, while those with lower elasticity of intertemporal substitution will make choices closer to equal splits.
15 Fixing the background consumption across subjects has the advantage that all effects come from the variation in choices rather than also from the variation in self-reported background consumption, which may be noisy. We check robustness by setting ω to individual background consumption and to average or individual background consumption combined with the participation fee (see Online Appendix C). The results are essentially the same.

Following the practice in previous studies (Andreoni & Sprenger, 2012; Augenblick et al., 2015), we assume a normally distributed error term that is additive to the log-consumption ratio and account for censoring, which yields the two-limit Tobit model:
where i = 1, …, 203 denotes subject i, j = 1, …, 7 denotes interest rate j, and k = 1, …, 5 denotes total budget k. The error term is allowed to vary across total budgets, since giving a larger number of tokens might induce larger noise, which would be a competing explanation for a greater sensitivity to the interest rate.

The Tobit model predicts corner choices and gives them a natural interpretation. When the background consumption is positive, the marginal utility at a zero reward is finite. If the implied interest rate is much higher than the discount rate, the model predicts a latent choice with a negative budget on the sooner date: the individual would be willing to give up part of her background wealth on the sooner date to earn a larger amount on the later date. In the experiment, however, she is not allowed to do that, and the Tobit model captures this by censoring at the later corner. The opposite case occurs when the implied interest rate is much lower than the discount rate. Then the individual would be willing to borrow money from the experimenter if she could, but her choice is censored at the sooner corner. 16

The model is estimated by the quasi-maximum-likelihood method: when performing the estimation, the error term is assumed to be i.i.d., while in computing the standard errors, the error term is assumed to be independent across subjects but possibly correlated within subject. Estimates of the structural parameters are recovered from the estimated coefficients, and their standard errors are obtained by the delta method.
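The censoring logic described above can be sketched as the log-likelihood contribution of a single observation in a two-limit Tobit model. This is a minimal illustration rather than the estimation code used in the paper: the function names are hypothetical, and the corner values assume a budget constraint gross_rate * x_sooner + x_later = budget.

```python
import math


def norm_cdf(z):
    """Standard normal CDF via the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def censoring_bounds(budget, gross_rate, omega=7.89):
    """Log ratio ln((x_sooner + omega) / (x_later + omega)) at the two corners.

    Assumes gross_rate * x_sooner + x_later = budget (illustrative convention).
    """
    lower = math.log(omega / (budget + omega))               # everything later
    upper = math.log((budget / gross_rate + omega) / omega)  # everything sooner
    return lower, upper


def tobit_loglik(y, mu, sigma, lower, upper):
    """Log-likelihood of one observation in a two-limit Tobit model.

    y is the observed log ratio, mu the model prediction, and sigma the
    standard deviation of the normal error (it may differ by total budget).
    """
    if y <= lower:  # censored at the later corner
        return math.log(norm_cdf((lower - mu) / sigma))
    if y >= upper:  # censored at the sooner corner
        return math.log(norm_cdf((mu - upper) / sigma))
    z = (y - mu) / sigma  # interior choice: normal density of the residual
    return -0.5 * z * z - 0.5 * math.log(2.0 * math.pi) - math.log(sigma)


lo, hi = censoring_bounds(budget=40.0, gross_rate=1.0)
print(tobit_loglik(0.0, 0.1, 0.5, lo, hi))  # interior observation
print(tobit_loglik(lo, -3.0, 0.5, lo, hi))  # corner choice, latent value beyond it
```

Summing these contributions over all subjects, interest rates, and budgets gives the objective that a quasi-maximum-likelihood routine would maximize.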
Since we are interested in the magnitude effect, we also perform the estimation with interaction terms of the parameters and the budget dummies. Thus, tests can be performed on the differences between the parameters for different total budgets.
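To make the recovery of parameters and the tests on their differences concrete, the sketch below maps log-linear coefficients to (β, δ, α), applies the delta method, and computes a Wald p-value for a single restriction. The coefficient names, the mapping (which assumes the Andreoni and Sprenger (2012) log-linear form), and all numbers are illustrative assumptions, not values from the paper.

```python
import math


def recover(g_beta, g_delta, g_rate):
    """Map log-linear coefficients to (beta, delta, alpha).

    Assumes g_beta = ln(beta)/(alpha-1), g_delta = ln(delta)/(alpha-1) per
    unit of delay, and g_rate = 1/(alpha-1).
    """
    alpha = 1.0 + 1.0 / g_rate
    return math.exp(g_beta / g_rate), math.exp(g_delta / g_rate), alpha


def delta_method_se(gradient, cov):
    """Standard error of h(theta) from the gradient of h and Cov(theta)."""
    n = len(gradient)
    var = sum(gradient[i] * cov[i][j] * gradient[j] for i in range(n) for j in range(n))
    return math.sqrt(var)


def alpha_gradient(g_beta, g_delta, g_rate):
    """Gradient of alpha = 1 + 1/g_rate with respect to the coefficients."""
    return [0.0, 0.0, -1.0 / g_rate ** 2]


def delta_gradient(g_beta, g_delta, g_rate):
    """Gradient of delta = exp(g_delta/g_rate) with respect to the coefficients."""
    d = math.exp(g_delta / g_rate)
    return [0.0, d / g_rate, -d * g_delta / g_rate ** 2]


def wald_p_value(difference, se):
    """P-value of a Wald test with one restriction (chi-square, 1 df)."""
    w = (difference / se) ** 2
    return math.erfc(math.sqrt(w / 2.0))


# Made-up coefficients and covariance matrix, for illustration only.
coefs = (0.0, 0.01, -5.0)
cov = [[1e-4, 0.0, 0.0], [0.0, 1e-6, 0.0], [0.0, 0.0, 0.04]]
beta, delta, alpha = recover(*coefs)
se_alpha = delta_method_se(alpha_gradient(*coefs), cov)
se_delta = delta_method_se(delta_gradient(*coefs), cov)
print(alpha, se_alpha, delta, se_delta)
```

When the parameters for two budgets are estimated jointly, the standard error of their difference should also account for the covariance between the two estimates; the same `delta_method_se` function applies with the gradient of the difference.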
To see why the measure of intertemporal substitutability is robust against misspecification, notice that α is identified from the sensitivity of the log-ratio of consumption on the two dates to the logarithm of the gross interest rate (i.e., ln R_j). This is approximately the sensitivity of the percentage change in the ratio of consumption to the percentage change in the relative price, which is exactly the definition of the elasticity of intertemporal substitution. Therefore, even if the utility function is misspecified, α is still a measure of intertemporal substitutability. In Online Appendix D, we assume a more flexible specification, in which the utility function has one additional free parameter: the background consumption, ω, is also estimated. In this way, we address the concern that the average self-reported background consumption may not match the true background consumption integrated with the experimental rewards in decision making, or that the elasticity of intertemporal substitution of the utility function may not be constant (i.e., the power utility function with a fixed consumption shift is misspecified). The results are essentially the same.

16 An alternative way to model the idiosyncratic shock is to assume a Logit model. We estimate both error structures and find that the Tobit model predicts better than the Logit model, especially for corner choices. Details are available upon request.

Table 4 reports the magnitude-invariant estimates and the magnitude-specific estimates of the parameters. A salient feature is that none of the estimates of β is significantly different from 1, implying no evidence of present bias, which is consistent with our finding in the model-free analysis. The annual discount rate for all the budgets together is 52.7%, which is in the range found by previous studies. The curvature parameters are always significantly smaller than 1, implying that the subjects on average consider the monetary rewards received on different dates imperfectly substitutable, which is also consistent with other studies (e.g., Andreoni & Sprenger, 2012; Andreoni et al., 2015; Augenblick et al., 2015; Cheung, 2020).

Table 4
Discounting and curvature parameter estimates in the aggregate-level estimation. Two-limit Tobit estimators. Power utility function with ω = 7.89 (average reported background consumption). Column 1: assuming that parameters are invariant to total budgets. Columns 2-6: assuming that parameters vary with total budgets. Clustered standard errors in parentheses. Log-likelihood has been corrected for the transformation of dependent variables. Standard errors are calculated via the delta method.

Most importantly, both the discount factor and the curvature parameter are increasing in the stake. To judge whether these magnitude effects are significant, Table 5 presents Wald tests on the differences in parameters between total budgets. We find significant magnitude effects both on the discount factor, δ, and on the exponent parameter, α, which is a positive transformation of the elasticity of intertemporal substitution. The discount factor is increasing in the total budget, meaning that the decision weights on later rewards shift upward when the total budget increases. The elasticity of intertemporal substitution is increasing in the total budget, meaning that the rewards on the two dates are more substitutable to the subjects when a larger total budget is provided. This results in choices closer to the two corners (which corner depends on whether R > 1). We thereby verify Hypothesis 3 and Hypothesis 4.

Table 5
Estimates of parameter differences between total budgets in the aggregate-level estimation. Estimates of parameter differences are inferred from the two-limit Tobit estimation by the delta method. The estimation assumes power utility with ω = 7.89. Separate parameters are estimated for each total budget among €20, €40, €60, €80, and €160. There are 1,421 observations (203 clusters) for each total budget. Clustered standard errors in parentheses. ***, **, and * indicate significance at the 1%, 5%, and 10% levels, respectively.
To get an idea of the size of the magnitude effects, we compare the discount rates and the predicted monetary discount rates, respectively, between stakes. The continuous annual discount rates at stakes of €20, €40, €60, €80, and €160 are 0.696, 0.519, 0.384, 0.370, and 0.237, respectively. Thus, the discount rate is 1.3 to 1.6 times larger when the stake is halved. This is larger than the effects found by Andersen et al. (2013): their discount rate at a stake of 1,500 Danish kroner is only 1.0 to 1.1 times larger than that at a stake of 3,000 Danish kroner. As we mentioned earlier, the difference is consistent with the pattern that the magnitude effect becomes smaller at higher stakes. To incorporate the magnitude effect on the utility curvature, we predict the monetary discount rates at the lowest and the highest stakes. The monetary discount rate at the stake of €20, for example, is the continuous annual discount rate implied by an indifference relation between a later reward of €20 and an equally good sooner reward, assuming linear utility. We find that the monetary discount rate at €20 is 1.058 and that at €160 is 0.257. The former is 4.1 times larger than the latter. Therefore, the two channels of the magnitude effect between the stakes of €20 and €160 are both large. For comparison, Halevy (2015) finds that the monetary discount rates measured with a budget of $10 are 1.4 to 1.9 times larger than those with a budget of $100.
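The ratios quoted in this paragraph follow directly from the reported rates. The short check below uses only the numbers given in the text; the variable names are ours.

```python
# Continuous annual discount rates by stake (euros), as reported in the text.
annual_rate = {20: 0.696, 40: 0.519, 60: 0.384, 80: 0.370, 160: 0.237}

# Ratio of the rate at a stake to the rate at twice that stake.
halving_ratio = {
    stake: annual_rate[stake] / annual_rate[2 * stake]
    for stake in annual_rate
    if 2 * stake in annual_rate
}

# Monetary discount rates at the lowest and highest stakes, as reported.
monetary_ratio = 1.058 / 0.257

for stake, ratio in halving_ratio.items():
    print(f"{stake} vs {2 * stake}: {ratio:.2f}")
print(f"monetary, 20 vs 160: {monetary_ratio:.2f}")
```

Three stake pairs differ by a factor of two (€20 and €40, €40 and €80, €80 and €160), and their ratios span the 1.3 to 1.6 range stated above.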
To illustrate the relative importance of the two channels of the magnitude effect, we use the estimates above to predict choices in the 35 questions for both the Present Group and the Delayed Group. Table 6 presents the marginal effects of allowing one parameter to vary with the total budget: in each row, we allow only one parameter, either δ or α, to vary with the total budget of the decisions (as indicated by the column title), but fix the other two parameters at the values estimated from the budget of €20. Each number in a cell is the total change (in units of N_k/100, i.e., 1% of the total budget) in the seven decisions with the corresponding total budget. The results show that the marginal effect of allowing α to vary with the total budget is at least as large as the marginal effect of allowing δ to vary. This suggests that the magnitude effect on the elasticity of intertemporal substitution is at least as important as the magnitude effect on the discount rate.

Table 6
Marginal effects of allowing a parameter to vary with total budgets in the aggregate-level estimation. The changes in choices predicted by the Tobit model using the parameter values indicated by the row title, compared with (β_1, δ_1, α_1), for the two groups separately. k in the row titles stands for the total budget in the column title. For instance, the first cell in the first row is the difference between the choices made in the seven decisions with the total budget of €40 predicted by the model with parameter values (β_1, δ_2, α_1) and those predicted by the model with parameter values (β_1, δ_1, α_1). In other words, it is the marginal effect of allowing δ to vary with the total budget from €20 to €40. The unit is 1% of the total budget.

Individual-level estimation
The aggregate-level estimation provides evidence of positive magnitude effects on the discount factor and intertemporal substitutability. One may wonder whether these results are purely a compositional effect: a bias resulting from forcing all subjects to have the same preferences and the same distribution of noise. To deal with this concern, we also perform individual-level estimation and tests.

Estimation and testing procedure
We keep all the assumptions that underlie Eq. (1) except for β, since it is not identified in individual-level estimations. We estimate the discount factor (δ) and the intertemporal substitutability (α) for each combination of subject and stake, and then test whether the two parameters are increasing in the magnitude within subject.
One important difference from the aggregate-level estimation is that under-identification occurs when a subject made no or only one interior choice at a given stake. This is the case for 627 out of 1015 (62%) subject-stake combinations. We therefore adopt a conservative way to test the magnitude effect. First, we obtain point estimates of δ and α where possible by estimating the Tobit model specified in Sect. 5.1. Whenever there is an under-identification problem, we remove the error term from the model and infer the intervals of δ and α that can generate the observations. Second, we perform a one-sided sign test on each of the two parameters, with the null hypothesis that it does not change with the magnitude. The sign test is flexible in that it does not impose assumptions on the distributions of idiosyncratic shocks to the parameters. For a comparison between a point estimate and an interval estimate, we recognize a difference only if the point is not in the interior of the interval. For a comparison between two interval estimates, we recognize a difference only if the two intervals do not overlap. 17

17 To illustrate the procedure of our test, consider a numerical example. Suppose a subject made seven interior choices when the total budget is €20 and seven corner choices when the total budget is €160. In the former case, point estimates can be obtained through the Tobit estimation: for example, δ̂ = 0.761 and α̂ = 0.830. In the latter case, the observations can only be generated by δ̂ ∈ [0.95, 1] and α̂ ∈ [0.992, 1]. Since the point estimate of α̂ at the stake of €20 lies to the left of the range of feasible values of α̂ at the stake of €160 (i.e., 0.830 < 0.992), we take it that α̂ increases as the total budget rises from €20 to €160. Suppose, instead, that the point estimate of α̂ at the stake of €20 fell inside the range of feasible values at the stake of €160, with that range kept the same. We would then take α̂ to be "unchanged" as the total budget rises from €20 to €160.

We also categorize the subjects based on how their preference parameters change with the stake. For each parameter, a subject presents values for five stakes, allowing for 10 pairwise comparisons across stakes. We use two criteria for our categorization. Under the first criterion, subjects who present no magnitude effect are those who have "unchanged" results in all 10 comparisons across stakes; subjects who present positive magnitude effects are those who have no "decrease" and at least one "increase"; subjects who present negative magnitude effects are those who have no "increase" and at least one "decrease"; and the remaining subjects present a mixed magnitude effect. Arguably this criterion is too stringent in that it does not allow for any error. Our second criterion allows for error: subjects who present positive magnitude effects are those who have more "increases" than "decreases", and subjects who present negative magnitude effects are those who have more "decreases" than "increases".

Table 7 shows the results of the tests at the individual level. We reject the null hypotheses of no magnitude effect on the two parameters, in favor of positive magnitude effects. This shows that the two channels of the magnitude effect on intertemporal choices are robust against individual heterogeneity. Table 8 shows the results of subject categorization. Under both criteria, subjects who present positive magnitude effects are much more frequent than those who present negative magnitude effects. This is consistent with our main finding that positive magnitude effects exist. 18
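The comparison rule, the two categorization criteria, and a one-sided sign test can be sketched as follows. This is an illustrative reading of the procedure, not the authors' implementation: the function names, the treatment of touching intervals and of ties under the second criterion, and the example estimates are assumptions.

```python
from itertools import combinations
from math import comb


def compare(low_stake, high_stake):
    """Return +1 (increase), -1 (decrease) or 0 (unchanged).

    Each estimate is a (lower, upper) pair; a point estimate has lower == upper.
    """
    a_lo, a_hi = low_stake
    b_lo, b_hi = high_stake
    a_point, b_point = a_lo == a_hi, b_lo == b_hi
    if a_point and b_point:
        return (b_lo > a_lo) - (b_lo < a_lo)
    if a_point or b_point:
        # A point differs from an interval unless it lies in the interior.
        if a_hi <= b_lo:
            return 1
        if b_hi <= a_lo:
            return -1
        return 0
    # Two intervals differ only if they do not overlap. Treating intervals
    # that merely touch as overlapping is an assumption made here.
    if a_hi < b_lo:
        return 1
    if b_hi < a_lo:
        return -1
    return 0


def classify_strict(signs):
    """First criterion: no errors allowed."""
    ups, downs = signs.count(1), signs.count(-1)
    if ups == 0 and downs == 0:
        return "none"
    if downs == 0:
        return "positive"
    if ups == 0:
        return "negative"
    return "mixed"


def classify_lenient(signs):
    """Second criterion: majority of increases versus decreases."""
    ups, downs = signs.count(1), signs.count(-1)
    if ups > downs:
        return "positive"
    if downs > ups:
        return "negative"
    return "undetermined"  # the text does not say how ties are handled


def sign_test_p_value(n_increase, n_decrease):
    """One-sided p-value P(X >= n_increase), X ~ Binomial(n, 1/2), ties dropped."""
    n = n_increase + n_decrease
    return sum(comb(n, x) for x in range(n_increase, n + 1)) / 2 ** n


# Hypothetical estimates of one parameter for one subject at the five stakes.
estimates = {20: (0.830, 0.830), 40: (0.850, 0.850), 60: (0.800, 1.000),
             80: (0.900, 0.900), 160: (0.992, 1.000)}
signs = [compare(estimates[a], estimates[b]) for a, b in combinations(sorted(estimates), 2)]
print(signs, classify_strict(signs), classify_lenient(signs))
```

Five stakes give 10 pairwise comparisons, matching the count in the text. How comparisons are pooled across subjects for the sign test is not specified here, so `sign_test_p_value` only shows the binomial calculation for given counts.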

Interpretations
The results above imply that when an average subject faces a larger budget in an intertemporal allocation task, she behaves more patiently and also regards rewards as more substitutable between dates.

Relation with the magnitude effect on risk aversion
According to the Discounted Expected Utility (DEU) theory, when the atemporal utility function is a power function, the risk attitude and the elasticity of intertemporal substitution are represented by the same parameter, since risk aversion and imperfect fungibility both originate from diminishing marginal utility. Therefore, one may wonder whether the magnitude effect on intertemporal substitutability is the same as the magnitude effect on risk attitudes.
We find evidence against this equivalence when we compare our results with previous findings. Studies using risky prospects find that the Arrow-Pratt measure of relative risk aversion (Pratt, 1964) is larger when outcomes are scaled up (Binswanger, 1981;Kachelmeier & Shehata, 1992;Holt & Laury, 2002;Harrison et al., 2005;Fehr-Duda et al., 2010;Bouchouicha & Vieider, 2017). This is in the direction that is opposite to what we find. Increasing relative risk aversion suggests an increase in the concavity of a power utility function as the stake increases while our results show a movement towards linearity. This contradiction suggests that the magnitude effect on relative risk aversion is not driving the magnitude effect on intertemporal substitutability.
Some other studies also suggest that risk aversion and intertemporal substitutability should be separated. Andreoni and Sprenger (2012) found no significant correlation at the individual level between the curvature estimated by the CTB method and the risk attitude elicited by the MPL method. Abdellaoui et al. (2013b), Miao and Zhong (2015), and Cheung (2015) also found that the utility curvature elicited from intertemporal tasks is quantitatively different from that elicited from tasks with risk. We provide evidence from a different perspective: while the previous studies showed that the degrees of concavity are different for the two kinds of utility functions, we show that the degrees of concavity change in opposite directions when the stake is varied.
This finding has implications for both theories and experimental methods. First, it lends support to the theories which separate intertemporal substitutability from risk aversion (Kihlstrom & Mirman, 1974; Richard, 1975; Kreps & Porteus, 1978; Epstein & Zin, 1989; Weil, 1990; Bommier, 2007; Ebert & van de Kuilen, 2016, etc.). Second, it casts doubt on the use of a risk-elicitation task to correct for the curvature when eliciting time preferences.

Relation with borrowing constraints
In theory, a binding borrowing constraint can lead to a magnitude effect on the monetary discount rate in a single-reward task if the background consumption is expected to grow over time, as shown by Epper (2015). However, Meier and Sprenger (2010) found that experimentally elicited long-run discount rates are uncorrelated with credit constraints, suggesting that on average, whether the borrowing constraint is binding does not affect intertemporal choices in experiments where all outcomes are in the future.
Moreover, given that subjects may have savings that provide some, but limited, liquidity, the number of subjects whose borrowing constraints are binding should be increasing in the stake. For this reason, if the borrowing constraint were the main issue, we should observe that intertemporal substitutability is decreasing in the stake, which is inconsistent with our results. Therefore, we believe that a binding borrowing constraint is not the main driver of our results.

… experimental reward as the size of the reward increases. If the budget increases, individuals 'zoom out', as it were, and take a broader perspective on the decision problem. One reason may be that individuals are likely to divide and use up a bigger windfall over a longer period. Based on the data collected from their field experiment with Malawian peasants, Holden and Quiggin showed that the magnitude effect on the discount rate in single-reward tasks would disappear if the unobserved background consumption is assumed to be an increasing function of the stake.
In intertemporal allocation tasks, the increasing background consumption can generate a magnitude effect on intertemporal substitutability. To see why, denote the observed elasticity of intertemporal substitution in experimental rewards by e_z. In the relationship between e_z and the preference parameters, e_z is increasing in both α and ω, so an increase in α and an increase in ω are competing explanations for the magnitude effect on intertemporal substitutability. If subjects take into account more background consumption as the total budget increases, we would observe a greater sensitivity to the interest rate, i.e., a greater e_z. When we assume a fixed background consumption, however, the pattern will be attributed to a magnitude effect on α.
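To illustrate why e_z rises with both the curvature parameter and the background consumption, the following sketch derives the elasticity under assumptions that are ours rather than necessarily the paper's: power utility over z + ω with exponent α < 1, rewards z_t and z_{t+τ} on the two dates, no present bias, and evaluation at an equal split z_t = z_{t+τ} = z > 0 (which is optimal when δ^τ R = 1). The exact expression used by the authors may differ.

```latex
\ln\frac{z_{t+\tau} + \omega}{z_t + \omega}
  = \frac{1}{1-\alpha}\,\ln\!\left(\delta^{\tau} R\right)
\;\Longrightarrow\;
\frac{d \ln\!\left[(z_{t+\tau} + \omega)/(z_t + \omega)\right]}{d \ln R}
  = \frac{1}{1-\alpha},
\qquad
e_z \equiv \left.\frac{d \ln\!\left(z_{t+\tau}/z_t\right)}{d \ln R}\right|_{z_t = z_{t+\tau} = z}
  = \frac{1}{1-\alpha}\cdot\frac{z + \omega}{z}.
```

The last step uses d ln(z_{t+τ}/z_t) = (dz_{t+τ} − dz_t)/z and d ln[(z_{t+τ}+ω)/(z_t+ω)] = (dz_{t+τ} − dz_t)/(z+ω) at the equal split. For α < 1 and z > 0, the resulting expression is increasing in both α and ω, which is the property the argument relies on.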
Both the mental-accounting Fudenberg-Levine model and the mental zooming theory point to partial integration with lifelong wealth, which seems to be an important mechanism of the magnitude effect on intertemporal substitutability. Andersen et al. (2018) showed empirically that subjects only partially integrate experimental rewards with wealth in risk preference tasks. While they provide evidence of partial asset integration by exploiting variation in personal wealth, we provide within-subject evidence suggesting that the degree of asset integration is increasing in the stake.
None of the current models can explain both a magnitude effect on the discount factor and a magnitude effect on the intertemporal substitutability. Of course, the two channels can be explained by a mode-switching model in which individuals are assumed to have different preferences for different stakes. However, a truly unified explanation is still lacking.

Conclusion
Our study investigates the magnitude effect on intertemporal choices in a setting that is more general than a single-reward task, namely the intertemporal allocation task. After adapting the definition of the magnitude effect to the new task, we verify its existence: people allocate a larger share of the budget to the later reward as the total budget increases. The magnitude effect is not affected by whether the sooner reward is immediate or in the future. The size of the magnitude effect is smaller when the stakes are higher.

Appendix A. Rationality of subjects in the Convex Time Budget method
The CTB method allows subjects to make interior choices and hence makes it possible to measure discount rates and utility curvature simultaneously. However, Chakraborty et al. (2017) found that a proportion of subjects, especially those who make interior choices, violate wealth monotonicity in the CTB dataset of Andreoni and Sprenger (2012) and that the magnitude of wealth monotonicity violations conditional on violating at least once is as large as that generated by uniform random choice, and hence questioned the rationality of subjects in making CTB decisions.
In this appendix, we follow Chakraborty et al. (2017) to examine price monotonicity and wealth monotonicity in our dataset. Specifically, we look at the fractions of monotonicity violations among all subjects and among subjects who make at least one interior choice, respectively. We also measure the average magnitude of wealth monotonicity violations for those who violate wealth monotonicity at least once, and we compare it with the distribution of the magnitude generated by uniform random choice.
One advantage of our sample in testing wealth monotonicity is that we have 70 pairs of tasks that only differ in total budget for each subject, compared to one pair in Andreoni and Sprenger (2012). This endows our test with greater power than that in Chakraborty et al. (2017). Table 9 shows the rationality indices of the full sample and the subsamples as well as those generated by uniform random choice. The fractions of price monotonicity violations (2-3%) are less than the fractions of wealth monotonicity violations (10-20%). A possible reason is that prices are varied within decision forms but total budgets are varied across decision forms, and hence it is easier to make comparisons across prices than across total budgets. Nevertheless, those fractions and the magnitudes of violations are much lower than those generated by uniform random choice, suggesting that our dataset and the subsample of subjects who make interior choices are highly informative.
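A wealth monotonicity check of this kind can be sketched as below. The definition used here (at a given interest rate, neither the sooner nor the later payment should fall when the total budget rises) and the measure of magnitude (the largest drop in euros) are simplifying assumptions in the spirit of Chakraborty et al. (2017); their exact definitions may differ. The data layout and function name are hypothetical.

```python
from itertools import combinations


def wealth_violations(choices):
    """Count budget pairs and list the magnitudes of wealth monotonicity violations.

    choices maps (rate_index, budget) to (sooner_euros, later_euros).
    """
    rates = sorted({rate for rate, _ in choices})
    budgets = sorted({budget for _, budget in choices})
    n_pairs = 0
    magnitudes = []
    for rate in rates:
        for small, large in combinations(budgets, 2):
            n_pairs += 1
            sooner_s, later_s = choices[(rate, small)]
            sooner_l, later_l = choices[(rate, large)]
            # A violation occurs if either payment falls as the budget rises.
            drop = max(sooner_s - sooner_l, later_s - later_l, 0.0)
            if drop > 0.0:
                magnitudes.append(drop)
    return n_pairs, magnitudes


# Illustrative subject who always splits the budget equally.
budgets = [20, 40, 60, 80, 160]
choices = {(r, b): (b / 2.0, b / 2.0) for r in range(7) for b in budgets}
print(wealth_violations(choices))

# The same subject with one corner choice at the largest budget.
choices[(0, 160)] = (0.0, 160.0)
n_pairs, magnitudes = wealth_violations(choices)
print(n_pairs, len(magnitudes), sum(magnitudes) / len(magnitudes))
```

With seven interest rates and five budgets, each subject contributes 7 × 10 = 70 pairs of tasks that differ only in the total budget, which is the count given in the text.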

Table 9
Rationality of subjects compared to uniform random choice. The first three rows present fractions of price monotonicity violations and wealth monotonicity violations as well as average magnitudes of violations in terms of euros in the full sample, the subsample of subjects who make at least one interior choice, and the subsample of subjects who violate wealth monotonicity at least once. The last row presents the means and the standard deviations of the same four indices generated by uniform random choice. The means and standard errors are calculated by simulating 10,000 times.