Decision making is inseparable from daily life. Animals, including humans, frequently make choices between alternative options, both consciously and unknowingly. With respect to consequences, these choices are usually considered from normative and descriptive perspectives in many disciplines, including economics, social sciences, and psychology. The rationality of choices is addressed through normative aspects. In contrast, descriptive analysis investigates decisions as preferences, regardless of whether they are logical, beneficial, or harmful (Burns & Bechara, 2007; Kahneman & Tversky, 1984).

In many circumstances, the outcomes of the available options are subject to uncertainties such as delay and/or risk, which decrease the value of the choices in comparison to choices with immediate outcomes. Within a behavioral economic framework, this devaluation of reward is referred to as discounting (Ainslie, 1975). Decision-making theories, including expected utility theory (Von Neumann & Morgenstern, 1944), prospect theory (Kahneman & Tversky, 1979; Tversky & Kahneman, 1992), and reinforcement learning theories (Sutton & Barto, 1998), are based on the principal assumption that all dimensions of an option are integrated into a single measure called the subjective value of a choice, which is parameterized by the rate of discounting.

The first step in developing a mathematical model to describe behavior in an experimental setting of discounting is to evaluate different options on the basis of the available information, such as the type of a reward and the associated uncertainties. In a simple setting of temporal discounting, the rational candidate would be the exponential decay function, \( V= A{e}^{-{k}_o D} \), where the subjective value V of an outcome of amount A, delivered after a delay D, diminishes exponentially according to the discounting rate \( {k}_o \). Early studies in economics mostly employed this function to evaluate different options (Ainslie, 1975). Nevertheless, humans tend to deviate systematically from this value function (Doyle, 2013; Green, Fry, & Myerson, 1994; Madden & Johnson, 2010; Rachlin, Raineri, & Cross, 1991; Simpson & Vuchinich, 2000). Consequently, the most common value function in behavioral psychology seems to be Mazur’s model (Mazur, 1987),

$$ V=\frac{A}{1+{k}_o D}. $$
(1)

This equation states that the subjective value of a delayed reinforcer declines hyperbolically according to the discounting rate \( {k}_o>0 \). Moreover, with a transformation of the probability p to the odds against winning, \( \theta =\left(1- p\right)/ p \), the same hyperbolic discounting function has been used to describe the declining subjective values of probabilistic outcomes (Rachlin et al., 1991):

$$ V=\frac{A}{1+{k}_o\theta}. $$
(2)
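
As a minimal sketch, the value functions above can be written out directly. The function names and the example numbers (50 currency units, a rate of 0.05 per day) are assumptions made for illustration; they do not come from the study.

```python
import math


def exponential_value(amount, delay, k_o):
    """Exponential discounting: V = A * exp(-k_o * D)."""
    return amount * math.exp(-k_o * delay)


def hyperbolic_value(amount, delay, k_o):
    """Eq. 1: V = A / (1 + k_o * D)."""
    return amount / (1.0 + k_o * delay)


def odds_against(p):
    """Odds against winning, theta = (1 - p) / p, for 0 < p <= 1."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    return (1.0 - p) / p


def probabilistic_value(amount, p, k_o):
    """Eq. 2: V = A / (1 + k_o * theta)."""
    return amount / (1.0 + k_o * odds_against(p))


# Hypothetical example: 50 currency units, k_o = 0.05 per day.
# Prints the delay, the exponential value, and the hyperbolic value.
for delay in (0, 7, 30, 180):
    print(delay,
          round(exponential_value(50, delay, 0.05), 2),
          round(hyperbolic_value(50, delay, 0.05), 2))
```

At the same rate, the hyperbolic function falls more slowly than the exponential one at long delays, which is the qualitative difference between the two candidates.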

To describe choice behavior for individuals or groups of individuals, the discount rate \( {k}_o \) is usually inferred by assessing several indifference points at separate delays. Fitting Eq. 1 or 2 then gives the best estimate of the discounting rate. Normalized indifference points have also been used to measure discounting rates, employing not a value function but the area under the indifference curve (Myerson, Green, & Warusawitharana, 2001).
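
The area-under-the-curve measure can be illustrated with a short sketch. It normalizes delays by the longest delay and indifference points by the delayed amount, then sums trapezoids. Starting the curve at the point (0, 1) is an assumption of this sketch, and the function name is hypothetical.

```python
def area_under_curve(delays, indifference_points, delayed_amount):
    """Normalized area under the indifference curve, between 0 and 1.

    Delays are divided by the longest delay and indifference points by the
    delayed amount; the area is the sum of trapezoids between neighbours.
    Assumption of this sketch: the curve starts at (0, 1), i.e. an
    undelayed reward keeps its full value.
    """
    pairs = sorted(zip(delays, indifference_points))
    longest = pairs[-1][0]
    xs = [0.0] + [d / longest for d, _ in pairs]
    ys = [1.0] + [v / delayed_amount for _, v in pairs]
    return sum((x2 - x1) * (y1 + y2) / 2.0
               for x1, x2, y1, y2 in zip(xs, xs[1:], ys, ys[1:]))
```

Smaller areas indicate steeper discounting; an agent who never discounts yields an area of 1.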

Although a range of more complex mathematical models have been tried (Doyle, 2013), the simple hyperbolic function has been widely used to model experimental data for various incentives, real or hypothetical, as well as positive or negative outcomes (Baker, Johnson, & Bickel, 2003; Johnson & Bickel, 2002; Petry, 2003). Typically, economists assess discounting by simply asking participants directly for their indifference value (Loewenstein, 1988), whereas the most common method in neuroscience and psychology is the binary-choice method (Mazur, 1987), employing a titration procedure through which the indifference point is inferred from a series of choices. Such procedures have been implemented with either a set of fixed, predefined choices (Madden, Petry, Badger, & Bickel, 1997) or, more commonly, using adjustments of amount or delay based on the individual’s choices (Loewenstein, 1988; Madden et al., 1997; Rachlin et al., 1991). Adjusting-amount procedures have been used mostly in studies with human subjects, whereas animal research has used delay adjustments in addition. As compared to nonadaptive methods, adjustment procedures have been found to have no effect on the processes that underlie discounting (Green, Myerson, Shah, Estle, & Holt, 2007; Holt, Green, & Myerson, 2012). Because they are based on mapping a set of indifference points, these tasks require a large number of choice trials, which can be time-consuming, and thus limiting in certain applications (Smith & Hantula, 2008). Recently, a hierarchical Bayesian model was developed to assess the temporal discounting rate (Vincent, 2016), and a five-trial adjusting-delay task has been shown to quickly measure discount rates in humans (Koffarnus & Bickel, 2014). Nonetheless, the latter approach does not allow controlling for unsystematic or illogical data.

Research investigating aberrant decision-making patterns within several mental disorders, such as drug abuse, is growing constantly. In particular, delay discounting, as a proposed transdisease process, demands further investigation, including directing interventions toward changing individuals’ discounting rates (Bickel, Jarmolowicz, Mueller, Koffarnus, & Gatchalian, 2012; Koffarnus, Jarmolowicz, Mueller, & Bickel, 2013). Furthermore, due to the partial overlap but also the evident differences between delayed and probabilistic choice behavior, the advantage of similar experimental procedures for different facets of decision making has been emphasized (Green & Myerson, 2004). Increased delay-discounting rates are seen in chronic users of alcohol and other drugs (Bjork, Hommer, Grant, & Danube, 2004; Dom, D’haene, Hulstijn, & Sabbe, 2006; MacKillop et al., 2011; Mitchell, Fields, D’Esposito, & Boettiger, 2005; Petry, 2003), and higher rates of risk taking are seen in pathological gamblers (Madden, Petry, & Johnson, 2009). Impulsive and risky decision making has also been linked to such clinically relevant constructs as treatment outcomes (Blanco et al., 2009; Krishnan-Sarin et al., 2007; MacKillop & Kahler, 2009; Petry, 2012; Stanger et al., 2012). Therefore, it has become increasingly desirable to measure choice behavior in a variety of contexts with improved, precise, and consistent methods, which may further our understanding of the mechanisms underlying value-based decision making, and thus potentially advance clinical care for people suffering mental disorders related to such behaviors.

Individuals may behave differently when they make decisions based on values. More sensitive participants may change their preferences sharply on the basis of small differences, whereas others may be neutral to the same changes. This can also be interpreted as a degree of consistency: A consistent choice policy gives a higher probability of choosing the option with the higher value. It has been shown that different samples behave differently in terms of consistency, which has been estimated by using the softmax function (Eq. 3 below) in maximum likelihood models (Hare, Hakimi, & Rangel, 2014; Yechiam, Busemeyer, Stout, & Bechara, 2005), or by constructing receiver operating characteristic curves for the sets of choices by individual participants (Ripke et al., 2012).

In the present study, we developed a mathematical framework for an isochronous amount and delay/risk adjusting procedure based on a Bayesian estimation approach (Garvert, Moutoussis, Kurth-Nelson, Behrens, & Dolan, 2015). Employing simulations and a sample of healthy participants, we show the robust and efficient estimation of the parameters of interest. The algorithm is presented and discussed in detail for the case of delay discounting. The adaptation of the mathematical framework for assessing probability discounting and loss aversion is straightforward and is omitted for the sake of brevity. Taken together, we here present a novel adaptive approach to measure different facets of value-based decision making, including choice consistency.

Mathematical framework

In this section, we provide a detailed discussion of the mathematical modeling and parameter estimation algorithm for the delay discounting case. The main idea is to employ a Bayesian approach to improve our initial assumptions about the value of the discounting parameter trial-by-trial by using the choices that an agent—for example, a person—makes between a smaller immediate and a larger delayed monetary reward. We used the hyperbolic discounting function, Eq. 1, for evaluating the subjective value of the delayed offers.

In what follows, \( {x}_d \) represents the amount of the delayed offer associated with a delay of d units, and its subjective value is denoted \( {V}_d \). Similarly, \( {x}_i \) and \( {V}_i \) are the immediate offer and its subjective value, respectively. The subjective value of the immediate offer is trivially the value of the offer itself, that is, \( {V}_i={x}_i \). The offers vary between \( {r}_1 \) and \( {r}_2 \), measured in a currency unit, and the delays are chosen from the set \( D=\left\{{d}_1,{d}_2,\dots, {d}_7\right\} \), in days. Now suppose that \( {a}_i \) and \( {a}_d \) are the actions of choosing the immediate and delayed offers, respectively, and let \( Q\left({a}_i\right) \) and \( Q\left({a}_d\right) \) be the values of taking the corresponding actions (Sutton & Barto, 1998). Furthermore, we assume that the likelihood of choosing between the two offers follows a softmax probability function with an inverse temperature parameter, \( {\beta}_o>0 \), as

$$ P\left({a}_i\left|{k}_o,{\beta}_o\right.\right)=\frac{ \exp \left({\beta}_o Q\left({a}_i\right)\right)}{ \exp \left({\beta}_o Q\left({a}_i\right)\right)+ \exp \left({\beta}_o Q\left({a}_d\right)\right)}, $$
(3)

and hence, \( P\left({a}_d|{k}_o,{\beta}_o\right)=1- P\left({a}_i|{k}_o,{\beta}_o\right) \).

Large values of \( {\beta}_o \) in Eq. 3 represent consistent choices (that is, a high probability of taking the most valuable action), whereas small values reveal some inconsistency.
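
The effect of the inverse temperature can be seen in a small sketch of Eq. 3. Dividing numerator and denominator by the first exponential turns the softmax over two actions into a logistic function of the value difference, which is the form used here for numerical stability. The function name and the example values are assumptions for illustration.

```python
import math


def prob_choose_immediate(q_immediate, q_delayed, beta_o):
    """Eq. 3 as a logistic function of beta_o * (Q(a_i) - Q(a_d))."""
    z = beta_o * (q_immediate - q_delayed)
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)  # z < 0, so this cannot overflow
    return e / (1.0 + e)


# Hypothetical values: the immediate option is worth 2 units more.
for beta_o in (0.01, 0.5, 5.0):
    print(beta_o, round(prob_choose_immediate(22.0, 20.0, beta_o), 3))
```

With a small inverse temperature the choice is close to a coin flip, whereas a large one makes the higher-valued option almost certain.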

Both parameters, \( {k}_o \) and \( {\beta}_o \), are positive. Therefore we expected them to be positively skewed, that is, approximately lognormal (Lovric, 2010). To take advantage of close-to-normal distributions, we transformed the parameters to the natural-logarithmic scale and defined \( k= \ln \left({k}_o\right) \) and \( \beta = \ln \left({\beta}_o\right) \). For estimation purposes, we discretized the parameter space over an equally spaced 2-D region, R, such that –8 ≤ k ≤ 2 and –5 ≤ β ≤ 5. For simplicity, we assumed that the two parameters are independent and imposed liberal univariate priors on the parameters, such that k and β had a Beta and a uniform distribution, respectively. Under the independence assumption, one can build a joint probability distribution P(k, β) serving as a prior for the following Bayesian framework.

Given the prior distribution, the immediate and delayed offers were presented to the agent. After observing the choice at the first trial, we updated the prior using Bayes’s rule,

$$ P\;\left( k,\beta \left| a\right.\right)=\frac{1}{Z} P\left( a\left| k,\beta \right.\right) P\left( k,\beta \right), $$
(4)

where P(k, β) is the joint distribution over the parameters and P(a | k, β) is the likelihood of observing the action \( a\in \left\{{a}_i,{a}_d\right\} \), which is computed using Eq. 3. At every trial t, the posterior distribution over the parameters, P(k, β | a), was obtained by multiplying the prior by the likelihood of the agent’s action, and it then served as the prior for the upcoming trial. Note that 1/Z, in Eq. 4, is a simple normalization factor over the discrete domain R. At the end of each trial, the expected values of k and β were considered the current parameter estimations, \( {\widehat{k}}_t \) and \( {\widehat{\beta}}_t \). Using the current estimates based on the previous choice, we presented the following offers close to the indifference point, where the choices were equally likely, in order to retrieve the most informative data (Lewi, Butera, & Paninski, 2008; Sebastiani & Wynn, 2000).
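
The update in Eq. 4 can be sketched on a discrete grid as follows. This is an illustration under stated assumptions rather than the authors' implementation: the grid resolution (41 points per axis), the flat joint prior, and all function names are choices made for this sketch, and the parameters are stored on the logarithmic scale and exponentiated before use.

```python
import math


def linspace(lo, hi, n):
    """n equally spaced points from lo to hi inclusive."""
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]


K_GRID = linspace(-8.0, 2.0, 41)   # k = ln(k_o); grid size is an assumption
B_GRID = linspace(-5.0, 5.0, 41)   # beta = ln(beta_o)


def flat_prior():
    """Uniform joint prior over the grid (a simplification of this sketch)."""
    cell = 1.0 / (len(K_GRID) * len(B_GRID))
    return [[cell for _ in B_GRID] for _ in K_GRID]


def p_immediate(x_i, x_d, delay, k, beta):
    """Eq. 3 with Q(a_i) = x_i and Q(a_d) = x_d / (1 + exp(k) * delay)."""
    z = math.exp(beta) * (x_i - x_d / (1.0 + math.exp(k) * delay))
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def bayes_update(prior, x_i, x_d, delay, chose_immediate):
    """Eq. 4: posterior = likelihood * prior, normalized by Z over the grid."""
    post = []
    for k, prior_row in zip(K_GRID, prior):
        row = []
        for beta, p_prior in zip(B_GRID, prior_row):
            p = p_immediate(x_i, x_d, delay, k, beta)
            row.append((p if chose_immediate else 1.0 - p) * p_prior)
        post.append(row)
    z = sum(sum(row) for row in post)
    return [[v / z for v in row] for row in post]


def posterior_means(post):
    """Expected values of k and beta, used as the current estimates."""
    k_hat = sum(k * sum(row) for k, row in zip(K_GRID, post))
    b_hat = sum(b * v for row in post for b, v in zip(B_GRID, row))
    return k_hat, b_hat


# Hypothetical trial: 20 now versus 50 in 30 days; the agent waits.
posterior = bayes_update(flat_prior(), 20.0, 50.0, 30.0, chose_immediate=False)
k_hat, beta_hat = posterior_means(posterior)
```

Calling `bayes_update` once per trial, with the returned posterior passed in as the next prior, reproduces the sequential scheme described above. In the example, choosing to wait moves the estimate of k below the prior mean of –3.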

At each trial, we therefore intended to provide two offers with the same subjective values—that is, \( {x}_i=\frac{x_d}{1+{\widehat{k}}_t d} \), where these values lay in the offer range. In other words, the condition

$$ {r}_1\le \frac{r_2}{1+{\widehat{k}}_t{d}_i}\le {r}_2-\delta, \kern1em i=1,\cdots, m, $$

should hold for all feasible delays, where δ is the minimum difference between the two offers. Then a feasible delay was chosen randomly, and the next offers were provided such that they differed at least by δ and had the same subjective values. In extreme cases in which the agent was too patient (or impulsive), even the minimal possible fractional increment was not feasible for the longest (or shortest) delay—that is, \( \left({r}_2-\delta \right)\left(1+{\widehat{k}}_t{d}_m\right)<{r}_2\;\mathrm{or}\kern0.24em {r}_2<{r}_1\left(1+{\widehat{k}}_t{d}_1\right) \). In these cases, amounts very close to the boundary values of the offer range and either the highest or the lowest possible delay were considered as the next options. In all cases, we chose randomly from the set of feasible delays, if any, and adjusted for the immediate and delayed amounts in such a way that the subjective values of both offers were approximately the same according to the current estimate, \( {\widehat{k}}_t \). The procedure continued for a certain number of trials, N, and \( {\widehat{k}}_N \) was considered the estimated parameter. The risk with adaptive designs is that individuals will reverse-engineer the design. Therefore, to reduce the chance of an agent learning the pattern, we added random offers that were distributed throughout all trials.
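
One way to turn the offer-selection rule into code is sketched below. The text does not state how the delayed amount is picked among the admissible values, so drawing it uniformly is an assumption of this sketch, as are the exact fallback offers for the extreme cases, the function name, and the value δ = 1 in the example. The estimate is taken on the logarithmic scale and exponentiated before it enters the hyperbolic function.

```python
import math
import random


def next_offers(k_hat, delays, r1, r2, delta, rng=random):
    """Return (x_i, x_d, delay) with equal subjective values under k_hat."""
    k_o = math.exp(k_hat)  # k_hat is the log-scale estimate
    # Feasibility condition from the text, checked for every delay.
    feasible = [d for d in delays
                if r1 <= r2 / (1.0 + k_o * d) <= r2 - delta]
    if feasible:
        d = rng.choice(feasible)
        growth = 1.0 + k_o * d
        # x_d must keep x_i >= r1 and x_d - x_i >= delta, and stay <= r2.
        lowest_x_d = max(r1 * growth, delta * growth / (k_o * d))
        x_d = rng.uniform(lowest_x_d, r2)
        return x_d / growth, x_d, d
    longest, shortest = max(delays), min(delays)
    if r2 / (1.0 + k_o * longest) > r2 - delta:
        # Very patient agent: smallest allowed gap at the longest delay.
        return r2 - delta, r2, longest
    # Very impulsive agent: widest gap at the shortest delay.
    return r1, r2, shortest


# Hypothetical call with the battery's delays and offer range, delta = 1.
battery_delays = [3, 7, 14, 31, 61, 180, 365]
x_i, x_d, d = next_offers(math.log(0.05), battery_delays, 3.0, 50.0, 1.0,
                          rng=random.Random(0))
```

In the feasible case the two amounts satisfy x_i = x_d / (1 + k_o d) exactly, so the agent is predicted to be indifferent under the current estimate.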

The same framework is valid for the concepts of probability discounting and loss aversion. We used Eq. 2, analogous to Eq. 1 for delay discounting, to evaluate the probability discounting of gains and losses in corresponding tasks. Finally,

$$ V=\frac{1}{2}\left( G-\lambda L\right) $$
(5)

was used to evaluate mixed prospects and to estimate a behavioral measure of loss aversion, λ. Equation 5 has a simple linear form in which loss aversion is the ratio of the contribution of the loss magnitude, L, to the contribution of the gain magnitude, G, to the participant’s decisions (Frydman, Camerer, Bossaerts, & Rangel, 2011; Tom, Fox, Trepel, & Poldrack, 2007).
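
A brief sketch of Eq. 5 follows. The factor 1/2 is read here as a gamble with equal chances of gain and loss, which is an interpretation rather than something stated in the text; the function names and amounts are hypothetical.

```python
def mixed_gamble_value(gain, loss, lam):
    """Eq. 5: V = (G - lambda * L) / 2, with L given as a positive magnitude."""
    return 0.5 * (gain - lam * loss)


def indifference_gain(loss, lam):
    """Gain at which the gamble is worth exactly zero, G = lambda * L."""
    return lam * loss


# Hypothetical agent with lambda = 2 facing a possible loss of 10.
print(mixed_gamble_value(30.0, 10.0, 2.0), indifference_gain(10.0, 2.0))
```

An agent with λ > 1 needs a gain larger than the possible loss before the gamble has positive value.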

Simulations

Once the mathematical model was built, we produced a large number of simulated data sets to examine the estimation procedure in the following situations. First, we assumed that the choices were made with the prescribed parameters of the model. As an example, Fig. 1 shows how the prior distributions changed over the course of the estimation procedure for a decision-maker behaving according to k = –2 and β = 1.5. We depict five starting, middle, and final trials, which, based on the data from Table 1, show improving estimations across trials.

Fig. 1

Simulation, on a logarithmic scale. From left to right, the pictures show the five starting, middle, and final trials for a simulation with k = –2 and β = 1.5. The final distribution is narrower over k than over β

Table 1 First and last five trials of a simulation with k = –2 and β = 1.5

For the case of consistent behavior, we performed simulations for every k = –5.5, –5,…, –0.5, with a fixed value of β = 1.5. The estimated parameters are shown in Fig. 2a for k = –4, –3, –1 on a trial-by-trial basis. Figure S1 shows boxplots for all values of k. The values of k were reproduced well, and the accuracy of the estimations is supported by the small errors, represented by the low standard deviations. However, the estimations of β resulted in higher standard deviations, which is an indication of a suboptimal estimation (see Fig. 2b). Then, simulations for a fixed discounting rate, k = –3, and different β = –4.5, –3.5,…, 4.5 showed that k was estimated more accurately with higher values of β (Fig. 2c, Fig. S2). We demonstrate the trial-by-trial changes across the estimation procedures to emphasize the convergence of the algorithm.

Fig. 2

Simulation, on a logarithmic scale. The blue lines in each panel, sampled from 2,000 simulations, show the trial-by-trial values of the estimation procedure for the first (left) and last (right) trials, and red lines are the prescribed values. For consistent behavior, the stability of the results is visually apparent and is supported by low variances. (a) Different values of k are precisely estimated over 50 trials, assuming consistent choice behavior with β = 1.5. (b) Estimates of β now have bigger variances. (c) Higher values of β result in better estimates of k

To investigate the reliability of the procedure, we assumed a sample of normally distributed parameters such that k ~ N(–3, 1) and β ~ N(0.5, 2). Data sets of 50 trials were simulated for random pairs from these distributions, and the resulting data are shown in Fig. 3a. The correlations between the initial (true scores) and estimated parameters were .98 for k and .95 for β, which are depicted in Fig. 3b.

Fig. 3

Simulation, on a logarithmic scale. (a) Distribution of the true and estimated values of k and β. The initial parameters were chosen from normal distributions as k ~ N(–3, 1) and β ~ N(0.5, 2). We found no significant differences between the true and estimated values, as shown by the Kolmogorov–Smirnov test. (b) Correlations between the true and estimated values are used as a measure of reliability, which here are .98 and .95 for k and β, respectively

Simulations with appropriate settings of priors and initial offers were also performed for probability discounting and mixed-gambles models (see Fig. 4). For the probability discounting rates, we restricted the domain on the logarithmic scale to –3 ≤ k ≤ 3, given the fact that a \( {k}_o \) of 1 (that is, k = 0) is considered a baseline at which the subjective value corresponds to the expected value, and any value of k different from zero is considered to be an indicator of risk-seeking or risk-averse behavior. The loss aversion parameter was set to 0 ≤ λ ≤ 4 without a transformation to the logarithmic scale. The reason was that any value of λ beyond this range was infeasible with our offer range, and any transformation could result in a change of direction of any potential skewness. The consistency parameter had the same range as before, –5 ≤ β ≤ 5. Predefined values of the discounting and loss aversion parameters were reliably recovered. As before, the consistency parameter was reproduced with less precision.

Fig. 4

Transfer of the mathematical framework to the concepts of probability discounting and loss aversion, with k and β on a logarithmic scale. (a–c) Simulations of parameter estimates for probability discounting for gains, probability discounting for losses, and mixed gambles, respectively, based on different values of k and λ but assuming consistent behavior, β = 1.5. Every picture represents one simulation, and ten out of 2,000 runs are depicted. The dashed red lines are the original values of the parameters

Comparison to standard methods

To compare our approach to standard methods, we chose a value-adjusting method as a representative of a class of titration procedures that adjust the amount by half of the difference between the delayed and immediate offers depending on the previous choice (Ripke et al., 2012). Assuming consistent choices, after a certain number of trials this method will locate the indifference point for a given delay. Finally, the hyperbolic function was fit to indifference points of several delays to estimate the discounting rate.
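
For orientation, a generic halving-step titration can be sketched as follows. It is a common textbook variant and not necessarily the exact rule of the cited procedure: the starting offer, the step schedule, the function names, and the deterministic agent in the example are all assumptions of this sketch.

```python
def titrate_immediate_amount(prefers_immediate, x_d, n_trials=10):
    """Approximate the indifference point for one delay by halving steps.

    prefers_immediate(x_i) returns True when the agent takes the immediate
    amount x_i over the delayed amount x_d.
    """
    x_i = x_d / 2.0   # first immediate offer: half the delayed amount
    step = x_d / 4.0
    for _ in range(n_trials):
        if prefers_immediate(x_i):
            x_i -= step   # immediate offer was attractive enough: lower it
        else:
            x_i += step   # delayed offer was preferred: raise it
        step /= 2.0
    return x_i


# Hypothetical deterministic agent with k_o = 0.05 per day, 50 units in 30 days.
true_point = 50.0 / (1.0 + 0.05 * 30.0)
estimate = titrate_immediate_amount(lambda x_i: x_i > true_point, 50.0)
```

Each delay needs its own run of this loop, which is why mapping several indifference points requires many trials in total.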

To make the comparison more reliable in terms of controlling for accuracy, the offer ranges, and similar initial assumptions, we simplified the question of estimating the discounting rate to finding the indifference points for certain discounting rates, immediate amounts, and delays. In other words, given a discounting rate of k, an immediate amount of \( {x}_i \), and a delay of d days, we aimed to locate the indifference point, that is, to find the delayed amount \( {x}_d \) such that \( {x}_d={x}_i\left(1+ kd\right) \). We further assumed that the choices were made using Eq. 3 with fixed values of β.

We conducted the comparison in three different settings with increasing complexity. (1) For a single immediate amount \( {x}_i=10 \), single delay d = 30, and consistent choices with β = 1.5, we looked at the number of iterations each algorithm needed to locate the indifference point for k ∊ {–6, –5.9, ⋅ ⋅ ⋅, 1} (see Fig. 5a). Although the performance of both methods overlaps for small values of k, for stronger discounting one needs to provide larger delayed values, which makes the amount-adjusting method inefficient. (2) The number of iterations was observed for multiple immediate amounts \( {x}_i \) ∊ {5, 10, 15, 20}, multiple delays d ∊ {10, 30, 60, 120, 180}, and consistent choices with β = 1.5 for k ∊ {–6, –5.9, ⋅ ⋅ ⋅, 1}. Our approach exhibited more stable performance in this case, too (Fig. 5b). (3) Starting with an immediate amount of \( {x}_i=10 \) for multiple delays d ∊ {10, 30, 60, 120, 180}, and allowing 50 trials in total, we applied both methods to reproduce discounting rates of k ∊ {–6, –5.9, ⋅ ⋅ ⋅, 1}. For every k, we conducted 1,000 simulations with β ~ N(0, 2) (Fig. 5c). The median estimated k using our approach (Fig. 5c, top) closely tracked the true values (black line), and the 5th and 95th percentiles (dotted lines) were evenly distributed for different k values. In contrast, the amount-adjusting procedure performed somewhat less well in terms of the medians and skewed residuals (Fig. 5c, bottom).

Fig. 5

Comparison of our approach to the classical amount-adjusting method, on a logarithmic scale. (a) Number of iterations to find the indifference point with an immediate amount of 10 and a delay of 30, for k varying from –6 to 1. The solid and dashed lines show the median and 75th percentile of the numbers of iterations for our algorithm (blue) and the amount-adjusting method (red). For smaller ks both methods showed similar performance, whereas for bigger values of k our algorithm exhibits better results in terms of reaching the indifference point faster. (b) The difference between the two methods is more pronounced when we add more immediate and delay values. (c) Simulations of the two methods for the same range of k as in panels a and b. The estimated values are shown on the vertical axes. Black solid lines are ideal scenarios in which the estimated and initial values coincide. Colored solid and dotted lines depict the median and the 5th and 95th percentiles, which show higher precision for our approach

For the first two cases, we applied each algorithm 10,000 times for every k value and random combinations of \( {x}_i \) and d. We terminated the algorithm either upon reaching the indifference point, up to an absolute error of .05, or after 200 trials if the results did not converge. The delayed offer at the first trial for both methods was set to twice the amount of the true indifference point, \( 2{x}_i\left(1+ kd\right) \). The algorithms had no restriction on the amounts they could offer. To avoid any bias, we used a flat prior on k for our algorithm.

We also compared the two methods using empirical data that were collected using a delay discounting task from an early implementation of the algorithm, with slightly different settings of priors, parameter domains, and offer range. We invited a total of 88 participants from a follow-up study to Ripke et al. (2012) to perform the adaptive task in addition to the main task of the study. The standard task was a 50-trial amount-adjusting task designed to assess indifference points for delays of 10, 30, 60, 120, and 180 days. The temporal discounting rates for the standard method were computed by fitting the hyperbolic value function, Eq. 1, to these indifference points. The resulting estimates of the delay discounting rates from the two methods were highly correlated, r = .66, p < .001, as is shown in Fig. 6a.

Fig. 6

Comparison to an amount-adjusting method. (a) Participants (n = 88) performed a classical amount-adjusting task and an early implementation of our algorithm for temporal discounting. The two methods resulted in discounting rates with a significant correlation of r = .66. (b) A Bayesian estimation for the data from the amount-adjusting task is highly correlated with the estimates for the curve-fitting method

Furthermore, we applied our algorithm—that is, the sequential Bayesian update—to the data from the standard method. The data comprised a total of 50 choices between delayed offers and a fixed immediate offer. The iterative procedure ran through all trials and updated the estimations on the basis of the trial-by-trial likelihood of the choices. The resulting discounting rates were correlated to the results from the standard method, r = .99, p < .001, as is given in Fig. 6b, which shows that our approach gave almost the same results as fitting to the indifference points. This introduces a new and computationally easy way of estimating discounting rates in data from standard methods, and it could also serve as a proof of validity for our approach: Assuming that the classical method measures a construct, our algorithm does so to the same degree.

The battery

The adaptive procedure for binary choice presentation was employed to develop a task battery for the measurement of different facets of impulsive and risky decision making, with four independent tasks: delay discounting (DD), probability discounting for gains (PDG), probability discounting for losses (PDL), and mixed gambles (MG) (see Fig. 7).

Fig. 7

Schematic overview of the tasks included in the value-based decision-making battery

During each task, participants are supposed to choose one of the two offers presented simultaneously for 5 s on a computer screen. The time limit was set in light of the average response times from previous assessments. For each trial, the participant’s choice is highlighted with a frame before presenting the next offer. Presenting the outcomes of gambles during the experiment and the time interval of each trial, as well as the number of trials, are all optional and can be set initially. In general, participants are informed by instructions before each task that at the end of the experiment one trial per task will be selected randomly from among their choices and credited toward their compensation. However, these instructions are integrated into the battery and can be modified on the basis of alternative task designs. The temporal delays in DD are set to 3, 7, 14, 31, 61, 180, and 365 days. For PDG and PDL, gambles are played with five possible probability values: 2/3, 1/2, 1/3, 1/4, and 1/5. The task length for DD, PDL, and PDG is 50 trials, and monetary gains/losses range from €3 to €50. For MG, 50 trials are performed, presenting amounts of €1–€40 for gains and €5–€20 for losses. The number of trials, 50, was chosen according to data acquired by previous implementations of the algorithm, so as to end up with stable estimates. At the beginning of the MG task, participants receive €10 as “house money.” During all tasks, offers are randomly assigned to presentation on the left or the right of the screen.

The experiments, including instructions, binary choices, and outcomes, were initially implemented using MATLAB, Release 2010a (The MathWorks, Inc., Natick, MA) and Psychtoolbox 3.0.10, based on the Psychophysics Toolbox extensions (Brainard, 1997; Pelli, 1997), and are now available under GNU Octave. The initial settings of the tasks, including reward types and ranges, temporal delays, probabilities for gains and losses, and gambling, together with the instructions and payment schemes, are easily accessible through the source code. This enables an end user to modify the layout and initial settings on the basis of different hypotheses and requirements.

Piloting

We piloted the battery on a sample of 26 (15 female, 11 male) healthy adults with a mean age of 26.2 years (SD = 7.8). The participants were recruited through flyers distributed around the university campus and neighborhood, and they were paid a fixed amount of money for compensation. Participants completed the battery within 19 min, on average, of which the estimation procedure required 13 min (i.e., 6 min for the instructions). The estimated parameters are shown in Fig. 8a as boxplots, and the summary statistics for the sample are presented in Table 2. To test the convergence of the estimation procedure, we calculated the absolute difference between the estimated value of each parameter on any trial, \( {\widehat{k}}_t\left({\widehat{\beta}}_t\right) \), and the final estimation, \( \widehat{k}\left(\widehat{\beta}\right) \). The median of this error term was then depicted for all participants, together with the 75th percentile and maximum values. Figure 9 shows a decreasing pattern that can roughly be interpreted as the convergence of the procedure. Regarding the distributions of the parameters, the Kolmogorov–Smirnov test rejected the normality null hypothesis for DD, PDG, and MG at α = .05 in our sample. Therefore, we used Spearman’s rank correlation coefficient to look for any associations between the parameters, which is summarized in Table 3.

Fig. 8

Parameters from pilot data, with k and β on a logarithmic scale. (a) Boxplots of temporal and probability discounting rates, loss aversion, and all consistency parameters. (b) Correlation between the probability discounting rates for gains and losses

Table 2 Participant sample description, with k and β on a logarithmic scale

Fig. 9

Convergence of parameter estimations in the pilot data. The medians of the absolute differences between the estimation at each trial and the final estimation for all participants are shown trial by trial by the black solid lines. The dash–dotted and dotted lines depict the 75th percentiles and maximum values, respectively. The decreasing pattern in black lines is a sign of convergence, though the maximum values show that for a few participants the convergence was poor. The top row depicts the absolute differences for discounting rates and loss aversion; the bottom row does so for the consistency parameter

Table 3 Correlations between parameters

Previous studies have shown a negative correlation between the probability discounting rates for gains and losses (Shead & Hodgins, 2009; Takahashi, Takagishi, Nishinaka, Makino, & Fukui, 2014). Nevertheless, we observed no correlation, Spearman’s ρ = –.0003, though a linear trend was present, as is shown in Fig. 8b, and a post-hoc analysis revealed a power of .35 for our sample size to detect a correlation of –.25 with α = .05. Furthermore, the probability discounting rates for gains and losses in our sample did not show any significant differences in terms of means and medians. For mixed gambles, the resulting distribution was in line with the literature, although we endowed the participants with money in advance and used different framings (including zero values). Such differences in the presentation and/or context of instruction have been shown to affect choice behavior (Ert & Erev, 2008; Kahneman & Tversky, 1979; Silberberg, Murray, Christensen, & Asano, 1988; Thaler & Johnson, 1990). Moreover, we observed significant associations between probability discounting for gains and both loss aversion and temporal discounting, ρ = –.41, p < .05, and ρ = .47, p < .05, respectively. We also observed a nonsignificant association between loss aversion and probability discounting for losses, ρ = –.37, p = .061.

We next examined the reproducibility of β in the low value range using simulations. Specifically, we conducted simulations for the delay discounting case using samples of the initial parameters distributed according to Table 2 [1,000 simulations with k ~ N(–5.24, 2) and β ~ N(–1.35, 1.32)]. The correlations between the initial and estimated parameters were significant, at .92 for k and .86 for β. Hence, in simulations the reproducibility appeared acceptable, even at the lower end of the parameter range.

Discussion

In this study, we developed and implemented a novel adaptive algorithm to measure different metrics related to impulsive and risky decision making, such as the temporal and probability discounting rates and loss aversion.

The main advantage of this approach is its isochronous adaptive nature, which aims at providing the most informative offers at each trial on the basis of the choices that have been made earlier. Theoretically, this should allow for a very efficient inference of behavioral parameters. Accordingly, we showed that the model parameters converge within the first few trials, both in simulations and in the course of the tasks with real participants. This provides stable estimates and can differentiate between participants, assuming that the hyperbolic and linear value functions, Eqs. 1, 2, and 5, capture the behavior to an acceptable degree. The temporal discounting rates obtained with computerized adjusting-amount procedures (Reynolds, Richards, Horn, & Karraker, 2004; Richards, Zhang, Mitchell, & Wit, 1999; Ripke et al., 2012) or the Monetary Choice Questionnaire (Kirby, Petry, & Bickel, 1999; Koff & Lucas, 2011) vary in terms of their summary statistics between different studies reported in the literature, which makes the absolute values difficult to compare across studies. Nevertheless, we showed a correlation of r = .66 between our approach and a standard amount-adjusting procedure for the same participants, which seems to be acceptable in terms of test–retest reliability, considering that the two tasks had different settings of amounts and delays (Beck & Triplett, 2009; Craig, Maxfield, Stein, Renda, & Madden, 2014; Kirby, 2009).
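The general idea of updating an estimate after each choice and placing the next offer accordingly can be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation used in this study: the posterior is held on a hypothetical grid over log(k), β is fixed rather than estimated, the choice rule is assumed to be logistic, and the next immediate amount is simply set to the indifference point implied by the posterior mean.

```python
import math


def p_delayed(a_now, a_later, delay, log_k, beta):
    """Assumed logistic probability of choosing the delayed option."""
    x = beta * (a_later / (1.0 + math.exp(log_k) * delay) - a_now)
    x = max(-700.0, min(700.0, x))  # keep math.exp within float range
    return 1.0 / (1.0 + math.exp(-x))


def update_posterior(grid, posterior, offer, chose_delayed, beta):
    """Bayes rule on a grid: multiply by the choice likelihood, then renormalize."""
    a_now, a_later, delay = offer
    weights = []
    for log_k, prior_w in zip(grid, posterior):
        p = p_delayed(a_now, a_later, delay, log_k, beta)
        weights.append(prior_w * (p if chose_delayed else 1.0 - p))
    total = sum(weights)
    return [w / total for w in weights]


def posterior_mean(grid, posterior):
    """Point estimate of log(k) as the posterior mean over the grid."""
    return sum(g * w for g, w in zip(grid, posterior))


def next_immediate_amount(a_later, delay, log_k_hat):
    """Immediate amount at the indifference point implied by the current estimate."""
    return a_later / (1.0 + math.exp(log_k_hat) * delay)


# Hypothetical setup: uniform prior over log(k) in [-10, 0], 20 units after 30 days.
grid = [-10.0 + 0.25 * i for i in range(41)]
prior = [1.0 / len(grid)] * len(grid)
mean_before = posterior_mean(grid, prior)

offer = (next_immediate_amount(20.0, 30.0, mean_before), 20.0, 30.0)
posterior = update_posterior(grid, prior, offer, chose_delayed=True, beta=0.5)
mean_after = posterior_mean(grid, posterior)

# A delayed choice shifts the estimate toward lower discounting,
# so the next immediate offer is larger than the previous one.
next_offer = next_immediate_amount(20.0, 30.0, mean_after)
```

In this sketch a delayed choice lowers the estimated log(k) and therefore raises the next immediate amount, which is one simple way to place offers close to the current indifference point.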

Moreover, we showed through simulations that our approach outperforms a standard amount-adjusting method, specifically for the higher values of delay discounting rates. Recent advances (Koffarnus & Bickel, 2014; Yoon & Chapman, 2016) have introduced methods to estimate discounting parameters with very few trials (five and ten, respectively, for Koffarnus & Bickel and Yoon & Chapman). Both methods are theoretically variants of titration procedures. Such extremely brief tasks are advantageous because they save time, but on theoretical grounds we expect that this might come at the cost of precision. Future research might therefore compare these very short task variants with our algorithm by utilizing simulations and participant samples.

The algorithm was also used to reestimate the discounting rates for data that were acquired by nonadaptive methods. The initial estimates and reestimates were nearly identical (r = .99). Furthermore, the medians of the log-transformed probability discounting rates in our pilot data were very close to zero, which is theoretically consistent and corresponds to the evaluation of the risky option by its expected value. For loss aversion, the medians were comparable to what has been reported in the literature (Tom et al., 2007).
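The link between a log-transformed rate of zero and expected-value evaluation can be checked with a short example. It assumes that the probability discounting function has the common hyperbolic form in which the amount is divided by one plus the rate times the odds against winning; the function name and the numbers are hypothetical.

```python
def probability_discounted_value(amount, p_win, k):
    """Hyperbolic probability discounting (assumed form) with odds against winning."""
    odds_against = (1.0 - p_win) / p_win
    return amount / (1.0 + k * odds_against)


# log(k) = 0 means k = 1, for which the value reduces to amount * p_win.
value = probability_discounted_value(100.0, 0.25, k=1.0)
expected_value = 0.25 * 100.0

# With k > 1 the risky option is valued below its expected value.
risk_averse_value = probability_discounted_value(100.0, 0.25, k=2.0)
```

For k = 1 the denominator becomes 1 + (1 − p)/p = 1/p, so the subjective value equals the amount multiplied by the probability of winning.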

The described algorithm was used to implement four independent measures of value-based decision making as an experimental package. This shows that the Bayesian framework can easily be adapted to assess a range of behavioral concepts and suggests that future work can be directed toward the development of additional tasks. For our test battery, we provide a graphical user interface that gives access to the model and runtime variables, as well as task-specific settings such that one can set the values for every single experiment. This and the efficient inference of behavior enhance the flexibility to cover different ranges and commodities of outcomes, even when time is a limiting factor. Most importantly, behavioral estimations are immediately available along with other data such as response times, rendering post-hoc parameter inference unnecessary.

The adaptiveness of the estimation procedure and the instruction disclosing that the outcome of a randomly picked trial will be credited as compensation make our approach vulnerable to reverse-engineering. For example, in DD after observing a delayed choice, the algorithm increases the relative amount of the immediate option for the next offer. Upon learning this pattern, some participants will deliberately pick the delayed option until the immediate offers reach a maximum. To reduce this effect, we distributed random offers in the course of the tasks, to make the patterns less obvious. This, however, decreases the power of the algorithm because of added noise and sampling from less informative data. On the other hand, in exceptional cases of extreme behavior (for example, if a participant tends to take just one type of offer, such as the immediate ones in delay discounting), the procedure runs normally, but these cases could be treated in a different way by allowing for adaptive offer ranges. However, these instances are easily detectable by their value and choice behavior and can be considered for treatment as outliers.

In summary, we developed a new approach for adaptive offer presentation in binary choice settings to estimate discounting parameters. We showed that this approach is quick and reliable and that it outperforms the most widely used classical method. Furthermore, it can be easily transferred to other concepts of decision making. Our findings support construct validity under the mathematical framework, and we conclude that the proposed Bayesian approach is a functional alternative to those existing in the literature. Thus, our work might advance the evaluation of decision-making processes in single studies or in research consortia that need to collect high numbers of datasets in a flexible and efficient way. Nevertheless, future studies will be required to improve the estimation precision of the consistency parameter, β, and to compare the approach with more recent methods and with more complex decision models that have more parameters.