Value-based decision-making battery: A Bayesian adaptive approach to assess impulsive and risky behavior

Pooseh, Shakoor; Bernhardt, Nadine; Guevara, Alvaro; Huys, Quentin J. M.; Smolka, Michael N.

doi:10.3758/s13428-017-0866-x

Published: 13 March 2017

Volume 50, pages 236–249, (2018)

Behavior Research Methods


Abstract

Using simple mathematical models of choice behavior, we present a Bayesian adaptive algorithm to assess measures of impulsive and risky decision making. Practically, these measures are characterized by discounting rates and are used to classify individuals or population groups, to distinguish unhealthy behavior, and to predict developmental courses. However, a constant demand for improved tools to assess these constructs remains unanswered. The algorithm is based on trial-by-trial observations. At each step, a choice is made between immediate (certain) and delayed (risky) options. Then the current parameter estimates are updated by the likelihood of observing the choice, and the next offers are provided from the indifference point, so that they will acquire the most informative data based on the current parameter estimates. The procedure continues for a certain number of trials in order to reach a stable estimation. The algorithm is discussed in detail for the delay discounting case, and results from decision making under risk for gains, losses, and mixed prospects are also provided. Simulated experiments using prescribed parameter values were performed to justify the algorithm in terms of the reproducibility of its parameters for individual assessments, and to test the reliability of the estimation procedure in a group-level analysis. The algorithm was implemented as an experimental battery to measure temporal and probability discounting rates together with loss aversion, and was tested on a healthy participant sample.

Decision making is inseparable from daily life. Animals, including humans, frequently make choices between alternative options, both consciously and unknowingly. With respect to consequences, these choices are usually considered from normative and descriptive perspectives in many disciplines, including economics, social sciences, and psychology. The rationality of choices is addressed through normative aspects. In contrast, descriptive analysis investigates decisions as preferences, regardless of whether they are logical, beneficial, or harmful (Burns & Bechara, 2007; Kahneman & Tversky, 1984).

In many circumstances, the outcomes of the available options are subject to uncertainties such as delay and/or risk, which decrease the value of the choices in comparison to choices with immediate outcomes. Within a behavioral economic framework, this devaluation of reward is referred to as discounting (Ainslie, 1975). Decision-making theories, including expected utility theory (Von Neumann & Morgenstern, 1944), prospect theory (Kahneman & Tversky, 1979; Tversky & Kahneman, 1992), and reinforcement learning theories (Sutton & Barto, 1998), are based on the principal assumption that all dimensions of an option are integrated into a single measure called the subjective value of a choice, which is parameterized by the rate of discounting.

The first step to develop a mathematical model to describe behavior in an experimental setting of discounting is to evaluate different options on the basis of the available information, such as the type of a reward and the associated uncertainties. In a simple setting of temporal discounting, the rational candidate would be the exponential decay function, $V = A e^{-k_o D}$, where the subjective value $V$ of an outcome of amount $A$, delivered after a delay $D$, diminishes exponentially according to the discounting rate $k_o$. Early studies in economics mostly employed this function to evaluate different options (Ainslie, 1975). Nevertheless, humans tend to deviate systematically from this value function (Doyle, 2013; Green, Fry, & Myerson, 1994; Madden & Johnson, 2010; Rachlin, Raineri, & Cross, 1991; Simpson & Vuchinich, 2000). Consequently, the most common value function in behavioral psychology seems to be Mazur’s model (Mazur, 1987),

$$ V=\frac{A}{1+{k}_o D}. $$

(1)

This equation states that the subjective value of a delayed reinforcer declines hyperbolically according to the discounting rate $k_o > 0$. Moreover, with a transformation of the probability to the odds against winning, $\theta = (1 - p)/p$, the same hyperbolic discounting function has been used to describe the declining subjective values of probabilistic outcomes (Rachlin et al., 1991):

$$ V=\frac{A}{1+{k}_o\theta}. $$

(2)
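As a minimal sketch of Eqs. 1 and 2, the two value functions can be written as plain functions. The function names and the numbers in the example are illustrative assumptions and do not come from the article.

```python
def hyperbolic_delay_value(amount, delay, k_o):
    """Eq. 1: V = A / (1 + k_o * D)."""
    return amount / (1.0 + k_o * delay)


def odds_against(p):
    """Odds against winning, theta = (1 - p) / p, for 0 < p <= 1."""
    return (1.0 - p) / p


def hyperbolic_probability_value(amount, p, k_o):
    """Eq. 2: V = A / (1 + k_o * theta)."""
    return amount / (1.0 + k_o * odds_against(p))


# Illustrative numbers only (not taken from the article).
v_delay = hyperbolic_delay_value(amount=20.0, delay=30.0, k_o=0.05)
v_risk = hyperbolic_probability_value(amount=20.0, p=0.5, k_o=1.0)
```

With $k_o = 1$, Eq. 2 reduces to $V = pA$, the expected value of the gamble, which is why $k_o = 1$ serves as a natural baseline for probability discounting.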

To describe choice behavior for individuals or groups of individuals, the discount rate $k_o$ is usually inferred by assessing several indifference points at separate delays. Fitting Eq. 1 or 2 then gives the best estimate of the discounting rate. Normalized indifference points have also been used to measure discounting rates, employing not a value function but the area under the indifference curve (Myerson, Green, & Warusawitharana, 2001).
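Both summary measures can be illustrated with a short sketch. It assumes indifference points expressed as fractions of the delayed amount, fits Eq. 1 by a simple least-squares grid search over $\ln(k_o)$ (a stand-in for whatever optimizer a given study uses), and computes a trapezoidal area anchored at delay 0 with value 1, which is a convention assumed here. All names and data are hypothetical.

```python
import math


def fit_hyperbolic_k(delays, fractions):
    """Least-squares fit of V/A = 1 / (1 + k_o * D); returns ln(k_o)."""
    best_k, best_sse = None, math.inf
    for i in range(1001):                      # ln(k_o) from -8 to 2 in steps of 0.01
        k = -8.0 + 0.01 * i
        k_o = math.exp(k)
        sse = sum((y - 1.0 / (1.0 + k_o * d)) ** 2
                  for d, y in zip(delays, fractions))
        if sse < best_sse:
            best_k, best_sse = k, sse
    return best_k


def area_under_curve(delays, fractions):
    """Trapezoidal area under normalized indifference points.

    Delays are scaled by the longest delay; the curve is anchored at
    (0, 1), which is an assumption of this sketch.
    """
    xs = [0.0] + [d / max(delays) for d in delays]
    ys = [1.0] + list(fractions)
    return sum((xs[j + 1] - xs[j]) * (ys[j] + ys[j + 1]) / 2.0
               for j in range(len(xs) - 1))


# Noise-free hypothetical indifference points generated with k_o = 0.02.
delays = [10, 30, 60, 120, 180]
fractions = [1.0 / (1.0 + 0.02 * d) for d in delays]
k_fit = fit_hyperbolic_k(delays, fractions)
auc = area_under_curve(delays, fractions)
```

The area measure needs no value function at all, which is its appeal; the fitted rate, in contrast, is only as good as the hyperbolic assumption.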

Although a range of more complex mathematical models have been tried (Doyle, 2013), the simple hyperbolic function has been widely used to model experimental data for various incentives, real or hypothetical, as well as positive or negative outcomes (Baker, Johnson, & Bickel, 2003; Johnson & Bickel, 2002; Petry, 2003). Typically, economists assess discounting by simply asking participants directly for their indifference value (Loewenstein, 1988), whereas the most common method in neuroscience and psychology is the binary-choice method (Mazur, 1987), employing a titration procedure through which the indifference point is inferred from a series of choices. Such procedures have been implemented with either a set of fixed, predefined choices (Madden, Petry, Badger, & Bickel, 1997) or, more commonly, using adjustments of amount or delay based on the individual’s choices (Loewenstein, 1988; Madden et al., 1997; Rachlin et al., 1991). Adjusting-amount procedures have been used mostly in studies with human subjects, whereas animal research has used delay adjustments in addition. As compared to nonadaptive methods, adjustment procedures have been found to have no effect on the processes that underlie discounting (Green, Myerson, Shah, Estle, & Holt, 2007; Holt, Green, & Myerson, 2012). Because they are based on mapping a set of indifference points, these tasks require a large number of choice trials, which can be time-consuming, and thus limiting in certain applications (Smith & Hantula, 2008). Recently, a hierarchical Bayesian model was developed to assess the temporal discounting rate (Vincent, 2016), and a five-trial adjusting-delay task has been shown to quickly measure discount rates in humans (Koffarnus & Bickel, 2014). Nonetheless, the latter approach does not allow controlling for unsystematic or illogical data.

Research investigating aberrant decision-making patterns within several mental disorders, such as drug abuse, is growing constantly. In particular, delay discounting, as a proposed transdisease process, demands further investigation, including directing interventions toward changing individuals’ discounting rates (Bickel, Jarmolowicz, Mueller, Koffarnus, & Gatchalian, 2012; Koffarnus, Jarmolowicz, Mueller, & Bickel, 2013). Furthermore, due to the partial overlap but also the evident differences between delayed and probabilistic choice behavior, the advantage of similar experimental procedures for different facets of decision making has been emphasized (Green & Myerson, 2004). Increased delay-discounting rates are seen in chronic users of alcohol and other drugs (Bjork, Hommer, Grant, & Danube, 2004; Dom, D’haene, Hulstijn, & Sabbe, 2006; MacKillop et al., 2011; Mitchell, Fields, D’Esposito, & Boettiger, 2005; Petry, 2003), and higher rates of risk taking are seen in pathological gamblers (Madden, Petry, & Johnson, 2009). Impulsive and risky decision making has also been linked to such clinically relevant constructs as treatment outcomes (Blanco et al., 2009; Krishnan-Sarin et al., 2007; MacKillop & Kahler, 2009; Petry, 2012; Stanger et al., 2012). Therefore, it has become increasingly desirable to measure choice behavior in a variety of contexts with improved, precise, and consistent methods, which may further our understanding of the mechanisms underlying value-based decision making, and thus potentially advance clinical care for people suffering mental disorders related to such behaviors.

Individuals may behave differently when they make decisions based on values. More sensitive participants may change their preferences sharply on the basis of small differences, whereas others may be neutral to the same changes. This can also be interpreted as a degree of consistency: A consistent choice policy gives a higher probability of choosing the option with the higher value. It has been shown that different samples behave differently in terms of consistency, which has been estimated by using the softmax function (Eq. 3 below) in maximum likelihood models (Hare, Hakimi, & Rangel, 2014; Yechiam, Busemeyer, Stout, & Bechara, 2005), or by constructing receiver operating characteristic curves for the sets of choices by individual participants (Ripke et al., 2012).

In the present study, we developed a mathematical framework for an isochronous amount and delay/risk adjusting procedure based on a Bayesian estimation approach (Garvert, Moutoussis, Kurth-Nelson, Behrens, & Dolan, 2015). Employing simulations and a sample of healthy participants, we show the robust and efficient estimation of the parameters of interest. The algorithm is presented and discussed in detail for the case of delay discounting. The adaptation of the mathematical framework for assessing probability discounting and loss aversion is straightforward and is omitted for the sake of brevity. Taken together, we here present a novel adaptive approach to measure different facets of value-based decision making, including choice consistency.

Mathematical framework

In this section, we provide a detailed discussion of the mathematical modeling and parameter estimation algorithm for the delay discounting case. The main idea is to employ a Bayesian approach to improve our initial assumptions about the value of the discounting parameter trial-by-trial by using the choices that an agent—for example, a person—makes between a smaller immediate and a larger delayed monetary reward. We used the hyperbolic discounting function, Eq. 1, for evaluating the subjective value of the delayed offers.

In what follows, $x_d$ represents the amount of the delayed offer associated with a delay of $d$ units, and the subjective value is shown as $V_d$. Similarly, $x_i$ and $V_i$ are the immediate offer and its subjective value, respectively. The subjective value of the immediate offer is trivially the value of the offer itself, that is, $V_i = x_i$. The offers vary between $r_1$ and $r_2$, measured in a currency unit, and the delays are chosen from the set $D = \{d_1, d_2, \ldots, d_7\}$, in days. Now suppose that $a_i$ and $a_d$ are the actions of choosing the immediate and delayed offers, respectively, and let $Q(a_i)$ and $Q(a_d)$ be the values of taking the corresponding actions (Sutton & Barto, 1998). Furthermore, we assume that the likelihood of choosing between the two offers follows a softmax probability function with an inverse temperature parameter, $\beta_o > 0$, as

$$ P\left({a}_i\left|{k}_o,{\beta}_o\right.\right)=\frac{ \exp \left({\beta}_o Q\left({a}_i\right)\right)}{ \exp \left({\beta}_o Q\left({a}_i\right)\right)+ \exp \left({\beta}_o Q\left({a}_d\right)\right)}, $$

(3)

and hence, $P(a_d \mid k_o, \beta_o) = 1 - P(a_i \mid k_o, \beta_o)$.

Large values of $\beta_o$ in Eq. 3 represent consistent choices, that is, a high probability of taking the more valuable action, whereas small values reveal some inconsistency.
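The choice rule in Eq. 3 can be sketched as follows, under the assumption that the action values equal the subjective values, $Q(a_i) = V_i$ and $Q(a_d) = V_d$ from Eq. 1. The function name is hypothetical. For two options the softmax reduces to a logistic function of the value difference, which is the form used here to keep the computation numerically stable.

```python
import math


def p_choose_immediate(x_i, x_d, delay, k_o, beta_o):
    """Eq. 3 for two options, assuming Q(a_i) = V_i and Q(a_d) = V_d."""
    q_i = x_i                              # immediate offer: V_i = x_i
    q_d = x_d / (1.0 + k_o * delay)        # delayed offer valued with Eq. 1
    z = beta_o * (q_i - q_d)
    # Logistic form of the two-option softmax; branching on the sign
    # keeps exp() from overflowing for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


# At the indifference point both options are equally likely,
# regardless of beta_o (illustrative numbers).
p_indifferent = p_choose_immediate(x_i=10.0, x_d=25.0, delay=30.0, k_o=0.05, beta_o=2.0)
# Away from it, a larger beta_o makes the better option more likely.
p_noisy = p_choose_immediate(12.0, 25.0, 30.0, 0.05, beta_o=0.1)
p_consistent = p_choose_immediate(12.0, 25.0, 30.0, 0.05, beta_o=5.0)
```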

Both parameters, $k_o$ and $\beta_o$, are positive. Therefore we expected them to be positively skewed, that is, approximately lognormal (Lovric, 2010). To take advantage of close-to-normal distributions, we transformed the parameters to the natural-logarithmic scale and defined $k = \ln(k_o)$ and $\beta = \ln(\beta_o)$. For estimation purposes, we discretized the parameter space over an equally spaced 2-D region, $R$, such that $-8 \le k \le 2$ and $-5 \le \beta \le 5$. For simplicity, we assumed that the two parameters are independent and imposed liberal univariate priors on the parameters, such that $k$ and $\beta$ had a Beta and a uniform distribution, respectively. Under the independence assumption, one can build a joint probability distribution $P(k, \beta)$ serving as a prior for the following Bayesian framework.

Given the prior distribution, the immediate and delayed offers were presented to the agent. After observing the choice at the first trial, we updated the prior using Bayes’s rule,

$$ P\;\left( k,\beta \left| a\right.\right)=\frac{1}{Z} P\left( a\left| k,\beta \right.\right) P\left( k,\beta \right), $$

(4)

where P(k, β) is the joint distribution over the parameters and P(a | k, β) is the likelihood of observing the action a ∊ {a _i, a _d}, which is computed using Eq. 3. At every trial t, the posterior distribution over the parameters, P(k, β | a), was updated by multiplying the prior by the likelihood of the agent’s action and then served as the prior for the upcoming trial. Note that 1/Z, in Eq. 4, is a simple normalization factor over the discrete domain R. At the end of each trial, the expected values of k and β were considered the current parameter estimations, $ {\widehat{k}}_t $ and $ {\widehat{\beta}}_t $. Using the current estimates based on the previous choice, we presented the following offers close to the indifference point, where the choices were equally likely, in order to retrieve the most informative data (Lewi, Butera, & Paninski, 2008; Sebastiani & Wynn, 2000).
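One trial of the update in Eq. 4 can be sketched on a discretized grid as follows. This is a simplified illustration, not the authors' implementation: it assumes a grid step of 0.25, a flat joint prior in place of the Beta and uniform priors described above, and action values equal to subjective values. All names are hypothetical.

```python
import math

K_GRID = [-8.0 + 0.25 * i for i in range(41)]   # k = ln(k_o), from -8 to 2
B_GRID = [-5.0 + 0.25 * i for i in range(41)]   # beta = ln(beta_o), from -5 to 5


def flat_prior():
    """Uniform joint prior over the grid (a simplification of the article's priors)."""
    cell = 1.0 / (len(K_GRID) * len(B_GRID))
    return [[cell for _ in B_GRID] for _ in K_GRID]


def p_immediate(x_i, x_d, delay, k, beta):
    """Eq. 3 with log-scale parameters; Q values are taken to be subjective values."""
    z = math.exp(beta) * (x_i - x_d / (1.0 + math.exp(k) * delay))
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def bayes_update(prior, x_i, x_d, delay, chose_immediate):
    """Eq. 4: multiply the prior by the likelihood of the choice, then normalize."""
    post = []
    for k, row in zip(K_GRID, prior):
        new_row = []
        for beta, p in zip(B_GRID, row):
            like = p_immediate(x_i, x_d, delay, k, beta)
            new_row.append(p * (like if chose_immediate else 1.0 - like))
        post.append(new_row)
    z = sum(sum(row) for row in post)            # normalization constant Z
    return [[v / z for v in row] for row in post]


def posterior_means(post):
    """Expected values of k and beta under the current posterior."""
    k_hat = sum(k * sum(row) for k, row in zip(K_GRID, post))
    b_hat = sum(beta * v for row in post for beta, v in zip(B_GRID, row))
    return k_hat, b_hat


# One trial: the agent prefers 10 now over 20 in 30 days.
posterior = bayes_update(flat_prior(), x_i=10.0, x_d=20.0, delay=30.0,
                         chose_immediate=True)
k_hat, beta_hat = posterior_means(posterior)
```

Choosing the immediate offer is more likely for steeper discounters, so the posterior mean of $k$ moves above its prior mean after this single observation. Feeding `posterior` back in as the prior for the next trial gives the sequential procedure described in the text.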

At each trial, we therefore intended to provide two offers with the same subjective values—that is, $ {x}_i=\frac{x_d}{1+{\widehat{k}}_t d} $, where these values lay in the offer range. In other words, the condition

$$ {r}_1\le \frac{r_2}{1+{\widehat{k}}_t{d}_i}\le {r}_2-\delta, \kern1em i=1,\cdots, m, $$

should hold for all feasible delays, where δ is the minimum difference between the two offers. Then a feasible delay was chosen randomly, and the next offers were provided such that they differed at least by δ and had the same subjective values. In extreme cases in which the agent was too patient (or impulsive), even the minimal possible fractional increment was not feasible for the longest (or shortest) delay—that is, $ \left({r}_2-\delta \right)\left(1+{\widehat{k}}_t{d}_m\right)<{r}_2\;\mathrm{or}\kern0.24em {r}_2<{r}_1\left(1+{\widehat{k}}_t{d}_1\right) $. In these cases, amounts very close to the boundary values of the offer range and either the highest or the lowest possible delay were considered as the next options. In all cases, we chose randomly from the set of feasible delays, if any, and adjusted for the immediate and delayed amounts in such a way that the subjective values of both offers were approximately the same according to the current estimate, $ {\widehat{k}}_t $. The procedure continued for a certain number of trials, N, and $ {\widehat{k}}_N $ was considered the estimated parameter. The risk with adaptive designs is that individuals will reverse-engineer the design. Therefore, to reduce the chance of an agent learning the pattern, we added random offers that were distributed throughout all trials.
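The offer-selection step can be sketched as below. The feasibility test follows the inequality above; everything else is an assumption made for illustration: the value $\delta = 1$, the uniform draw of the delayed amount, and the exact fallback offers for extreme estimates, which the text describes only qualitatively. The delays and the offer range are taken from the battery's delay discounting task. Random catch trials are not included.

```python
import math
import random

DELAYS = [3, 7, 14, 31, 61, 180, 365]   # DD delays in days, from the battery description


def next_offer(k_hat, r1=3.0, r2=50.0, delta=1.0, delays=DELAYS, rng=random):
    """Return (delay, x_i, x_d) with equal subjective values under exp(k_hat).

    delta = 1.0 and the uniform draw of x_d are illustrative assumptions.
    """
    k_o = math.exp(k_hat)
    # A delay is feasible when r1 <= r2 / (1 + k_o * d) <= r2 - delta.
    feasible = [d for d in delays if r1 <= r2 / (1.0 + k_o * d) <= r2 - delta]
    if feasible:
        d = rng.choice(feasible)
        growth = 1.0 + k_o * d
        # Smallest delayed amount keeping x_i >= r1 and x_d - x_i >= delta.
        lowest = max(r1 * growth, delta * growth / (k_o * d))
        x_d = rng.uniform(lowest, r2)
        return d, x_d / growth, x_d
    if (r2 - delta) * (1.0 + k_o * max(delays)) < r2:
        # Very patient agent: longest delay, amounts as close as delta allows.
        return max(delays), r2 - delta, r2
    # Very impulsive agent (or no listed delay inside the feasible band):
    # shortest delay, amounts at the two ends of the offer range.
    return min(delays), r1, r2


delay, x_i, x_d = next_offer(math.log(0.02), rng=random.Random(1))
```

Passing a seeded `random.Random` instance makes the sequence of offers reproducible, which is convenient for simulations.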

The same framework is valid for the concepts of probability discounting and loss aversion. We used Eq. 2, analogous to Eq. 1 for delay discounting, to evaluate the probability discounting of gains and losses in corresponding tasks. Finally,

$$ V=\frac{1}{2}\left( G-\lambda L\right) $$

(5)

was used to evaluate mixed prospects and to estimate a behavioral measure of loss aversion, λ. Equation 5 has a simple linear form in which loss aversion is the ratio of the contribution of the loss magnitude, L, to the contribution of the gain magnitude, G, to the participant’s decisions (Frydman, Camerer, Bossaerts, & Rangel, 2011; Tom, Fox, Trepel, & Poldrack, 2007).
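A minimal sketch of Eq. 5 follows. Two points are assumptions rather than statements from the text: the factor 1/2 is read as a 50/50 gamble, and acceptance is modeled with the same softmax rule as Eq. 3 against a rejection value of 0. The function names are hypothetical.

```python
import math


def mixed_gamble_value(gain, loss, lam):
    """Eq. 5: V = 0.5 * (G - lambda * L)."""
    return 0.5 * (gain - lam * loss)


def p_accept(gain, loss, lam, beta_o):
    """Softmax probability of accepting, assuming rejection has value 0."""
    z = beta_o * mixed_gamble_value(gain, loss, lam)
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


# With lambda = 2, a gain of 20 exactly offsets a loss of 10 (illustrative numbers).
v_balanced = mixed_gamble_value(gain=20.0, loss=10.0, lam=2.0)
p_balanced = p_accept(20.0, 10.0, 2.0, beta_o=1.0)
p_attractive = p_accept(30.0, 10.0, 2.0, beta_o=1.0)
```

Under this reading, the indifference point of a mixed gamble satisfies $G = \lambda L$, so adaptive offers can be placed along that line for the current estimate of $\lambda$.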

Simulations

Once the mathematical model was built, we produced a large number of simulated data sets to examine the estimation procedure in the following situations. First, we assumed that the choices were made with the prescribed parameters of the model. As an example, Fig. 1 shows how the prior distributions changed over the course of the estimation procedure for a decision-maker behaving according to k = –2 and β = 1.5. We depict five starting, middle, and final trials, which, based on the data from Table 1, show improving estimations across trials.

Table 1 First and last five trials of a simulation with k = –2 and β = 1.5

For the case of consistent behavior, we performed simulations for every k = –5.5, –5,…, –0.5, with a fixed value of β = 1.5. The estimated parameters are shown in Fig. 2a for k = –4, –3, –1 on a trial-by-trial basis. Figure S1 shows boxplots for all values of k. The values of k were reproduced well, and the accuracy of the estimations is supported by the small errors, represented by the low standard deviations. However, the estimations of β resulted in higher standard deviations, which is an indication of a suboptimal estimation (see Fig. 2b). Then, simulations for a fixed discounting rate, k = –3, and different β = –4.5, –3.5,…, 4.5 showed that k was estimated more accurately with higher values of β (Fig. 2c, Fig. S2). We demonstrate the trial-by-trial changes across the estimation procedures to emphasize the convergence of the algorithm.
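A compact, self-contained sketch of such a simulation is given below. It is not the authors' code: it assumes a flat prior, a grid step of 0.25, a minimum offer difference of 1, a simplified fallback for extreme estimates, and no random catch trials. A simulated agent with prescribed $k$ and $\beta = 1.5$ makes softmax choices, and the trial-by-trial posterior mean of $k$ is recorded.

```python
import math
import random

K_GRID = [-8.0 + 0.25 * i for i in range(41)]    # k = ln(k_o)
B_GRID = [-5.0 + 0.25 * i for i in range(41)]    # beta = ln(beta_o)
DELAYS = [3, 7, 14, 31, 61, 180, 365]
R1, R2, DELTA = 3.0, 50.0, 1.0                   # DELTA is an assumed minimum difference


def logistic(z):
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def p_immediate(x_i, x_d, d, k, beta):
    """Eq. 3 with log-scale parameters and Q values equal to subjective values."""
    return logistic(math.exp(beta) * (x_i - x_d / (1.0 + math.exp(k) * d)))


def next_offer(k_hat, rng):
    """Offers with equal subjective value under the current estimate."""
    k_o = math.exp(k_hat)
    feasible = [d for d in DELAYS if R1 <= R2 / (1.0 + k_o * d) <= R2 - DELTA]
    if not feasible:                             # simplified fallback for extreme estimates
        if (R2 - DELTA) * (1.0 + k_o * DELAYS[-1]) < R2:
            return DELAYS[-1], R2 - DELTA, R2
        return DELAYS[0], R1, R2
    d = rng.choice(feasible)
    growth = 1.0 + k_o * d
    x_d = rng.uniform(max(R1 * growth, DELTA * growth / (k_o * d)), R2)
    return d, x_d / growth, x_d


def simulate(k_true, beta_true, n_trials=50, seed=0):
    """Run the adaptive procedure on a simulated agent; return the trace of k_hat."""
    rng = random.Random(seed)
    cell = 1.0 / (len(K_GRID) * len(B_GRID))
    post = [[cell] * len(B_GRID) for _ in K_GRID]     # flat joint prior
    k_hat = sum(K_GRID) / len(K_GRID)                 # prior mean of k
    trace = []
    for _ in range(n_trials):
        d, x_i, x_d = next_offer(k_hat, rng)
        chose_imm = rng.random() < p_immediate(x_i, x_d, d, k_true, beta_true)
        for r, k in enumerate(K_GRID):
            for c, beta in enumerate(B_GRID):
                like = p_immediate(x_i, x_d, d, k, beta)
                post[r][c] *= like if chose_imm else 1.0 - like
        z = sum(map(sum, post))                       # normalize every trial (Eq. 4)
        post = [[v / z for v in row] for row in post]
        k_hat = sum(k * sum(row) for k, row in zip(K_GRID, post))
        trace.append(k_hat)
    return trace


estimates = {k: simulate(k, 1.5)[-1] for k in (-4.0, -3.0, -1.0)}
```

Plotting each returned trace against the trial index gives a picture analogous to the trial-by-trial panels described above, although the numbers will differ from the article's because of the simplifications listed.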

To investigate the reliability of the procedure, we assumed a sample of normally distributed parameters such that k ~ N(–3, 1) and β ~ N(0.5, 2). Data sets of 50 trials were simulated for random pairs from these distributions, and the resulting data are shown in Fig. 3a. The correlations between the initial (true scores) and estimated parameters were .98 for k and .95 for β, which are depicted in Fig. 3b.

Simulations with appropriate settings of priors and initial offers were also performed for probability discounting and mixed-gambles models (see Fig. 4). For the probability discounting rates, we restricted the domain on the logarithmic scale such that $-3 \le k \le 3$, given that a $k_o$ of 1 (that is, $k = 0$) is considered a baseline at which the subjective value corresponds to the expected value, and any value of $k$ different from zero is considered to be an indicator of risk-seeking or risk-averse behavior. The loss aversion parameter was set to $0 \le \lambda \le 4$ without a transformation to the logarithmic scale. The reason was that any value of $\lambda$ beyond this range was infeasible with our offer range, and any transformation could result in a change of direction of any potential skewness. The consistency parameter had the same range as before, $-5 \le \beta \le 5$. Predefined values of the discounting and loss aversion parameters were reliably recovered. As before, the consistency parameter was reproduced with less precision.

Comparison to standard methods

To compare our approach to standard methods, we chose a value-adjusting method as a representative of a class of titration procedures that adjust the amount by half of the difference between the delayed and immediate offers depending on the previous choice (Ripke et al., 2012). Assuming consistent choices, after a certain number of trials this method will locate the indifference point for a given delay. Finally, the hyperbolic function was fit to indifference points of several delays to estimate the discounting rate.
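For orientation, one common amount-adjusting variant can be sketched as follows. It is an assumption of this sketch, not a description of the exact procedure in Ripke et al. (2012): the delayed amount is fixed, the immediate amount starts at half of it, the first adjustment equals half of the difference between the two offers, and the step is halved after every trial. The agent in the example is hypothetical and perfectly consistent.

```python
def titrate_indifference(choose_immediate, x_d, n_trials=10):
    """Amount-adjusting titration for one delay (one common variant, assumed here).

    choose_immediate(x_i) returns True when the immediate amount x_i is chosen.
    """
    x_i = x_d / 2.0
    step = x_d / 4.0
    for _ in range(n_trials):
        if choose_immediate(x_i):
            x_i -= step       # immediate chosen: make it less attractive
        else:
            x_i += step       # delayed chosen: make the immediate more attractive
        step /= 2.0
    return x_i


# Perfectly consistent hypothetical agent with k_o = 0.05 at a 30-day delay.
k_o, delay, x_d = 0.05, 30.0, 50.0
v_delayed = x_d / (1.0 + k_o * delay)     # subjective value of the delayed offer
estimate = titrate_indifference(lambda x_i: x_i > v_delayed, x_d)
```

Because the step halves on every trial, each delay needs its own run of several trials, which is where the trial count of titration procedures comes from.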

To make the comparison more reliable in terms of controlling for accuracy, the offer ranges, and similar initial assumptions, we simplified the question of estimating the discounting rate to finding the indifference points for certain discounting rates, immediate amounts, and delays. In other words, given a discounting rate of $k$ on the logarithmic scale (so that $k_o = e^{k}$), an immediate amount of $x_i$, and a delay of $d$ days, we aimed to locate the indifference point, that is, to find the delayed amount $x_d$ such that $x_d = x_i(1 + k_o d)$. We further assumed that the choices were made using Eq. 3 with fixed values of $\beta$.

We conducted the comparison in three different settings with increasing complexity. (1) For a single immediate amount $x_i = 10$, single delay $d = 30$, and consistent choices with $\beta = 1.5$, we looked at the number of iterations each algorithm needed to locate the indifference point for $k \in \{-6, -5.9, \ldots, 1\}$ (see Fig. 5a). Although the performance of both methods overlaps for small values of $k$, for stronger discounting one needs to provide larger delayed values, which makes the amount-adjusting method inefficient. (2) The number of iterations was observed for multiple immediate amounts $x_i \in \{5, 10, 15, 20\}$, multiple delays $d \in \{10, 30, 60, 120, 180\}$, and consistent choices with $\beta = 1.5$ for $k \in \{-6, -5.9, \ldots, 1\}$. Our approach exhibited more stable performance in this case, too (Fig. 5b). (3) Starting with an immediate amount of $x_i = 10$ for multiple delays $d \in \{10, 30, 60, 120, 180\}$, and allowing 50 trials in total, we applied both methods to reproduce discounting rates of $k \in \{-6, -5.9, \ldots, 1\}$. For every $k$, we conducted 1,000 simulations with $\beta \sim N(0, 2)$ (Fig. 5c). The median estimated $k$ using our approach (Fig. 5c, top) closely tracked the true values (black line), and the 5th and 95th percentiles (dotted lines) were evenly distributed for different $k$ values. In contrast, the amount-adjusting procedure performed somewhat less well in terms of the medians and skewed residuals (Fig. 5c, bottom).

For the first two cases, we applied each algorithm 10,000 times for every $k$ value and random combinations of $x_i$ and $d$. We terminated the algorithm either upon reaching the indifference point, up to an absolute error of .05, or after 200 trials if the results did not converge. The delayed offer at the first trial for both methods was set to twice the amount of the true indifference point, $2x_i(1 + k_o d)$, with $k_o = e^{k}$. The algorithms had no restriction on the amounts they could offer. To avoid any bias, we used a flat prior on $k$ for our algorithm.

We also compared the two methods using empirical data that were collected using a delay discounting task from an early implementation of the algorithm, with slightly different settings of priors, parameter domains, and offer range. We invited a total of 88 participants from a follow-up study to Ripke et al. (2012) to perform the adaptive task in addition to the main task of the study. The standard task was a 50-trial amount-adjusting task designed to assess indifference points for delays of 10, 30, 60, 120, and 180 days. The temporal discounting rates for the standard method were computed by fitting the hyperbolic value function, Eq. 1, to these indifference points. The resulting estimates of the delay discounting rates from the two methods were highly correlated, r = .66, p < .001, as is shown in Fig. 6a.

Furthermore, we applied our algorithm (that is, the sequential Bayesian update) to the data from the standard method. The data comprised a total of 50 choices between delayed offers and a fixed immediate offer. The iterative procedure ran through all trials and updated the estimations on the basis of the trial-by-trial likelihood of the choices. The resulting discounting rates were correlated with the results from the standard method, r = .99, p < .001, as is shown in Fig. 6b, which indicates that our approach gave almost the same results as fitting to the indifference points. This introduces a new and computationally easy way of estimating discounting rates in data from standard methods, and it could also serve as evidence of validity for our approach: Assuming that the classical method measures a construct, our algorithm does so to the same degree.

The battery

The adaptive procedure for binary choice presentation was employed to develop a task battery for the measurement of different facets of impulsive and risky decision making, with four independent tasks: delay discounting (DD), probability discounting for gains (PDG), probability discounting for losses (PDL), and mixed gambles (MG) (see Fig. 7).

During each task, participants are supposed to choose one of the two offers presented simultaneously for 5 s on a computer screen. The time limit was set in light of the average response times from previous assessments. For each trial, the participant’s choice is highlighted with a frame before presenting the next offer. Presenting the outcomes of gambles during the experiment and the time interval of each trial, as well as the number of trials, are all optional and can be set initially. In general, participants are informed by instructions before each task that at the end of the experiment one trial per task will be selected randomly from among their choices and credited toward their compensation. However, these instructions are integrated into the battery and can be modified on the basis of alternative task designs. The temporal delays in DD are set to 3, 7, 14, 31, 61, 180, and 365 days. For PDG and PDL, gambles are played with five possible probability values: 2/3, 1/2, 1/3, 1/4, and 1/5. The task length for DD, PDL, and PDG is 50 trials, and monetary gains/losses range from €3 to €50. For MG, 50 trials are performed, presenting amounts of €1–€40 for gains and €5–€20 for losses. The number of trials, 50, was chosen according to data acquired by previous implementations of the algorithm, so as to end up with stable estimates. At the beginning of the MG task, participants receive €10 as “house money.” During all tasks, offers are randomly assigned to presentation on the left or the right of the screen.

The experiments, including instructions, binary choices, and outcomes, were initially implemented using MATLAB, Release 2010a (The MathWorks, Inc., Natick, MA) and Psychtoolbox 3.0.10, based on the Psychophysics Toolbox extensions (Brainard, 1997; Pelli, 1997), and are now available under GNU Octave. The initial settings of the tasks, including reward types and ranges, temporal delays, probabilities for gains and losses, and gambling, together with the instructions and payment schemes, are easily accessible through the source code. This enables an end user to modify the layout and initial settings on the basis of different hypotheses and requirements.

Piloting

We piloted the battery on a sample of 26 (15 female, 11 male) healthy adults with a mean age of 26.2 years (SD = 7.8). The participants were recruited through flyers distributed around the university campus and neighborhood, and they were paid a fixed amount of money for compensation. Participants completed the battery within 19 min, on average, of which the estimation procedure required 13 min (i.e., 6 min for the instructions). The estimated parameters are shown in Fig. 8a as boxplots, and the summary statistics for the sample are presented in Table 2. To test the convergence of the estimation procedure, we calculated the absolute difference between the estimated value of each parameter on any trial, $ {\widehat{k}}_t\left({\widehat{\beta}}_t\right) $, and the final estimation, $ \widehat{k}\left(\widehat{\beta}\right) $. The median of this error term was then depicted for all participants, together with the 75th percentile and maximum values. Figure 9 shows a decreasing pattern that can roughly be interpreted as the convergence of the procedure. Regarding the distributions of the parameters, the Kolmogorov–Smirnov test rejected the normality null hypothesis for DD, PDG, and MG at α = .05 in our sample. Therefore, we used Spearman’s rank correlation coefficient to look for any associations between the parameters, which is summarized in Table 3.
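The convergence check described above can be sketched in a few lines. The function name and the example traces are hypothetical, and the 75th percentile reported in the article is left out for brevity.

```python
from statistics import median


def convergence_profile(traces):
    """Per-trial median and maximum of |estimate_t - final estimate| across participants."""
    n_trials = len(traces[0])
    profile = []
    for t in range(n_trials):
        errors = [abs(trace[t] - trace[-1]) for trace in traces]
        profile.append((median(errors), max(errors)))
    return profile


# Hypothetical trial-by-trial estimates of k for three participants.
traces = [
    [-3.0, -4.0, -4.5, -4.4],
    [-3.0, -2.0, -2.5, -2.6],
    [-3.0, -3.5, -3.2, -3.3],
]
profile = convergence_profile(traces)
```

By construction the error is zero on the last trial, so a decreasing profile shows only that the estimates settle, not that they settle on the true value.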

Table 2 Participant sample description, with k and β on a logarithmic scale

Table 3 Correlations between parameters

Previous studies have shown a negative correlation between the probability discounting rates for gains and losses (Shead & Hodgins, 2009; Takahashi, Takagishi, Nishinaka, Makino, & Fukui, 2014). Nevertheless, we observed no correlation, Spearman’s ρ = –.0003, though a linear trend was present, as is shown in Fig. 8b, and a post-hoc analysis revealed a power of .35 for our sample size to detect a correlation of –.25 with α = .05. Furthermore, the probability discounting rates for gains and losses in our sample did not show any significant differences in terms of means and medians. For mixed gambles, the resulting distribution was in line with the literature, although we endowed the participants with money in advance and used different framings (including zero values). Such differences in the presentation and/or context of instruction have been shown to affect choice behavior (Ert & Erev, 2008; Kahneman & Tversky, 1979; Silberberg, Murray, Christensen, & Asano, 1988; Thaler & Johnson, 1990). Moreover, we observed significant associations between probability discounting for gains and both loss aversion and temporal discounting, ρ = –.41, p < .05, and ρ = .47, p < .05, respectively. We also observed a nonsignificant association between loss aversion and probability discounting for losses, ρ = –.37, p = .061.
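The article does not state how the post-hoc power was computed. As a rough cross-check only, the Fisher z approximation for a Pearson correlation can be sketched as below; using it for Spearman's ρ, and the choice between a one-sided and a two-sided test, are assumptions of this sketch.

```python
import math
from statistics import NormalDist


def correlation_power(r, n, alpha=0.05, two_sided=True):
    """Approximate power to detect correlation r with n pairs (Fisher z method)."""
    z = NormalDist()
    ncp = abs(math.atanh(r)) * math.sqrt(n - 3)      # noncentrality on the z scale
    if two_sided:
        crit = z.inv_cdf(1.0 - alpha / 2.0)
        return z.cdf(ncp - crit) + z.cdf(-ncp - crit)
    crit = z.inv_cdf(1.0 - alpha)
    return z.cdf(ncp - crit)


power_two_sided = correlation_power(-0.25, 26)
power_one_sided = correlation_power(-0.25, 26, two_sided=False)
```

Under this approximation the one-sided value lies closer to the reported .35 than the two-sided one. Either way, a sample of 26 gives limited power to detect a correlation of this size.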

We next examined the reproducibility of β in the low value range using simulations. Specifically, we conducted simulations for the delay discounting case using samples of the initial parameters distributed according to Table 2 [1,000 simulations with k ~ N(–5.24, 2) and β ~ N(–1.35, 1.32)]. The correlations between the initial and estimated parameters were significant, at .92 for k and .86 for β. Hence, in simulations the reproducibility appeared acceptable, even at the lower end of the parameter range.

Discussion

In this study, we developed and implemented a novel adaptive algorithm to measure different metrics related to impulsive and risky decision making, such as the temporal and probability discounting rates and loss aversion.

The main advantage of this approach is its isochronous adaptive nature, which aims at providing the most informative offers at each trial on the basis of the choices that have been made earlier. Theoretically, this should allow for a very efficient inference of behavioral parameters. Accordingly, we showed that the model parameters converge within a few initial trials, both in simulations and in the course of the tasks with real participants. This provides stable estimates and can differentiate between participants, assuming that the hyperbolic and linear value functions, Eqs. 1, 2, and 5, capture the behavior to an acceptable degree. The temporal discounting rates obtained with computerized adjusting-amount procedures (Reynolds, Richards, Horn, & Karraker, 2004; Richards, Zhang, Mitchell, & Wit, 1999; Ripke et al., 2012) or the Monetary Choice Questionnaire (Kirby, Petry, & Bickel, 1999; Koff & Lucas, 2011) vary in terms of their summary statistics between different studies reported in the literature. As a result, the absolute values from different studies are not directly comparable. Nevertheless, we showed a correlation of r = .66 between our approach and a standard amount-adjusting procedure for the same participants, which seems to be acceptable in terms of test–retest reliability, considering that the two tasks had different settings of amounts and delays (Beck & Triplett, 2009; Craig, Maxfield, Stein, Renda, & Madden, 2014; Kirby, 2009).

Moreover, we showed through simulations that our approach outperforms a standard amount-adjusting method, specifically for the higher values of delay discounting rates. Recent advances (Koffarnus & Bickel, 2014; Yoon & Chapman, 2016) have introduced methods to estimate discounting parameters with very few trials (five in Koffarnus & Bickel and ten in Yoon & Chapman). Both methods are theoretically variants of titration procedures. Such extremely brief tasks are advantageous because they save time, but on theoretical grounds we assume that this might come at the cost of precision. Future research might therefore compare these very short task variants with our algorithm by utilizing simulations and participant samples.
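To make the notion of a titration procedure concrete, the following is a generic bisection-style adjusting-amount sketch for a single delay. It is not the specific task of any of the cited studies; the function name, the starting amount, and the step rule are assumptions chosen for illustration.

```python
def titrate_indifference(choose_delayed, a_later, n_trials=5):
    """Bisection-style adjusting-amount procedure for one delay.

    choose_delayed(a_now) -> bool is the participant's response to an
    immediate offer a_now against the fixed delayed amount a_later.
    """
    a_now = a_later / 2.0
    step = a_later / 4.0
    for _ in range(n_trials):
        if choose_delayed(a_now):
            a_now += step  # immediate offer was too small: raise it
        else:
            a_now -= step  # immediate offer was attractive enough: lower it
        step /= 2.0
    return a_now


# Usage with a noise-free simulated participant whose indifference point is 16.
estimate = titrate_indifference(lambda a_now: a_now < 16.0, a_later=20.0)
```

With a fixed step schedule, the resolution of the estimate is bounded by the number of trials, and a single inconsistent response cannot be fully corrected later, which is one way to see why very short titration tasks may trade precision for speed.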

The algorithm was also used to reestimate the discounting rates for data that were acquired by nonadaptive methods. The initial estimates and reestimates were nearly identical (r = .99). Furthermore, the medians of the probability discounting rates in our piloting were very close to zero (log-transformed), which is theoretically consistent and corresponds to the evaluation of the risky option by its expected value. For loss aversion, the medians were comparable to what has been reported in the literature (Tom et al., 2007).

The described algorithm was used to implement four independent measures of value-based decision making as an experimental package. This shows that the Bayesian framework can easily be adapted to assess a range of behavioral concepts and suggests that future work can be directed toward the development of additional tasks. For our test battery, we provide a graphical user interface that gives access to the model and runtime variables, as well as task-specific settings such that one can set the values for every single experiment. This and the efficient inference of behavior enhance the flexibility to cover different ranges and commodities of outcomes, even when time is a limiting factor. Most importantly, behavioral estimations are immediately available along with other data such as response times, rendering post-hoc parameter inference unnecessary.

The adaptiveness of the estimation procedure, together with the instruction disclosing that the outcome of a randomly picked trial will be credited as compensation, makes our approach vulnerable to reverse-engineering. For example, in delay discounting (DD), after observing a delayed choice the algorithm increases the relative amount of the immediate option for the next offer. Some participants, upon learning this pattern, will deliberately pick the delayed option until the immediate offers reach a maximum. To reduce this effect, we distributed random offers over the course of the tasks to make the patterns less obvious. This, however, decreases the power of the algorithm, because it adds noise and samples less informative data. On the other hand, in exceptional cases of extreme behavior (for example, if a participant tends to take just one type of offer, such as the immediate ones in delay discounting), the procedure runs normally, but these cases could be handled differently by allowing for adaptive offer ranges. In any case, such instances are easily detectable by their values and choice behavior and can be treated as outliers.
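The interleaving of random offers can be sketched as follows. The mixing rate `p_random`, the uniform range of the random offers, and the function name are hypothetical; the proportion of random trials actually used in the battery is not restated here.

```python
import random


def next_immediate_offer(indifference_amount, a_later, rng, p_random=0.2):
    """Return (amount, is_random) for the next immediate option.

    p_random is a hypothetical mixing rate, not a value taken from the battery.
    """
    if rng.random() < p_random:
        # Random trial: an offer unrelated to the current parameter estimate.
        return rng.uniform(0.0, a_later), True
    # Adaptive trial: offer at the current indifference point.
    return indifference_amount, False


# Usage: decide the next immediate amount against a delayed amount of 20.
rng = random.Random(7)
amount, is_random = next_immediate_offer(16.6, 20.0, rng)
```

Flagging each trial as random or adaptive keeps open the option of analyzing the two trial types separately.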

In summary, we developed a new approach for adaptive offer presentation in binary choice settings to estimate discounting parameters. We showed that this approach is quick and reliable and that it outperforms the most widely used classical method. Furthermore, it can be easily transferred to other concepts of decision making. Our findings support construct validity under the mathematical framework, and we conclude that the proposed Bayesian approach is a functional alternative to those existing in the literature. Thus, our work might advance the evaluation of decision-making processes in single studies or in research consortia that need to collect large numbers of datasets in a flexible and efficient way. Nevertheless, future studies will be required to improve the estimation precision of the consistency parameter, β, and to compare the approach with more recent methods and with more complex decision models that have more parameters.

References

Ainslie, G. (1975). Specious reward: Behavioral theory of impulsiveness and impulse control. Psychological Bulletin, 82, 463–496. doi:10.1037/H0076860
Baker, F., Johnson, M. W., & Bickel, W. K. (2003). Delay discounting in current and never-before cigarette smokers: Similarities and differences across commodity, sign, and magnitude. Journal of Abnormal Psychology, 112, 382–392.
Beck, R. C., & Triplett, M. F. (2009). Test–retest reliability of a group-administered paper-pencil measure of delay discounting. Experimental and Clinical Psychopharmacology, 17, 345–355. doi:10.1037/a0017078
Bickel, W. K., Jarmolowicz, D. P., Mueller, E. T., Koffarnus, M. N., & Gatchalian, K. M. (2012). Excessive discounting of delayed reinforcers as a trans-disease process contributing to addiction and other disease-related vulnerabilities: Emerging evidence. Pharmacology & Therapeutics, 134, 287–297. doi:10.1016/j.pharmthera.2012.02.004
Bjork, J. M., Hommer, D. W., Grant, S. J., & Danube, C. (2004). Impulsivity in abstinent alcohol-dependent patients: Relation to control subjects and type 1-/type 2-like traits. Alcohol, 34, 133–150. doi:10.1016/j.alcohol.2004.06.012
Blanco, C., Potenza, M. N., Kim, S. W., Ibáñez, A., Zaninelli, R., Saiz-Ruiz, J., & Grant, J. E. (2009). A pilot study of impulsivity and compulsivity in pathological gambling. Psychiatry Research, 167, 161–168. doi:10.1016/j.psychres.2008.04.023
Brainard, D. H. (1997). The Psychophysics Toolbox. Spatial Vision, 10, 433–436. doi:10.1163/156856897X00357
Burns, K., & Bechara, A. (2007). Decision making and free will: A neuroscience perspective. Behavioral Sciences and the Law, 25, 263–280. doi:10.1002/bsl.751
Craig, A. R., Maxfield, A. D., Stein, J. S., Renda, C. R., & Madden, G. J. (2014). Do the adjusting-delay and increasing-delay tasks measure the same construct: Delay discounting? Behavioural Pharmacology, 25, 306–315. doi:10.1097/Fbp.0000000000000055
Dom, G., D’haene, P., Hulstijn, W., & Sabbe, B. (2006). Impulsivity in abstinent early- and late-onset alcoholics: Differences in self-report measures and a discounting task. Addiction, 101, 50–59. doi:10.1111/j.1360-0443.2005.01270.x
Doyle, J. R. (2013). Survey of time preference, delay discounting models. Judgment and Decision Making, 8, 116–135.
Ert, E., & Erev, I. (2008). The rejection of attractive gambles, loss aversion, and the lemon avoidance heuristic. Journal of Economic Psychology, 29, 715–723. doi:10.1016/j.joep.2007.06.003
Frydman, C., Camerer, C., Bossaerts, P., & Rangel, A. (2011). MAOA-L carriers are better at making optimal financial decisions under risk. Proceedings of the Royal Society B, 278, 2053–2059. doi:10.1098/rspb.2010.2304
Garvert, M. M., Moutoussis, M., Kurth-Nelson, Z., Behrens, T. E., & Dolan, R. J. (2015). Learning-induced plasticity in medial prefrontal cortex predicts preference malleability. Neuron, 85, 418–428. doi:10.1016/j.neuron.2014.12.033
Green, L., & Myerson, J. (2004). A discounting framework for choice with delayed and probabilistic rewards. Psychological Bulletin, 130, 769–792. doi:10.1037/0033-2909.130.5.769
Green, L., Fry, A. F., & Myerson, J. (1994). Discounting of delayed rewards: A life-span comparison. Psychological Science, 5, 33–36. doi:10.1111/j.1467-9280.1994.tb00610.x
Green, L., Myerson, J., Shah, A. K., Estle, S. J., & Holt, D. D. (2007). Do adjusting-amount and adjusting-delay procedures produce equivalent estimates of subjective value in pigeons? Journal of the Experimental Analysis of Behavior, 87, 337–347. doi:10.1901/jeab.2007.37-06
Hare, T., Hakimi, S., & Rangel, A. (2014). Activity in dlPFC and its effective connectivity to vmPFC are associated with temporal discounting. Frontiers in Neuroscience, 8, 50. doi:10.3389/fnins.2014.00050
Holt, D. D., Green, L., & Myerson, J. (2012). Estimating the subjective value of future rewards: Comparison of adjusting-amount and adjusting-delay procedures. Behavioural Processes, 90, 302–310. doi:10.1016/j.beproc.2012.03.003
Johnson, M. W., & Bickel, W. K. (2002). Within-subject comparison of real and hypothetical money rewards in delay discounting. Journal of the Experimental Analysis of Behavior, 77, 129.
Kahneman, D., & Tversky, A. (1979). Prospect theory: An analysis of decision under risk. Econometrica, 47, 263–291. doi:10.2307/1914185
Kahneman, D., & Tversky, A. (1984). Choices, values, and frames. American Psychologist, 39, 341–350. doi:10.1037/0003-066X.39.4.341
Kirby, K. N. (2009). One-year temporal stability of delay-discount rates. Psychonomic Bulletin & Review, 16, 457–462. doi:10.3758/PBR.16.3.457
Kirby, K. N., Petry, N. M., & Bickel, W. K. (1999). Heroin addicts have higher discount rates for delayed rewards than non-drug-using controls. Journal of Experimental Psychology: General, 128, 78–87.
Koff, E., & Lucas, M. (2011). Mood moderates the relationship between impulsiveness and delay discounting. Personality and Individual Differences, 50, 1018–1022. doi:10.1016/j.paid.2011.01.016
Koffarnus, M. N., & Bickel, W. K. (2014). A 5-trial adjusting delay discounting task: Accurate discount rates in less than one minute. Experimental and Clinical Psychopharmacology, 22, 222–228. doi:10.1037/a0035973
Koffarnus, M. N., Jarmolowicz, D. P., Mueller, E. T., & Bickel, W. K. (2013). Changing delay discounting in the light of the competing neurobehavioral decision systems theory: A review. Journal of the Experimental Analysis of Behavior, 99, 32–57. doi:10.1002/jeab.2
Krishnan-Sarin, S., Reynolds, B., Duhig, A. M., Smith, A., Liss, T., McFetridge, A.,…Potenza, M. N. (2007). Behavioral impulsivity predicts treatment outcome in a smoking cessation program for adolescent smokers. Drug and Alcohol Dependence, 88, 79–82. doi:10.1016/j.drugalcdep.2006.09.006
Lewi, J., Butera, R., & Paninski, L. (2008). Sequential optimal design of neurophysiology experiments. Neural Computation, 21, 619–687. doi:10.1162/neco.2008.08-07-594
Loewenstein, G. (1988). Frames of mind in intertemporal choice. Management Science, 34, 200–214.
Lovric, M. (2010). International encyclopedia of statistical science. New York, NY: Springer.
MacKillop, J., & Kahler, C. W. (2009). Delayed reward discounting predicts treatment response for heavy drinkers receiving smoking cessation treatment. Drug and Alcohol Dependence, 104, 197–203. doi:10.1016/j.drugalcdep.2009.04.020
MacKillop, J., Amlung, M. T., Few, L. R., Ray, L. A., Sweet, L. H., & Munafò, M. R. (2011). Delayed reward discounting and addictive behavior: A meta-analysis. Psychopharmacology, 216, 305–321. doi:10.1007/s00213-011-2229-0
Madden, G. J., & Johnson, P. S. (2010). A delay-discounting primer. In G. J. Madden & W. K. Bickel (Eds.), Impulsivity: The behavioral and neurological science of discounting (pp. 11–37). Washington, DC: American Psychological Association.
Madden, G. J., Petry, N. M., Badger, G. J., & Bickel, W. K. (1997). Impulsive and self-control choices in opioid-dependent patients and non-drug-using control patients: Drug and monetary rewards. Experimental and Clinical Psychopharmacology, 5, 256.
Madden, G. J., Petry, N. M., & Johnson, P. S. (2009). Pathological gamblers discount probabilistic rewards less steeply than matched controls. Experimental and Clinical Psychopharmacology, 17, 283–290. doi:10.1037/A0016806
Mazur, J. E. (1987). An adjusting procedure for studying delayed reinforcement. In M. L. Commons, J. E. Mazur, J. A. Nevin, & H. Rachlin (Eds.), Quantitative analysis of behavior: Vol. 5. The effect of delay and of intervening events on reinforcement value (pp. 55–73). Hillsdale, NJ: Erlbaum.
Mitchell, J. M., Fields, H. L., D’Esposito, M., & Boettiger, C. A. (2005). Impulsive responding in alcoholics. Alcoholism: Clinical and Experimental Research, 29, 2158–2169. doi:10.1097/01.alc.0000191755.63639.4a
Myerson, J., Green, L., & Warusawitharana, M. (2001). Area under the curve as a measure of discounting. Journal of the Experimental Analysis of Behavior, 76, 235–243. doi:10.1901/jeab.2001.76-235
Pelli, D. G. (1997). The VideoToolbox software for visual psychophysics: Transforming numbers into movies. Spatial Vision, 10, 437–442. doi:10.1163/156856897X00366
Petry, N. M. (2003). Discounting of money, health, and freedom in substance abusers and controls. Drug and Alcohol Dependence, 71, 133–141. doi:10.1016/S0376-8716(03)00090-5
Petry, N. M. (2012). Discounting of probabilistic rewards is associated with gambling abstinence in treatment-seeking pathological gamblers. Journal of Abnormal Psychology, 121, 151–159. doi:10.1037/A0024782
Rachlin, H., Raineri, A., & Cross, D. (1991). Subjective-probability and delay. Journal of the Experimental Analysis of Behavior, 55, 233–244. doi:10.1901/jeab.1991.55-233
Reynolds, B., Richards, J. B., Horn, K., & Karraker, K. (2004). Delay discounting and probability discounting as related to cigarette smoking status in adults. Behavioural Processes, 65, 35–42.
Richards, J. B., Zhang, L., Mitchell, S. H., & Wit, H. (1999). Delay or probability discounting in a model of impulsive behavior: Effect of alcohol. Journal of the Experimental Analysis of Behavior, 71, 121–143.
Ripke, S., Hübner, T., Mennigen, E., Müller, K. U., Rodehacke, S., Schmidt, D.,…Smolka, M. N. (2012). Reward processing and intertemporal decision making in adults and adolescents: The role of impulsivity and decision consistency. Brain Research, 1478, 36–47. doi:10.1016/j.brainres.2012.08.034
Sebastiani, P., & Wynn, H. P. (2000). Maximum entropy sampling and optimal Bayesian experimental design. Journal of the Royal Statistical Society: Series B, 62, 145–157.
Shead, N. W., & Hodgins, D. C. (2009). Probability discounting of gains and losses: Implications for risk attitudes and impulsivity. Journal of the Experimental Analysis of Behavior, 92, 1–16. doi:10.1901/Jeab.2009.92-1
Silberberg, A., Murray, P., Christensen, J., & Asano, T. (1988). Choice in the repeated-gambles experiment. Journal of the Experimental Analysis of Behavior, 50, 187–195. doi:10.1901/jeab.1988.50-187
Simpson, C. A., & Vuchinich, R. E. (2000). Reliability of a measure of temporal discounting. Psychological Record, 50(1), 3–16.
Smith, C. L., & Hantula, D. A. (2008). Methodological considerations in the study of delay discounting in intertemporal choice: A comparison of tasks and modes. Behavior Research Methods, 40, 940–953. doi:10.3758/brm.40.4.940
Stanger, C., Ryan, S. R., Fu, H., Landes, R. D., Jones, B. A., Bickel, W. K., & Budney, A. J. (2012). Delay discounting predicts adolescent substance abuse treatment outcome. Experimental and Clinical Psychopharmacology, 20, 205–212. doi:10.1037/a0026543
Sutton, R. S., & Barto, A. G. (1998). Reinforcement learning: An introduction (Vol. 1). Cambridge, MA: MIT Press.
Takahashi, T., Takagishi, H., Nishinaka, H., Makino, T., & Fukui, H. (2014). Neuroeconomics of psychopathy: Risk taking in probability discounting of gain and loss predicts psychopathy. Neuroendocrinology Letters, 35, 510–517.
Thaler, R. H., & Johnson, E. J. (1990). Gambling with the house money and trying to break even: The effects of prior outcomes on risky choice. Management Science, 36, 643–660.
Tom, S. M., Fox, C. R., Trepel, C., & Poldrack, R. A. (2007). The neural basis of loss aversion in decision-making under risk. Science, 315, 515–518. doi:10.1126/science.1134239
Tversky, A., & Kahneman, D. (1992). Advances in prospect theory: Cumulative representation of uncertainty. Journal of Risk and Uncertainty, 5, 297–323.
Vincent, B. T. (2016). Hierarchical Bayesian estimation and hypothesis testing for delay discounting tasks. Behavior Research Methods, 48, 1608–1620. doi:10.3758/s13428-015-0672-2
Von Neumann, J., & Morgenstern, O. (1944). Theory of games and economic behavior. Princeton, NJ: Princeton University Press.
Yechiam, E., Busemeyer, J. R., Stout, J. C., & Bechara, A. (2005). Using cognitive models to map relations between neuropsychological disorders and human decision-making deficits. Psychological Science, 16, 973–978. doi:10.1111/j.1467-9280.2005.01646.x
Yoon, H., & Chapman, G. B. (2016). A closer look at the yardstick: A new discount rate measure with precision and range. Journal of Behavioral Decision Making, 29, 470–480. doi:10.1002/bdm.1890


Author note

We thank Zeb Kurth-Nelson for sharing his ideas on the mathematical framework, Nils B. Kroemer for insightful discussions, and Elisabeth Jünger and Christian Sommer for collecting the pilot data. This study was supported by the Deutsche Forschungsgemeinschaft (DFG FOR 1617 Grants RA 1047/2-1, SM 80/7-1, and SM 80/7-2; DFG SPP 1226 Grant SM 80/5-2; and DFG SFB 940/1 and SFB 940/2 grants). Q.J.M.H. and M.S. contributed to the conception and design of the study. A.G. and Q.J.M.H. implemented the mathematical algorithm, which was improved by S.P. for this work. Piloting of participants and data collection were performed by N.B., and N.B. and S.P. drafted the manuscript. All authors provided critical revision of the manuscript for important intellectual content and approved the final version for publication.

Author information

Authors and Affiliations

Department of Psychiatry and Psychotherapy, Technische Universität Dresden, Dresden, Germany
Shakoor Pooseh, Nadine Bernhardt, Alvaro Guevara & Michael N. Smolka
Escuela de Matemática, Universidad de Costa Rica, Ciudad universitaria Rodrigo Facio Brenes, Costa Rica
Alvaro Guevara
Translational Neuromodeling Unit, Hospital of Psychiatry, University of Zürich and Swiss Federal Institute of Technology, Zurich, Switzerland
Quentin J. M. Huys
Psychiatry, Psychosomatics, and Psychotherapy, University of Zürich, Zurich, Switzerland
Quentin J. M. Huys


Corresponding author

Correspondence to Michael N. Smolka.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Figure S1

(PDF 197 kb)

Figure S2

(PDF 194 kb)


About this article

Cite this article

Pooseh, S., Bernhardt, N., Guevara, A. et al. Value-based decision-making battery: A Bayesian adaptive approach to assess impulsive and risky behavior. Behav Res 50, 236–249 (2018). https://doi.org/10.3758/s13428-017-0866-x


Published: 13 March 2017
Issue Date: February 2018
DOI: https://doi.org/10.3758/s13428-017-0866-x
