Introduction

The term discounting refers to the decrease in the subjective value of an outcome (a reward or a loss) as the environmental factor on which it is devalued increases (e.g., Green and Myerson 2004; Rachlin et al. 1991). The most widely studied process, delay discounting (also known as temporal discounting; for a review, see Madden and Bickel 2010), corresponds to the behavioral definition of impulsivity previously mentioned: the preference for smaller, immediate rewards over larger but delayed rewards. The value of a reward also decreases as a function of variables other than time (see Green and Myerson 2004; Rachlin et al. 1991). Apart from the discounting of delayed rewards, behavioral psychology studies probabilistic discounting (the decrease in the subjective value of a gain as the probability of obtaining it decreases) (Ostaszewski et al. 1998; Rachlin et al. 1991), effort discounting (the decrease in the subjective value of a gain as the effort needed to obtain it increases) (Mitchell 2004; Sugiwaka and Okouchi 2004), and social discounting (originally defined as the decrease in the subjective value of a reward as the number of people with whom it is to be shared increases) (Rachlin and Raineri 1992). A later definition of social discounting takes as the discounting factor the social distance to the person with whom a reward is to be shared (Jones and Rachlin 2006; Rachlin and Jones 2008).

Delay, Probability, Social, and Effort Discounting: One or Four Separate Processes?

The only theoretical model of discounting that takes into account three main types of discounting was proposed by Howard Rachlin (Rachlin 2006; Rachlin and Raineri 1992). His general model is based on the assumption that the final, subjective value of a reward depends on three main factors: postponement of the reward, the probability of getting the reward, and the number of people among whom the reward is divided. The entire process of discounting is thus composed of three elements: discounting based on time (delay discounting), discounting based on likelihood (probability discounting), and discounting associated with sharing (social discounting) (Rachlin and Raineri 1992; Rachlin et al. 1991). Each of these elements is a separate phenomenon, but all are based on a similar mechanism, jointly influence the subjective value of a reward, and are to some extent mutually reducible. According to Rachlin et al. (1991), for example, it is possible to determine which delay affects the value of a reward in the same way as a given probability of getting that reward.

As seen, Rachlin (Rachlin and Raineri 1992) did not incorporate into his model the discounting associated with the effort required to obtain a reward. Given his argument that all discounting factors affect the value of rewards in a similar way, however, a mathematical formula that describes effort discounting should be analogous to those that describe discounting based on delay, probability, and social factors (Rachlin and Raineri 1992). Indeed, researchers (Mitchell 2004; Sugiwaka and Okouchi 2004) have successfully used the hyperbolic discounting formula to describe the effort discounting process. The consistent formulation of delay, probability, and effort problems in current studies in experimental behavior analysis allows a direct comparison of these processes and of the underlying decision-making mechanisms. Finally, the fact that performing a specific task (effort) always takes time (delay) and does not necessarily fulfill the requirements for obtaining a reward (uncertainty) suggests that effort is connected on the theoretical level both to the delay of rewards and to the probability of their receipt. On the basis of these arguments, it can be assumed that delay, probabilistic, social, and effort discounting are four separate processes that are nevertheless not totally independent.
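
For orientation, the hyperbolic forms referred to above are commonly written as follows. The notation is an assumption made here for illustration (V is the subjective value, A the nominal amount, and k, h, and \ell are free discounting parameters); the cited studies may use different symbols.

```latex
% Delay discounting: D is the delay to the reward
V = \frac{A}{1 + kD}

% Probability discounting: \Theta is the odds against receiving the reward
V = \frac{A}{1 + h\Theta}, \qquad \Theta = \frac{1 - p}{p}

% Effort discounting (analogous form): E is the required effort
V = \frac{A}{1 + \ell E}
```

In each case a larger parameter value corresponds to steeper discounting of the same nominal amount.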

Discounting as a Personality Trait

Researchers claim that discounting is closely related to some personality and individual characteristics. So far, however, only a handful of studies have investigated whether different types of discounting themselves might be considered a distinct personality trait (see Green and Myerson 2013; Kirby 2009; Odum 2011). A typical definition of a personality trait is “a relatively stable pattern of thoughts, feelings, and behaviors that reflects the tendency to respond in certain ways under certain circumstances” (Odum 2011). The first part of this definition requires that the rate of discounting be relatively stable. Findings on the reliability of discounting suggest that it is indeed relatively stable over time (r = .71–.91; Baker et al. 2003; Beck and Triplett 2009; Kirby 2009; Ohmura et al. 2006; Simpson and Vuchinich 2000; Smith and Hantula 2008; Takahashi et al. 2007), with test-retest intervals ranging from one week to one year.

The second part of the definition of a personality trait assumes the situational stability of behaviour. This is not to say that people must act the same whatever the situation, but that their behaviour is “meaningfully consistent” (Odum 2011). Translated into the language of discounting, people who discount one outcome (e.g. money) steeply should also discount other outcomes (e.g. food, drugs) relatively steeply (Odum 2011). One way to find out whether such a relationship is present is to analyse correlations between the levels of discounting for different outcomes, as well as correlations between different types of discounting. Studies so far have found significant positive correlations between the steepness of discounting for one outcome and the degree of discounting found for another outcome (Odum 2011; Odum and Rainaud 2003), as well as significant positive correlations between various kinds of discounting (Green and Myerson 2004, 2010; Mitchell 2004). Substance users have also been shown to discount both monetary and non-monetary rewards more steeply than controls (for a review, see Yi et al. 2010), which is consistent with the hypothesis of a unitary discounting trait that substance users possess to a greater degree than non-users (Green and Myerson 2013).

However, this is not to say that the rates at which individuals discount different types of rewards are always correlated. Importantly, some studies have shown that individual differences in discounting are domain specific (see Green and Myerson 2013; König 2009). That is, individuals who discount one type of delayed reward (e.g., money) more steeply than others do not necessarily discount other types of delayed reward (e.g., food) more steeply, a pattern termed domain independence (Chapman 1996). For example, Chapman (1996) reported that rates of discounting monetary and health rewards are uncorrelated. Moreover, Jimura et al. (2009, 2011) showed in a series of experiments that the correlation between the discounting of liquid and monetary rewards was not significant, providing evidence that these two reward domains are independent at the individual level. Interestingly, in the same experiments (Jimura et al. 2011), individual differences in the discounting of both reward types were stable over several weeks. These results suggest that although similar decision-making processes may be involved, the discounting of one type of reward and the discounting of another type reflect separate, temporally stable individual characteristics rather than a single trait of impulsivity. In summary, the current research examines the status of the discounting rate as a trait and tests whether discounting can be considered a separate personality trait (Green and Myerson 2013; König 2009; Odum 2011).

Measurement Techniques of Discounting

There have been numerous attempts to develop screening measures to identify steep discounting rates (e.g., Navarick 2004; Smith and Hantula 2008). To evaluate the discounting of delayed rewards, the most commonly used traditional measure presents an individual with a series of hypothetical choice pairs: participants choose between a smaller, more immediate alternative and a larger, more delayed alternative (e.g., Green and Myerson 2004; Rachlin et al. 1991). For example, suppose a participant is offered a choice between receiving $10 immediately or $100 in six months and chooses the immediate option, and is subsequently offered a choice between $10 immediately or $110 in six months and chooses the delayed option (Rachlin et al. 1991). This pattern of choices implies that the participant would be indifferent between $10 today and roughly $105 in six months. This indifference point can then be converted into a discount rate using a number of different models (e.g., Myerson et al. 2001). A second approach, the matching method, asks for the indifference point directly rather than inferring it from choices. For example, it might ask the participant what amount “X” would make him indifferent between $10 immediately and $X in six months.
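
The conversion from an indifference point to a discount rate can be sketched as follows. This is a minimal illustration that assumes the one-parameter hyperbola V = A / (1 + kD), which is only one of the models that could be used; the function name `hyperbolic_k` is hypothetical, and the delay is expressed in months.

```python
def hyperbolic_k(immediate: float, delayed: float, delay: float) -> float:
    """Solve V = A / (1 + k * D) for k at an indifference point.

    immediate -- the amount available now (V)
    delayed   -- the delayed amount judged equally attractive (A)
    delay     -- the delay D, in whatever unit k should refer to
    """
    if immediate <= 0 or delay <= 0:
        raise ValueError("immediate amount and delay must be positive")
    return (delayed / immediate - 1.0) / delay


# Worked example from the text: $10 now is roughly equivalent to $105 in six months.
k = hyperbolic_k(immediate=10, delayed=105, delay=6)
print(round(k, 3))  # 1.583 (per month, under the assumed model)
```

A larger k means that value falls off more steeply with delay; other models would yield a different parameter from the same indifference point.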

The most common criticism of traditional questionnaires with pairs of options is that the experimental setting fails to capture the actual circumstances of making real choices (Navarick 2004). Participants are presented with a completely hypothetical situation, so the choices they make are also purely theoretical (Madden et al. 2004). There is no conclusive evidence that decisions made in real-life circumstances would be the same. An effective solution would be to develop a tool measuring the steepness of various types of discounting in which traditional pairs of choices between out-of-context hypothetical sums of money are replaced with questions referring to participants’ experiences and specific actions. Another argument in favour of the Discounting Inventory is that the measures of discounting in use today are often plagued by considerable randomness and inconsistency in how they are completed (Smith and Hantula 2008). This is evidenced by the frequently poor fit of discounting models to actual results (Madden and Bickel 2010), which is likely an effect of the type of measure used. A further problem with traditional measures using pairs of hypothetical choices is that the accuracy of measurement may be compromised by task fatigue or boredom resulting from the many choices required, often 100 or more (Navarick 2004). Hence the need for a universal tool for measuring individual differences in discounting, one that contains questions with varied content and is independent of an arbitrary type of reward, delay, effort, or number of people with whom to share the outcome.

The aim of this research project is to develop a Discounting Inventory that allows the measurement of individual differences in the discounting rate. The construction of a universal inventory would support the hypothesis that the discounting rate can be regarded as an individual personality trait. A successful outcome, meaning an instrument for measuring discounting with good psychometric properties, would be useful both for further basic research (as a comparative measure) and for psychological practice.

Generation of Items and Itemmetric Analysis

The creation of items and all subsequent steps followed the psychometric requirements underlying the development of personality inventories (Nunnally 1978; Strelau and Zawadzki 1993) and resulted in 436 items. Most of the items for each candidate discounting type were written on the basis of the current literature using the classic rational-empirical approach. The item content was related to the respective theoretical constructs of discounting (Rachlin and Raineri 1992). Each discounting type included items representing behaviors, intentions, and attitudes, and all items were worded as indicative statements. It was also important to keep all items consistent in terms of perspective, i.e. not to mix items that assess behaviors with items that assess affective responses to, or outcomes of, behaviors (Strelau and Zawadzki 1993). We tried to avoid negatively worded items, as previous research has pointed out that a few such items randomly interspersed within a measure can have detrimental effects on its psychometric properties (Murphy and Davidshofer 2005). Items were judged and, if necessary, corrected against linguistic criteria to avoid mistakes, questions, the passive voice, and difficult terms (Strelau and Zawadzki 1993). This task was performed by four professional linguists.

The next phase consisted of testing the content validity of the items (Murphy and Davidshofer 2005). The seven raters were psychologists trained extensively in discounting theory through course work and research experience in this area. They sorted the items into the four separate scales on the basis of item content. A significant coefficient of rater agreement was obtained (0.81). This coefficient expresses the average ratio of the number of agreed ratings to the highest number of agreed ratings, estimated by comparing the scores of all judges (where 0 = lack of agreement and 1 = full agreement). Statistical significance was assessed against the random distribution of ratings. Subsequently, the items were assessed separately for each scale on a 7-point rating scale (ranging from 1 = does not correspond at all to 7 = fully corresponds) for content validity, a standard measure of prototypicality. This assessment, made by the same group of experts, yielded a coefficient of rater agreement of 0.90, which was highly statistically significant. This coefficient was calculated from the ratio of the average mean deviation of ratings by all judges to the highest possible mean deviation, where 0 = lack of agreement and 1 = full agreement. Statistical significance was again assessed against the random distribution of ratings. The linguistic analysis of items and the content validity studies allowed us to reduce the number of items from 436 to 209.

Study 1: The Psychometric Analyses

In Study 1, after developing an item pool and subjecting it to content validity analyses, we conducted factor analyses to examine the factor structure of the Discounting Inventory (DI) and to develop the final version by successively reducing the number of items. The first stage of data analysis consisted of shortening the instrument and developing four scales corresponding to the four types of discounting distinguished at the beginning of our study. As a result, the 209-item pool was reduced.

Sample

The total sample consisted of 2843 Polish participants (1547 female and 1296 male) ranging in age from 18 to 77 years (M = 29.95, SD = 10.25), with approximately 45% of them aged between 19 and 26 years. Before the psychometric parameters were calculated, the total sample was randomly divided into a calibration subsample (n = 1423; used for the principal components analysis) and a validation subsample (n = 1420; used for the confirmatory factor analysis).
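
A random split of this kind can be sketched as below. The function name, the use of integer participant identifiers, and the fixed seed are assumptions made for illustration; the source does not describe how the randomization was carried out.

```python
import random


def split_sample(ids, n_calibration, seed=0):
    """Randomly divide participant ids into calibration and validation subsamples."""
    rng = random.Random(seed)   # fixed seed so the split can be reproduced
    shuffled = list(ids)
    rng.shuffle(shuffled)
    return shuffled[:n_calibration], shuffled[n_calibration:]


# Subsample sizes reported in the text: 1423 + 1420 = 2843.
calibration, validation = split_sample(range(2843), n_calibration=1423)
print(len(calibration), len(validation))  # 1423 1420
```

Fixing the seed keeps the two subsamples identical across reruns, so that the exploratory and confirmatory analyses are always run on disjoint sets of participants.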

Measure

The measure comprised the 209-item pool of the Discounting Inventory in a 4-point Likert-scale format (4 = fully agree, 3 = agree slightly, 2 = disagree slightly, 1 = disagree completely).

Principal Component Analyses

To establish the structure of the discounting process, the principal components method was used, and the scree test (Murphy and Davidshofer 2005) was applied to identify the number of factors. We used principal components analysis because it is commonly used for data reduction and is preferred when the aim is to analyze the structure and identify the underlying dimensions. This procedure had the advantage of allowing us to choose items on the basis of their factor loadings and to include items from each of the preliminary scales without being affected by the number and arbitrary classification of items in the preliminary scales (Murphy and Davidshofer 2005; Nunnally 1978). An oblique (Oblimin) rotation was performed. Oblique rotation methods allow the factors to be intercorrelated, and previous research showed that different types of discounting were significantly correlated (Estle et al. 2007; Green and Myerson 2004, 2010). The resulting factor structure of the item pool (after Oblimin rotation) determined the final classification of items into four discounting scales.

Results

The Kaiser–Meyer–Olkin measures of sampling adequacy were above 0.80. We used an eigenvalue greater than or equal to 1.0 and the scree test to determine the number of factors to be retained and rotated. The principal components analysis with Oblimin rotation produced four factors, and the scree plot gave a clear indication of a four-factor solution explaining 55% of the variance, with the individual factors contributing 24.6%, 13.6%, 10.8%, and 6.0%. The corresponding eigenvalues were 11.8, 6.5, 5.2, and 2.9, respectively. Alternative solutions with only one common factor (as domain-general approaches to discounting would imply) did not explain more than 25% of the variance. Because the “eigenvalue >1” rule tends to overextract, two further tests were conducted: a parallel analysis and Velicer’s MAP criterion, the two best methods for determining the number of factors (O’Connor 2000). These methods were applied using R software, which can handle the polychoric correlation matrix needed to analyze Likert-type data and provides both parallel analysis and Velicer’s MAP results. Both methods suggested that four factors be retained. We therefore concluded that a four-factor solution best fit the data for the exploratory (calibration) subsample.
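
The link between the reported eigenvalues and the percentages of explained variance can be checked with a short calculation. In a principal components analysis of a correlation matrix, each component's share of the variance equals its eigenvalue divided by the number of variables. The sketch below assumes that the reported solution refers to 48 items, which the source does not state explicitly; the function name is hypothetical.

```python
def percent_variance(eigenvalues, n_items):
    """Share of total variance per component, in percent.

    For a correlation matrix the total variance equals the number of items,
    so each share is simply eigenvalue / n_items.
    """
    return [100.0 * ev / n_items for ev in eigenvalues]


eigenvalues = [11.8, 6.5, 5.2, 2.9]                 # values reported in the text
shares = percent_variance(eigenvalues, n_items=48)  # n_items=48 is an assumption
print([round(s, 1) for s in shares])  # [24.6, 13.5, 10.8, 6.0]
print(round(sum(shares), 1))          # 55.0
```

Under this assumption the shares sum to the reported 55%; the second share comes out at 13.5% rather than the reported 13.6%, which may simply reflect rounding of the published eigenvalue.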

To obtain the final version and reduce the number of items, we selected 12 items for each scale based on three criteria. First, items with a communality below 0.4 were discarded. Second, items with two or more coefficients above 0.4 in the pattern matrix were deleted. Third, items with no coefficient above 0.4 were deleted. The loadings of the retained items were quite high, ranging from 0.41 to 0.95. Thus, 48 items were retained for the subsequent confirmatory factor analyses. The wording of the 48 items can be found in the Appendix.
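
The three screening rules can be expressed as a single check per item. The sketch below is illustrative only: the function name and example loadings are hypothetical, and comparing absolute values of the loadings is an assumption (the text speaks only of coefficients above 0.4).

```python
def keep_item(loadings, communality, cutoff=0.4):
    """Return True if an item passes all three screening rules.

    loadings    -- pattern-matrix coefficients of the item on each factor
    communality -- the item's communality
    """
    # Assumption: salience is judged on the absolute size of a loading.
    salient = [x for x in loadings if abs(x) > cutoff]
    if communality < cutoff:   # rule 1: communality below the cutoff
        return False
    if len(salient) >= 2:      # rule 2: cross-loading on two or more factors
        return False
    if len(salient) == 0:      # rule 3: no salient loading at all
        return False
    return True


# Hypothetical items with loadings on four factors.
print(keep_item([0.62, 0.10, 0.05, 0.12], communality=0.45))  # True
print(keep_item([0.55, 0.48, 0.10, 0.00], communality=0.60))  # False (cross-loading)
```

An item therefore survives only when it loads saliently on exactly one factor and shares enough variance with the factor solution as a whole.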

Confirmatory Factor Analyses

Next, to substantiate the reliability of these four factors, the data from the second subsample of participants were subjected to confirmatory factor analysis (CFA) using the AMOS program. The estimation method was maximum likelihood. A CFA model was fitted to the 48 items remaining after the deletions described earlier. Each item was linked to a single factor, and all standardized regression weights exceeded 0.50, providing additional support for this model.

Model Fit

We used several criteria of model fit (see Bollen and Long 1993). A well-fitting model should ideally have a nonsignificant χ2 statistic (p > 0.05). Because the χ2 statistic tends to be inflated in large samples, the ratio χ2/df was also determined, which should not be much larger than 2.0. The χ2/df ratio is a measure of the absolute fit of the model to the data, indicating how closely the model fits compared to a perfect fit. The model is considered to have an acceptable fit if the GFI (goodness-of-fit index), TLI (Tucker–Lewis index), and CFI (comparative fit index) values are approximately 0.90 or above. The RMSEA (root mean square error of approximation) represents the error of approximation in the population; a value of approximately 0.05 or less indicates a close fit, and a value of up to 0.08 represents a reasonable fit of the model. We note, however, that the choice of indices and cutoff values is a topic surrounded by considerable controversy (see, e.g., Mulaik 2007).
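
Two of these indices can be computed directly from the χ2 statistic, its degrees of freedom, and the sample size. The sketch below uses the common point-estimate formula RMSEA = sqrt(max(χ2 − df, 0) / (df · (N − 1))); some programs divide by N instead of N − 1, so results may differ slightly from software output. The function names and the label for values above 0.08 are illustrative choices, not taken from the source.

```python
import math


def chi2_df_ratio(chi2: float, df: int) -> float:
    """Ratio of the chi-square statistic to its degrees of freedom."""
    if df <= 0:
        raise ValueError("df must be positive")
    return chi2 / df


def rmsea(chi2: float, df: int, n: int) -> float:
    """Point estimate of the RMSEA for a model fitted to n participants."""
    if df <= 0 or n <= 1:
        raise ValueError("df must be positive and n greater than 1")
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def rmsea_label(value: float) -> str:
    """Verbal label following the cutoffs described in the text."""
    if value <= 0.05:
        return "close fit"
    if value <= 0.08:
        return "reasonable fit"
    return "above the reasonable range"  # wording assumed here


# Illustrative call with placeholder values (not results from the study).
example = rmsea(chi2=240.0, df=120, n=500)
print(round(chi2_df_ratio(240.0, 120), 2), rmsea_label(example))
```

Because the numerator is floored at zero, a model whose χ2 does not exceed its degrees of freedom receives an RMSEA of exactly 0.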

Results

The four-factor model had a good fit to the data (χ2 [330 df] = 650.10, p = 0.04, χ2/df = 1.97; see Table 1 for more details). For the purpose of comparison, a one-factor model, which presupposes that all the items pertain to the same factor, was also assessed. According to the χ2 statistic, this model would have to be rejected (χ2 [607 df] = 1286.92, p < 0.001). An acceptable χ2/df is usually set at 2 or less, so the value χ2/df = 2.12 obtained here indicates an unacceptable fit. The four-factor model had a much better fit than the one-factor model of general discounting, Δχ2(277) = 636.82, p < 0.001. Moreover, the significant χ2 indicates that the four types of discounting are correlated but not independent.
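
The model comparison reported above rests on simple differences between the two models' χ2 values and degrees of freedom. A minimal sketch, with a hypothetical function name, reproduces the arithmetic; obtaining the p value would additionally require a χ2 distribution function, which is omitted here.

```python
def chi2_difference(chi2_restricted, df_restricted, chi2_full, df_full):
    """Chi-square difference between a more restricted and a less restricted model."""
    return chi2_restricted - chi2_full, df_restricted - df_full


# Values reported in the text: one-factor model versus four-factor model.
delta_chi2, delta_df = chi2_difference(1286.92, 607, 650.10, 330)
print(round(delta_chi2, 2), delta_df)  # 636.82 277
```

The difference is then referred to a χ2 distribution with the difference in degrees of freedom, here 277.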

Table 1 Confirmatory factor analysis fit

Descriptive Statistics, Gender Differences, and Internal Consistency of the Final Version of DI

Psychometric properties of the newly constructed 48-item DI in the combined sample (n = 2843) are summarized in Table 2. Both skewness and kurtosis were less than or equal to 1.0 for all scales, indicating that the item distributions were similar across the instrument and rather symmetrical. The internal consistencies measured with Cronbach’s alpha were adequate: total measure α = 0.89, effort discounting scale α = 0.95, probabilistic discounting scale α = 0.88, social discounting scale α = 0.85, and delay discounting scale α = 0.87. Item-total score correlations were all positive, ranging from 0.43 to 0.86. Intercorrelations between the four main factors were generally low, with low correlations between probabilistic and social discounting (0.18) and between probabilistic and effort discounting (0.25). The correlation between delay and social discounting was very low (0.11), which emphasizes the need to distinguish between these two factors. As predicted, the correlation between delay and effort discounting was somewhat higher (0.31), and the highest intercorrelation occurred between delay and probabilistic discounting (0.45), which lends support to Rachlin’s theory of discounting (Rachlin and Raineri 1992). These results suggest that although similar decision-making processes may underlie different types of discounting, the four kinds of discounting may reflect separate domains rather than a single domain of impulsivity.
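
For readers who wish to see how an internal consistency coefficient of this kind is obtained, the following sketch computes Cronbach's alpha from raw item scores using the standard formula α = k / (k − 1) · (1 − Σ item variances / variance of the total score). The function name and the small data set are hypothetical and serve only to illustrate the calculation.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    n_items = len(responses[0])
    if n_items < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in responses]) for i in range(n_items)]
    # Sample variance of the summed scale score.
    total_var = variance([sum(row) for row in responses])
    return n_items / (n_items - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical answers of five respondents to three 4-point items.
data = [[4, 3, 4], [2, 2, 1], [3, 3, 3], [1, 2, 1], [4, 4, 3]]
print(round(cronbach_alpha(data), 2))
```

Alpha approaches 1 when the items covary strongly relative to their individual variances, which is why it is read as an index of internal consistency.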

Table 2 Descriptive statistics, t-test comparisons by sex, and alpha coefficients (N = 2843)

In the personality literature, when a gender difference in impulsiveness is reported, females are usually less impulsive than males (Logue 1995). In the present research, female participants were therefore expected to discount less steeply than male participants. This hypothesis was additionally supported by past research on discounting: when gender differences were reported, it was male participants who showed greater discounting (e.g. Kirby and Maraković 1996). As expected, significant gender differences were found for the delay and effort discounting scales, with male participants scoring higher than females on both. No gender differences were observed on the other two scales (see Table 2).

Discussion of Study 1

To ascertain the final version of the instrument, discern the factor structure that underpins the DI, and examine whether our a priori classification of four discounting domains was supported, two phases of analysis were undertaken. First, a principal components analysis was undertaken on the first sample. The output of this procedure was then substantiated with a confirmatory factor analysis on the second sample. We obtained a robust factor solution with four facets for each of the DI factors, each facet having low secondary loadings and good internal consistency. We named the four resulting facets for each DI factor in line with the theoretical content of their assigned items: four scales have the same names as the four types of discounting being distinguished (Green and Myerson 2004; Ostaszewski et al. 1998; Rachlin and Raineri 1992; Sugiwaka and Okouchi 2004). This four-factor structure was also replicated in a second sample in the confirmatory factor analyses. Finally, the results obtained allowed us to conclude that the final DI scales have very good distribution characteristics (kurtosis and skewness), show adequate variability, and have plausible reliability scores as measured by Cronbach’s alpha for each scale and for the entire inventory.

To meet the need for a shorter instrument that assesses all four types of discounting efficiently, we decided to reduce the remaining pool of items. Through several iterations of retaining and deleting items based on their component loadings, item intercorrelations, and contribution to coefficient alphas, the total number of items was reduced from 209 to 48 (12 per scale). Those 48 items had loadings equal to or higher than 0.40 on their own factor and lower on the remaining factors.

Study 2: Test-Retest Reliability

Stability over time is one of the defining characteristics of personality traits (Murphy and Davidshofer 2005). Evidence of such stability would therefore bolster the claims about the usefulness of our measure and support the treatment of the different types of discounting, as measured by the newly developed instrument, as stable personality variables. Here, the test-retest reliability of the DI measure was assessed over a 2-week period.

Sample

The sample consisted of 371 participants (246 female and 125 male) aged from 18 to 59 years (M = 26.8, SD = 5.75). Most of them were high school graduates or had a higher level of education (62% high school, 27% university degree). Fifty-five percent of the participants were students. All 371 participants completed the DI measure twice, with the second administration 2 weeks after the first. All participants provided informed consent at both sessions after the nature of the study had been explained to them.

Measure

Study 2 used the same 48-item DI measure constructed previously in Study 1.

Results

Test–retest stability of scale scores was checked with Pearson correlations (two-tailed), reported with their 95% confidence intervals. For the total instrument, a test–retest reliability coefficient of rtt = 0.73 (95% CI = 0.71–0.75) was obtained. The test–retest values for the subscales were as follows: delay discounting rtt = 0.71 (95% CI = 0.70–0.72), effort discounting rtt = 0.74 (95% CI = 0.73–0.75), social discounting rtt = 0.81 (95% CI = 0.80–0.82), and probabilistic discounting rtt = 0.67 (95% CI = 0.65–0.69).
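
A test–retest coefficient of this kind is a Pearson correlation between the scores from the two sessions. The sketch below also shows one common way of attaching a confidence interval, the Fisher z transformation. The source does not state how its intervals were computed, so this approach is an assumption and need not reproduce the reported bounds; the function names and the example scores are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_ci(r, n, z_crit=1.959964):
    """Approximate 95% confidence interval for r via the Fisher z transformation."""
    z = math.atanh(r)             # transform r to the z scale
    se = 1.0 / math.sqrt(n - 3)   # standard error on the z scale
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


# Hypothetical scale scores of six participants at test and retest.
test_scores = [30, 25, 41, 36, 28, 33]
retest_scores = [32, 24, 39, 37, 30, 31]
r = pearson_r(test_scores, retest_scores)
print(round(r, 2), [round(b, 2) for b in fisher_ci(r, n=len(test_scores))])
```

The interval narrows as the sample size grows, since the standard error on the z scale depends only on n.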

Discussion of Study 2

Despite being collected two weeks apart, participants’ scores on each of the four dimensions exhibited moderate to strong test–retest stability (rtt = 0.65–0.82). All reliabilities were significant at p < 0.05. These findings suggest that all subscales show good test-retest stability. However, because stability over time is one of the defining characteristics of personality traits, data from a 2-week test-retest interval are not enough to show that discounting can be regarded as a personality trait. This is especially so because the probability scores evidenced lower test–retest reliability than did the other components, indicating that the probability scale scores changed more as a function of time, whereas the other component scores (and the overall score) were comparatively more stable. These data suggest that probability scale scores may reflect more of a state than a trait function in the discounting process. While traditional discounting measures are known to produce stable measures of discounting across retest intervals ranging up to one year (e.g., Kirby 2009; Ohmura et al. 2006; Simpson and Vuchinich 2000), little is known about the test-retest reliability of the DI. The short retest interval raises the possibility that choices made in the first session influenced those made in the second session. Thus, the goal of the next investigation should be to test the reliability of the DI after a longer interval (e.g. 6 months). A longer retest interval can provide a more stringent test of the DI’s temporal stability. Test–retest reliability tends to decrease over time (Cronbach 1990). Thus, if the DI’s temporal stability were “good” at 6 months, it would be reasonable to assume that it would be “good” or “very good” at one or three months.

Study 3

It is also important to evaluate whether the Discounting Inventory measures the same construct as a traditional discounting task. The goal of Study 3 was to assess the concurrent validity of the DI by comparing its discounting scores with those from a widely used traditional discounting instrument. Correlations are expected to vary according to the similarity of the constructs being measured by each instrument. Both the DI and traditional discounting measures are intended to assess the discounting construct, so the two should be correlated. However, they differ in how they assess the construct: the DI is a self-report measure that relies on individuals’ perceptions of their behavior, whereas traditional discounting tasks are behavioral tasks that measure overt behavior related to specific dimensions of impulsivity (Reynolds et al. 2006). Previous studies have reported significant but modest correlations between different measures of discounting (e.g., the Experiential Discounting Task and a traditional measure of the discounting rate; r = 0.26–0.52) (Reynolds and Schiffbauer 2004). Correlations were therefore expected to be in the small to moderate range (r = 0.3–0.5).

Participants

A total of 179 participants (118 women and 61 men) completed this study, ranging in age from 19 to 39 years (M = 22.8, SD = 1.52). The majority of participants (74%) were university students. All participants gave their informed consent before inclusion in the study.

Measures

Discounting Inventory

Study 3 used the same 48-item DI measure as Studies 1 and 2.

Traditional Discounting

The traditional discounting procedure was adapted from Richards et al. (1999). The measure consisted of four parts, assessing the steepness of delay, effort, social, and probabilistic discounting, respectively. The sequence of tasks was counterbalanced across participants.

The effort discounting part consisted of five pages, each with a two-column table. The effortful reward was presented in the right-hand column, together with information about the particular effort requirement. Effortless rewards were presented in the left-hand column. Each page presented a different level of effort necessary to receive a reward of PLN 800 (at the time of the study, U.S. $1 ≈ PLN 4.00), and the level of effort increased on consecutive pages. The level of effort was defined by the floor to which the participant had to climb: participants were asked to imagine climbing stairs up to a specified floor (the 3rd, 10th, 15th, 30th, or 50th floor). The effortless rewards were printed in rows in descending order, starting at 100% of the value of the effortful reward and ending at 0%. For example, in the fifth row of one table, a participant had to choose between climbing to the 30th floor to receive PLN 800 and an effortless reward of PLN 680. Participants were asked to mark which of the two rewards (effortful or effortless) they chose in every row of every table.

The delay, probabilistic, and social discounting parts of the traditional discounting questionnaire were prepared in exactly the same way as the effort discounting part. The delayed reward (PLN 1400) was presented in the right-hand column, together with the information about the particular delay (1 mo., 6 mo., 12 mo., 5 yr., or 15 yr.). Immediate rewards were presented in the left-hand column and printed in rows in descending order from PLN 1400 to PLN 0. Probabilistic discounting (in this case, discounting by the probability of receiving a reward) was assessed at five probability values: 5%, 25%, 50%, 75%, and 95%. On each probabilistic task trial, participants chose between a certain amount of money and the possibility of receiving PLN 1200 with a specified probability. In the social discounting portion, participants made hypothetical choices between a smaller monetary reward exclusively for themselves and a larger reward they had to share equally with a specific number of strangers (1, 3, 5, 11, or 17 people; PLN 900). Although the delay, probabilistic, social, and effort tasks and the outcomes were hypothetical, participants were instructed to act as if the situation were real. Participants were given the following instruction: “You will not receive any of the rewards that you choose, but we want you to make your decisions as though you were really going to get the rewards you choose.”

Procedure

Participants started each page of the traditional discounting measure from the top, where both amounts were equal, and chose one of the two options in each row. The aim of the procedure was to discover the lowest amount of effortless, certain, received for oneself, and immediate reward that a participant would prefer to receive instead of a reward that required a particular effort, probability, sharing with others, or delay. This lowest amount would be the last amount that the participant chose in the left-hand column, before switching to the effortful, uncertain, shared, or delayed option in the right-hand column. This amount was considered the subjective value of the reward for a given magnitude of effort, probability, number of people, or given delay.
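As a minimal sketch of how the switching point described above could be scored, the Python function below returns the last left-hand amount chosen before the first switch to the right-hand option. The function name, the "L"/"R" coding of choices, and the row amounts in the example are illustrative assumptions and are not part of the original procedure. Returning None when the right-hand option is chosen in the very first row is also an assumption, since the text does not describe how that case was scored.

```python
def indifference_point(left_amounts, choices):
    """Return the last left-column amount chosen before the first switch.

    left_amounts: left-column amounts, listed in descending order (top row first).
    choices: one entry per row, "L" for the left-hand option and
             "R" for the right-hand (effortful, uncertain, shared, or delayed) option.
    Returns None if the participant chose "R" in the first row (assumed handling).
    """
    last_left = None
    for amount, choice in zip(left_amounts, choices):
        if choice == "R":
            break  # first switch to the right-hand column ends the scan
        last_left = amount
    return last_left


# Hypothetical page: PLN 800 effortful reward, illustrative row amounts.
rows = [800, 770, 740, 710, 680]
picks = ["L", "L", "L", "L", "R"]
subjective_value = indifference_point(rows, picks)  # 710
```

The sketch stops at the first switch, so it does not check whether a participant switched back and forth within a page.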

Analysis of the Data from the Traditional Discounting Measure

An area-under-the-curve (AUC) method was used to characterize the delay, probabilistic, social, and effort discounting rate (Myerson et al. 2001). AUC involves computing the area of the trapezoids that are created by plotting the coordinates of indifference points for each delay period, probability, number of people, and effort values (for details, see Myerson et al. 2001). These values are summed to obtain a total area that ranges from 0.0 to 1.0; larger AUC values are indicative of slower or no discounting, and lower AUC values mean greater levels of discounting.
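The trapezoid computation can be sketched in Python as follows. This is an illustration under stated assumptions rather than the scoring script used in the study: the function name and the example indifference points are hypothetical, the discounting factor is normalized by its largest value and the subjective value by the nominal reward amount, and the curve is assumed to start at the point (0, 1), that is, an undiscounted reward at zero delay.

```python
def area_under_curve(factor_values, subjective_values, amount):
    """Normalized area under the discounting curve (trapezoid rule).

    factor_values: discounting-factor levels in ascending order (e.g., delays in months).
    subjective_values: indifference points, one per factor level.
    amount: nominal amount of the larger reward.
    Assumption: the curve starts at (0, 1) before the first measured point.
    """
    max_factor = max(factor_values)
    xs = [0.0] + [x / max_factor for x in factor_values]
    ys = [1.0] + [v / amount for v in subjective_values]
    area = 0.0
    for i in range(1, len(xs)):
        width = xs[i] - xs[i - 1]
        mean_height = (ys[i] + ys[i - 1]) / 2
        area += width * mean_height  # area of one trapezoid
    return area


# Delays from the delay discounting part, expressed in months.
delays = [1, 6, 12, 60, 180]
# Hypothetical indifference points for a PLN 1400 delayed reward.
points = [1300, 1100, 900, 500, 200]
auc_delay = area_under_curve(delays, points, 1400)
```

With these conventions a participant who never discounts obtains an AUC of 1.0, and steeper discounting pushes the value toward 0.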

Myerson et al. (2001) stated that AUC has several merits that make it appropriate for statistical analysis in discounting research. One advantage is that the AUC measure is theoretically neutral and can be calculated for all participants regardless of their response pattern (Ostaszewski et al. 2013). AUC requires no a priori assumptions about the shape of the discount function or the number of free parameters used in modeling (Smith and Hantula 2008). In addition, the AUC measure usually follows a normal distribution, which allows the use of parametric statistical analysis. These measurement properties make AUC a stronger candidate marker than discounting rates.

Identifying Nonsystematic Discounting Data

Individual hypothetical discounting patterns were also categorized as systematic or nonsystematic on the basis of Johnson and Bickel’s (2008) criteria for identifying atypical response patterns that suggest random or inconsistent responding in a sample. Specifically, individual participants were considered nonsystematic responders if the analysis of their hypothetical discounting data revealed that (1) any indifference point (except for the first one) was larger than the previous one by more than 10% or (2) the last indifference point was not less than the first by at least 10%.

In the whole sample, three (1.7%) of the 179 data sets examined were identified as nonsystematic due to criterion 1. In each of these data sets, at least one indifference point suggests a departure from a monotonically decreasing function. Seven data sets (4%) of the 179 examined were identified as nonsystematic due to criterion 2. That is, the last indifference point was not less than the first indifference point by at least a magnitude equal to 10% of the larger later reward. Individuals whose hypothetical discounting patterns were categorized as nonsystematic on the basis of these criteria were excluded from further analysis. Thus, the final sample consisted of 169 participants.
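The two screening criteria can be expressed as a short Python check. This is a sketch under assumptions, not the authors' screening code: the function name and the example data are hypothetical, and the 10% margin is taken relative to the larger reward for both criteria, which the text states explicitly only for criterion 2.

```python
def nonsystematic_flags(indifference_points, amount, margin=0.10):
    """Return (criterion_1, criterion_2) flags for one data set.

    indifference_points: subjective values ordered by increasing factor level.
    amount: nominal amount of the larger reward.
    margin: proportion of `amount` used as the tolerance (10% in the text;
            applying it to `amount` for criterion 1 is an assumption).
    """
    tolerance = margin * amount
    pairs = zip(indifference_points, indifference_points[1:])
    # Criterion 1: some point exceeds the preceding point by more than the tolerance.
    criterion_1 = any(later - earlier > tolerance for earlier, later in pairs)
    # Criterion 2: the last point is not lower than the first by at least the tolerance.
    criterion_2 = (indifference_points[0] - indifference_points[-1]) < tolerance
    return criterion_1, criterion_2


# Hypothetical data sets for a PLN 1400 reward.
systematic = nonsystematic_flags([1300, 1000, 700, 400, 100], 1400)
rebound = nonsystematic_flags([1300, 1000, 1200, 400, 100], 1400)
flat = nonsystematic_flags([1400, 1400, 1400, 1400, 1400], 1400)
```

A data set would be excluded when either flag is true, which matches the separate counts reported for criterion 1 and criterion 2.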

Results

To verify the relationship between the two measures of discounting, Pearson’s correlation coefficients were calculated. Table 3 summarizes these results. All correlations are in the expected direction.

Table 3 Correlation matrix (N = 179) comparing all measures of discounting

The DI produced significant negative correlations with the established measures of discounting (r ranging from −0.25 to −0.55). Put more simply, the higher one’s scores on the DI scales, the steeper one’s delay, probabilistic, social, and effort discounting. In addition, all four DI scales showed significant associations with each discounting type measured by the traditional discounting instrument. Note from Table 3 that each DI scale correlated at −0.41 or stronger with its traditional counterpart.

Discussion of Study 3

The present study evaluated the concurrent validity of the DI, that is, whether the test actually measures the construct it purports to measure (Cronbach 1990). In this case, concurrent validity was evaluated by comparing DI performance to that obtained using standardized delay, probability, social, and effort discounting tasks (Richards et al. 1999). We expected significant associations between the two indicators of discounting. On the basis of previously reported findings in the assessment of different measures of discounting, we selected a correlation of 0.30 as the threshold criterion for establishing concurrent validity. Statistically significant correlations were detected between each DI scale and the four traditional discounting measures. All of these correlations reached the criterion of 0.30. Taken together, the evidence suggests that the DI measures a construct similar to that measured by a traditional discounting instrument. However, this conclusion should be validated in further research in which the results of the DI are related to the results of behavioral discounting procedures. One limitation of the present study is that both the DI and the traditional discounting tasks are paper-and-pencil methods.

Based on the observed correlations, one might argue that the Discounting Inventory is a central conceptual component of impulsivity but only a peripheral component of discounting. Although a portion of this correlation could stem from common method variance, the finding still supports the assertion of construct correspondence between the two discounting measures. At one level of conceptualization, different instruments have varying degrees of overlap, but no one measure is comprehensive (Reynolds and Schiffbauer 2004; Reynolds et al. 2006). Others (Mitchell 1999; Richards et al. 1999) have interpreted the small association between self-reported scores and choice-based (behavioral) measures of discounting as evidence that the behavioral tendencies characterized by extreme discounting may not be the same as those indicated by self-report inventories of impulsivity, perhaps because assessments of discounting isolate a more narrowly defined behavior (see Reynolds et al. 2006). The traditional discounting measure concerns narrow preferences between immediate and delayed, certain and probabilistic, individual and shared, and effortless and effortful outcomes. These behavioral choices do not appear to be global (e.g., they depend on the commodity and on the sign, whether it is a gain or a loss) (Reynolds et al. 2006). The DI, on the other hand, measures broader aspects of behavior or subjective experience.

General Discussion

The aim of this research was to develop and investigate the psychometric properties of the Discounting Inventory (DI). We developed the DI from an initial pool of 436 items. Its final version includes four factors with 12 items per factor. Principal component analysis (PCA) with an Oblimin rotation extracted four factors, explaining 55.0% of the variance. These factors were closely associated with the four theoretical dimensions, which we have referred to as delay discounting, probabilistic discounting, social discounting, and effort discounting (Green and Myerson 2004; Ostaszewski et al. 1998; Rachlin and Raineri 1992; Sugiwaka and Okouchi 2004). Confirmatory factor analyses were performed to test the adequacy of the structure of this model. For the 48-item version, the results are very similar to those found in the PCA. The fit of the model is good when considering the χ2/df or the RMSEA measures. According to the χ2 statistic, however, the model would have to be rejected. This type of conflicting result is commonly observed in personality models (Bollen and Long 1993; Mulaik 2007). According to past studies, the sample size and the number of variables can affect the χ2 significance. Therefore, paying more attention to the χ2/df measure is recommended. According to this measure, the 48-item model has a reasonable fit. The analysis replicated the four-factor structure we postulated and confirmed the orthogonality of these factors. The correlational architecture of the DI corresponds well with what was observed using traditional measures of the discounting rate. In particular, the correlations among the four types of discounting were mostly weak, as postulated by previous studies of discounting (e.g., Green and Myerson 2004, 2010; Mitchell 2004). Furthermore, concurrent validity is supported by the significant correlations between the DI and traditional measures of discounting, which assess the same or similar constructs. Finally, the internal consistency and test-retest stability are satisfactory. Thus, the psychometric properties of the DI indicate that it meets satisfactory criteria of concurrent validity and responsiveness for use as a discounting measure.

Nevertheless, the evidence from the present research is inconsistent with the idea of a single trait of impulsivity that involves all four kinds of discounting, where a trait is defined as an enduring behavioral tendency that manifests itself in multiple, diverse situations. Results from the factor analyses suggest that the four types of discounting may reflect separate domains rather than a single domain of impulsivity. If impulsivity were a single trait, one would expect individuals who show steep discounting of one type to show steep discounting of the other types as well. However, in the current research the different types of discounting are only weakly to moderately intercorrelated. More research is needed, for example on test-retest stability over a longer period of time, to verify the status of the discounting rate as a trait that is stable over time. Thus, based on the present findings, we cannot confirm that the discounting rate can be treated as a personality trait. Instead, the findings argue for a state view of discounting rather than a trait view.

Limitations and Future Directions

One theoretical limitation of the DI should be considered. Specifically, it is related to the social discounting subscale. The scale is based on the original definition of the social discounting process, formulated by Rachlin and Raineri (1992), as a decrease in the subjective value of a reward as the number of people with whom it must be shared increases. There is also an alternative definition of the process in the literature, with social distance relative to the person with whom a reward is to be shared as the discounting factor (e.g., Jones and Rachlin 2006; Rachlin and Jones 2008). Applying this definition in research, however, requires participants to be able to imagine a certain number of people at increasing social distance from themselves. Thus, defined this way, the social discounting process seems applicable only to human subjects with good mental skills. Although the original definition has never been shown to be inadequate, the definition based on social distance seems to have received relatively more attention in the contemporary literature. The authors of the DI questionnaire made an arbitrary decision to follow the original definition because the objective, numerical character of the discounting factor, understood as an increasing number of people sharing a reward, seems closer to the other objective, behavioral discounting factors (i.e., delay, probability, and effort). The social discounting process defined this way remains universal and species-independent, because it does not require abstract thought, which is a solely human attribute. Nevertheless, when using the DI questionnaire, one should keep in mind that the process can also be understood in another way.

As already noted in the discussion of Study 2, little is known about the long-term test-retest reliability of the DI, whereas traditional discounting measures are known to produce stable measures of discounting across retest intervals ranging up to one year (e.g., Kirby 2009; Ohmura et al. 2006; Simpson and Vuchinich 2000). In the present research, the short retest interval raises the possibility that choices made in the first session influenced those made in the second session. Thus, a goal of future investigation is to test the reliability of the DI after an interval of, for example, 6 months. A 6-month retest interval would provide a more stringent test of the DI’s temporal stability.

In addition, it is important to note that conclusions about the validity of the DI scales are limited. In general, validation should address two needs: the need to obtain validity data in more than one group of subjects and the need to test several types of validity (e.g., analyses against external criteria including convergent and discriminant validity; Cronbach 1990). The first recommendation is quite clear: it must be demonstrated that the validity of the inventory is stable across groups. The second recommendation is that the psychometric properties be subjected to a comprehensive assessment that covers the various possible uses of the inventory (Murphy and Davidshofer 2005). Another important implication for future validation research is to expand the knowledge about the nomological network of the DI involving theoretical and observational terms (see Borsboom et al. 2004). Finally, if time is limited, using concise measures can eliminate item redundancy, save time and effort, and consequently reduce participant fatigue and boredom. Thus, to meet the need for a brief assessment method, future research should develop a shorter version of the DI with a reduced number of items. Despite its limitations, this study is the first to report the systematic development and psychometric properties of a comprehensive self-report discounting instrument that uses statements instead of pairs of hypothetical choices.