The Implicit Association Test (IAT; Greenwald, McGhee, & Schwartz, 1998) is a computerized two-choice discrimination task in which stimuli have to be categorized as belonging to object categories (e.g., flowers and insects) or attribute categories (e.g., good and bad) by pressing, as quickly and accurately as possible, one of two response keys. The IAT consists of seven blocks. Three practice blocks involve the categorization of stimuli that represent either the object categories or the attribute categories. Four critical blocks involve the simultaneous categorization of stimuli representing the four categories, with two response mappings. In one mapping, the categories flowers and good share a response key, and the categories insects and bad share the other. In the other mapping, the categories flowers and bad share a response key, and the categories insects and good share the other. The mapping that leads to faster and more accurate responses is called compatible mapping, whereas the other is called incompatible mapping. The difference in performance between the two kinds of mapping is known as the IAT effect.

A few models for the analysis of process components underlying the IAT effect are present in the literature. The quad model (Conrey, Sherman, Gawronski, Hugenberg, & Groom, 2005) is a multinomial model that disentangles components concerning the automatic activation of associations, the ability to discriminate stimuli, the ability to overcome automatically activated associations, and the influence of general guessing biases. Since the model accounts only for response accuracy, it uses a very small part of the information provided by an IAT. With regard to this limitation, the four parameters of the quad model account for only 24 % of the variance of the D scores (Greenwald, Nosek, & Banaji, 2003) computed on the same data (Conrey et al., 2005).

More recently, a diffusion model (DM) analysis has been proposed (Klauer, Voss, Schmitz, & Teige-Mocigemba, 2007), which disentangles components concerning the rate of information accumulation (drift), the speed–accuracy setting (threshold), and the nondecision components. The analysis assumes serial processing of the stimuli, and it simultaneously accounts for response accuracy and reaction time. Both drifts and thresholds vary across blocks of trials. The DM provides useful information in the analysis of IAT data. However, it also presents some shortcomings. At the estimation level, only IAT procedures including many more trials than usual can be analyzed, and different DMs have to be estimated for the critical blocks. Hence, parameter values estimated for different blocks cannot be compared in isolation from the other parameters to obtain an individual measure of automatic association. For instance, drift rates of different blocks cannot be compared without taking into account thresholds as well. At the theoretical level, the DM assumes that the individual serially processes each stimulus, although the empirical evidence accumulated using many different visual search tasks, including multiple-target searches (van der Heijden, 1975) and short words as stimuli, seems to support mostly parallel, rather than serial, processing of stimuli (Andrews, 1992; Evans, Horowitz, & Wolfe, 2011; Thornton & Gilden, 2007; Townsend, 1990).

As a result, there is a need for a formal model that simultaneously allows the separation of automatic and controlled processes, accounts for both response times (RTs) and errors, and provides researchers with a quantitative and detailed comparison of individual performance at the critical tasks. This study presents a model that overcomes current limitations. The model has been specifically derived to fit the characteristics of the IAT and the needs of the researchers using it. The discrimination–association model (DAM) decomposes the IAT effect into three process components: stimuli discrimination, automatic association, and termination criterion (i.e., task difficulty or individual cautiousness). Unlike the DM analysis of Klauer et al. (2007), in this model, the automatic association component depends on the stimulus categories, but it does not vary across blocks.

The aim of this study is to present the DAM and its validation. In the next section, the DAM and its mathematical specification are introduced. Next, the results of an empirical application of the model are provided. Finally, advantages of the analysis of IAT data derived from using the DAM, instead of other models, are explored.

Overview of the model

In this section, the development of a stochastic model of the accuracy and RTs of an individual respondent to an IAT is described. As such, compatible and incompatible blocks are defined within each participant and can, therefore, vary across participants.

We shall start by considering a collection O of object stimuli and a collection A of attribute stimuli. The task of an individual is to classify the objects in O into the two categories [a] and [b] (e.g., flowers and insects) and the elements in A into the two categories [+] and [−] (e.g., good and bad). The assumption behind the model is that every stimulus in S = O ∪ A potentially contains, albeit in a variable quantity, evidence for each of the four categories [a], [b], [+], [−] and that processing of every single stimulus in S occurs in parallel. In particular, evidence in favor of each of the four categories is accumulated on separate, parallel, and independent stochastic processes (called counters) that are engaged in a competition for the emission of the observable response.

According to this idea, the existence of four separate and independent counters, denoted as X a (t), X b (t), X +(t), and X −(t), is assumed. Once a participant is presented with a particular stimulus s, each of these four counters starts accruing selective information about a certain characteristic of it. More precisely, for j ∈ {a,b,+,−}, counter X j (t) accrues information about membership of s in category [j]. A specific assumption concerning the nature of these counters is that each of them behaves as a Poisson process. Technically, this implies that, in every single process, (1) interarrival times (i.e., time intervals between consecutive units of information) are independent and identically distributed and (2) their distribution is exponential with rate λ (in this respect, see, e.g., Townsend & Ashby, 1983).
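The Poisson assumption can be made concrete with a short simulation. The following Python sketch is illustrative only: the function name, the rate of 5 units per second, and the criterion of 10 units are assumptions chosen for the example, not values taken from the model. It draws exponential interarrival times and sums them until a single counter has collected the required number of units of information.

```python
import random

def time_to_criterion(rate, criterion, rng):
    """Time at which a Poisson counter with the given rate records its
    `criterion`-th unit of information, obtained as the sum of
    independent exponential interarrival times."""
    return sum(rng.expovariate(rate) for _ in range(criterion))

rng = random.Random(1)        # fixed seed so the sketch is reproducible
rate, criterion = 5.0, 10     # hypothetical values, not estimates
times = [time_to_criterion(rate, criterion, rng) for _ in range(20000)]
mean_time = sum(times) / len(times)

# The waiting time for the k-th unit is Erlang distributed with mean
# criterion / rate, so the simulated mean should be close to that value.
print(round(mean_time, 2), criterion / rate)
```

With many simulated trials, the average waiting time approaches the ratio between the criterion and the rate, which is the mean of the corresponding Erlang distribution.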

Stimulus discrimination and automatic association

A key property of the model is that the rate at which information is accumulated on each of the four processes depends only on the process and the actual category of the presented stimulus, and on nothing else. Under this assumption, the rate of every single process is not affected by the particular block (practice, compatible, or incompatible) in which the stimulus appears. Thus, in the most general formulation of the model, there will be a different rate for each pair that can be formed by taking one of the four processes X a (t), X b (t), X +(t), X -(t) and one of the four categories [a], [b], [+], [−], as is shown in Table 1.

Table 1 Accumulation rates of each of the four Poisson processes vary with stimulus category

According to this formulation, for i,j ∈ {a,b,+,−}, the parameter λ ij is the average amount of information that process X i (t) accumulates in the time unit, when a stimulus of category [j] is presented. For example, if the presented stimulus belongs to category [a], the rate of process X a (t) will be as large as λ aa , while that of process X +(t) will be λ +a .

The 4 × 4 matrix consisting of the 16 rates listed in Table 1 is naturally decomposed into four submatrices of four rates each. The 2 × 2 submatrix containing the four rates λ aa , λ ab , λ ba , λ bb (the upper left submatrix of the table) is involved in the discrimination between categories [a] and [b]. The better the discrimination, the smaller will be the two rates λ ab and λ ba . A perfect discrimination would indeed imply λ ab = λ ba = 0. Similarly, the lower right submatrix is involved in the discrimination between [+] and [−]. In fact, the most interesting part of Table 1 is represented by the lower left and upper right 2 × 2 submatrices, because they are involved in an association between object and attribute categories. For instance, with regard to the lower left subtable, the parameter λ +a is the rate at which information concerning membership in category [+] is accumulated when the presented stimulus belongs, indeed, to category [a]. To give another example, λ −b is the rate at which evidence about membership in category [−] is accrued when the presented stimulus actually belongs to category [b].

Overall, the particular set of values taken by the four parameters λ +a , λ −a , λ +b , λ −b can be regarded as an association pattern. In a practical application of the model, this pattern will typically change from one individual to another, emphasizing individual differences in both a quantitative and qualitative fashion. Obviously, the four rates λ a+, λ b+, λ a−, λ b− can also be regarded as an association pattern (for a comparison between the two submatrices, see the following paragraph). To give some examples, the case λ +a > 0, λ −a = 0 can be regarded as a perfect association between categories [a] and [+]. Similarly, the condition λ −b > 0, λ +b = 0 represents a perfect association between [b] and [−]. Obviously, many other possibilities might arise in an empirical setting.

It is clear that both the lower left and upper right submatrices are involved in an association between objects and attributes, although with different meanings. The lower left submatrix is related to what will be referred to as an object-driven association in the sequel. An association of this type arises when the presented stimulus is an object. The opposite situation occurs when the stimulus is an attribute, a condition in which the association is called attribute driven. This is precisely what happens in the upper right submatrix of Table 1. Obviously, it is a working hypothesis that the two association types (object driven and attribute driven) are different for the same individual. Acceptance or rejection of this specific assumption can occur only on an empirical basis and, in any case, is not critical, because it is always possible to introduce equality constraints of the form λ ij = λ ji for all the association rates.

To summarize, the rate parameters presented in Table 1 can be grouped into four separate categories: (1) object discrimination rates (upper left submatrix), (2) attribute discrimination rates (lower right submatrix), (3) object-driven association rates (lower left submatrix), and (4) attribute-driven association rates (upper right submatrix).
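The four groups of rates can be illustrated by storing the rate matrix in a small data structure and slicing it. The sketch below is only an illustration: the numerical values are hypothetical, the ASCII label "-" stands for the category [−], and the helper name `submatrix` is not part of the model.

```python
# rates[i][j]: rate of counter X_i when a stimulus of category [j] is shown.
# Rows are processes and columns are stimulus categories, as in Table 1.
# All numbers are hypothetical.
rates = {
    "a": {"a": 6.0, "b": 1.0, "+": 2.0, "-": 0.5},
    "b": {"a": 1.0, "b": 6.0, "+": 0.5, "-": 2.0},
    "+": {"a": 2.5, "b": 0.3, "+": 5.0, "-": 0.8},
    "-": {"a": 0.4, "b": 2.2, "+": 0.8, "-": 5.0},
}

def submatrix(processes, stimuli):
    """Extract the block of rates for the given processes (rows)
    and stimulus categories (columns)."""
    return [[rates[i][j] for j in stimuli] for i in processes]

objects, attributes = ["a", "b"], ["+", "-"]

object_discrimination = submatrix(objects, objects)            # upper left
attribute_discrimination = submatrix(attributes, attributes)   # lower right
object_driven_association = submatrix(attributes, objects)     # lower left
attribute_driven_association = submatrix(objects, attributes)  # upper right

print(object_driven_association)
```

In this layout, the object-driven association rates are those of the attribute counters when an object stimulus is shown, and the attribute-driven association rates are those of the object counters when an attribute stimulus is shown.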

Process superposition and the IAT effect

Not much has been stated so far about the termination of the accumulation process and how each process is involved in producing an observable response. Figure 1 is a pictorial representation of how the four processes are related to both the stimulus and observable response on a single trial of the compatible (left-hand diagram) and incompatible (right-hand diagram) blocks of an IAT.

Fig. 1 How processes are connected to the stimulus and the observable response in the compatible (left-hand diagram) and incompatible (right-hand diagram) blocks

The left-hand diagram refers to a trial in which categories [a] and [+] are mapped to the left key, while [b] and [−] are mapped to the right key. Arrows departing from the stimulus and reaching each of the four processes represent information accumulation, which depends on the rates of the four processes.

Concerning the relationship between each of the processes and each of the two response categories (left or right), it can be seen that both processes X a (t) and X +(t) produce the same observable response (namely, left), whereas the response is right for both X b (t) and X −(t). That is, the four processes always operate in opposite pairs, or stated another way, there is a race between pairs of processes. The pair that first accumulates the required amount of information produces the observable response. In the standard two-process Poisson race model, this amount of information is often called the termination criterion.

According to a well-known property of Poisson processes, the superposition X 1(t) + X 2(t) of two independent Poisson processes, having rates λ 1 and λ 2, is itself a Poisson process with a rate λ 1 + λ 2. Therefore, the superposition X a (t) + X +(t) is a Poisson process whose rate is λ ai + λ +i , where i ∈ {a,b,+,−} is the category of the presented stimulus. Similar considerations follow for the superposition X b (t) + X −(t), whose rate will be λ bi + λ −i .

It is a precise assumption of the proposed model that in the compatible blocks of the IAT, a race takes place between the two compound processes X a+(t) = X a (t) + X +(t) and X b−(t) = X b (t) + X −(t), which have the same termination criterion (Fig. 2). The process that first meets the criterion will produce the observable response. With the introduction of this assumption, the model can be recast as the standard two-process Poisson race model described in Townsend and Ashby (1983).

Fig. 2 A race between two compound Poisson processes. Process X a+(t) has a higher rate than process X b−(t). In this case, the race is won by process X a+(t), which accumulates the required information in the shortest time

The right-hand diagram of Fig. 1 refers to a trial of the incompatible block. Here, the categories [a] and [−] are mapped to the right key, while [b] and [+] are mapped to the left key. When compared with the trial of the compatible block, the only difference is that the labels [a] and [b] swapped positions on the screen. As is shown in the diagram, the corresponding arrows are inverted accordingly so that the race, in this block, is between X a−(t) = X a (t) + X −(t) and X b+(t) = X b (t) + X +(t).

This inversion plays a fundamental role in the model and represents the basic mechanism of an important component of the IAT effect. Let us suppose that the presented stimulus belongs to category [a]. Then, X a+(t) is the process that provides the correct response (or simply, it is the correct process) in the compatible block, whereas the correct process in the incompatible blocks is X a−(t). By assuming identical termination criteria in the two blocks, if the condition λ +a > λ −a holds true, the expected RT for the correct response will be longer in the incompatible blocks, when compared with the compatible blocks. This happens because the rate of the correct process in the incompatible blocks (λ aa + λ −a ) is less than that of the correct process in the compatible blocks (λ aa + λ +a ).
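The mechanism just described can be sketched numerically for a stimulus of category [a]. The code below is a minimal illustration under stated assumptions: the rates and the criterion are hypothetical, the function names are not part of the model, any nondecision time is ignored, and the formulas are the textbook results for a race between two Poisson counters with a common criterion (each unit of the merged process belongs to the correct counter with probability equal to its share of the total rate).

```python
from math import comb

def compound_rates(rates_a, correct, wrong):
    """Rates of the correct and wrong compound processes, obtained by
    summing the rates of the counters that share a response key."""
    return (sum(rates_a[i] for i in correct),
            sum(rates_a[i] for i in wrong))

def race(r_correct, r_wrong, k):
    """Probability that the correct process reaches criterion k first,
    and the expected finishing time of correct responses."""
    total = r_correct + r_wrong
    p, q = r_correct / total, r_wrong / total
    # The correct process wins after k + j merged units when the wrong
    # process has collected j < k units by then (negative binomial terms).
    terms = [comb(k - 1 + j, j) * p**k * q**j for j in range(k)]
    p_correct = sum(terms)
    mean_units = sum((k + j) * t for j, t in enumerate(terms)) / p_correct
    # Merged units arrive at rate `total`, independently of their labels.
    return p_correct, mean_units / total

# Hypothetical rates of the four counters for a stimulus of category [a];
# "-" stands for the category [−].
rates_a = {"a": 6.0, "b": 1.0, "+": 2.5, "-": 0.4}
k = 10  # hypothetical termination criterion, identical in both blocks

compatible = race(*compound_rates(rates_a, ["a", "+"], ["b", "-"]), k)
incompatible = race(*compound_rates(rates_a, ["a", "-"], ["b", "+"]), k)
print(compatible, incompatible)
```

Because the association rate for [+] exceeds that for [−] in this example, the compatible mapping yields both a higher probability of a correct response and a shorter expected finishing time for correct responses.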

Additional components of the IAT effect: the termination criteria

In general, there is no reason for assuming identical termination criteria in the two critical blocks, and thus, they can be left free to vary across blocks (practice, compatible, or incompatible). Every single termination criterion is denoted by K b , where b specifies the blocks (P = practice, C = compatible, I = incompatible). There are no additional restrictions on this set of three termination criteria. However, some specific hypotheses on how they are related to one another can be formulated. To begin with, let us suppose that a stimulus of category [a] appears on the screen. If the condition λ +a = λ −a holds true, the two correct processes X a+(t) (compatible blocks) and X a−(t) (incompatible blocks) will have identical rates. Nonetheless, an IAT effect will still be observed whenever the condition K C < K I is satisfied. This will happen because of the larger amount of information required by the correct process in the incompatible blocks, which, under the equality condition stated above, would have a longer expected RT. The positive difference K I − K C is thus an additional component of the IAT effect. In general, when the practice blocks are considered as well, it seems reasonable to expect the following inequalities across the blocks:

$$ {K_P}<{K_C}<{K_I}. $$

This inequality allows one to interpret the termination criteria as either task difficulty or individual cautiousness.
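As a rough illustration of how the criteria contribute to the IAT effect, consider only the correct compound process for a stimulus of category [a] and ignore the competing process (a simplifying assumption that is reasonable only when errors are rare). The symbols T C and T I, introduced here for illustration, denote the time needed by the correct process to reach the criterion in the compatible and incompatible blocks. Since the waiting time for the K-th unit of a Poisson process with rate λ has mean K/λ, the expected difference is approximately

$$ \mathrm{E}\left[ {T_I} \right]-\mathrm{E}\left[ {T_C} \right]\approx \frac{K_I}{\lambda_{aa}+\lambda_{-a}}-\frac{K_C}{\lambda_{aa}+\lambda_{+a}}, $$

which reduces to \( \left( {K_I}-{K_C} \right)/\left( {\lambda_{aa}}+{\lambda_{+a}} \right) \) when \( \lambda_{+a}=\lambda_{-a} \). Under this approximation, a larger criterion in the incompatible blocks lengthens the expected RT even when the association rates are equal.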

Separation between discrimination and association and model identifiability

Perhaps the most appealing feature of the DAM is that it is aimed at separating association from discrimination in the critical blocks. This separation is meaningful if the discrimination and association rate parameters can be uniquely determined. This is an aspect that cannot be taken for granted. In this respect, let us suppose that a stimulus of category [a] is presented to a participant. Then the rates of the correct and wrong responses in the compatible blocks are λ aa + λ +a and λ ba + λ −a , respectively, whereas in the incompatible blocks, we have λ aa + λ −a and λ ba + λ +a . With no other restrictions on this set of four parameters, let us define an alternative parameter set \( \left\{ {\lambda_{aa}^{\prime },\lambda_{ba}^{\prime },\lambda_{+a}^{\prime },\lambda_{-a}^{\prime }} \right\} \), such that \( \lambda_{aa}^{\prime }={\lambda_{aa }}+c \), \( \lambda_{ba}^{\prime }={\lambda_{ba }}+c \), \( \lambda_{+a}^{\prime }={\lambda_{+a }}-c \), and \( \lambda_{-a}^{\prime }={\lambda_{-a }}-c \) for some arbitrary nonnegative constant c ≤ min{λ −a , λ +a }. Then it can be easily seen that the following four equalities hold true:

$$ \begin{array}{ll} \lambda_{aa}^{\prime }+\lambda_{+a}^{\prime }={\lambda_{aa }}+{\lambda_{+a }}, & \lambda_{ba}^{\prime }+\lambda_{-a}^{\prime }={\lambda_{ba }}+{\lambda_{-a }}, \\ \lambda_{aa}^{\prime }+\lambda_{-a}^{\prime }={\lambda_{aa }}+{\lambda_{-a }}, & \lambda_{ba}^{\prime }+\lambda_{+a}^{\prime }={\lambda_{ba }}+{\lambda_{+a }}, \\ \end{array} $$

signifying that this new set of parameters will make exactly the same predictions as the original one. In other words, there exists a linear dependence among the four parameters at hand, and this is a clear sign of the fact that, with no further restrictions, the model would not be identifiable. To have a unique separation between discrimination and association parameters, we first observe that identification is restored whenever at least one of the four parameters at hand is no longer free to vary within the critical blocks. Bearing this in mind, we note that quite reasonable assumptions can indeed be introduced that can solve the separation problem. They establish a special link between the critical and practice blocks. First, we observe that the average amount of information that a stimulus of category [a] provides in the time unit concerning “being a stimulus of category [a]” should be essentially the same in all blocks, including the practice blocks. This assumption implies that the correct discrimination rate λ aa will be equal in both practice and critical blocks. Following an analogous reasoning, a similar assumption can be formulated with respect to the wrong discrimination rate λ ba . These are key assumptions of the model; it is clear, in fact, that discrimination concerns stimuli and not blocks. Evidently, the plausibility of these assumptions can be tested empirically. Moreover, either of the two assumptions can solve the identification problem because, on the basis of the type of equality constraints it introduces, the (either correct or wrong) discrimination parameters of all blocks are uniquely determined in the practice block.
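The non-identifiability argument can be checked numerically. In the sketch below, the parameter values and the constant c are hypothetical, and the function name is introduced only for illustration. It shows that shifting the two discrimination rates up by c and the two association rates down by c leaves all four compound rates unchanged.

```python
def predicted_rates(lam_aa, lam_ba, lam_plus_a, lam_minus_a):
    """Compound rates (correct, wrong) for a stimulus of category [a]
    in the two kinds of critical blocks."""
    return {
        "compatible": (lam_aa + lam_plus_a, lam_ba + lam_minus_a),
        "incompatible": (lam_aa + lam_minus_a, lam_ba + lam_plus_a),
    }

# Hypothetical values of lambda_aa, lambda_ba, lambda_+a, lambda_-a.
original = (6.0, 1.0, 2.5, 0.5)
c = 0.25  # any constant with 0 <= c <= min(lambda_-a, lambda_+a)

# Discrimination rates go up by c, association rates go down by c.
shifted = (original[0] + c, original[1] + c,
           original[2] - c, original[3] - c)

same_predictions = predicted_rates(*original) == predicted_rates(*shifted)
print(same_predictions)
```

Once one of the discrimination rates is fixed (for instance, by equating it to its value in the practice blocks), the constant c is forced to zero and the remaining rates are uniquely determined by the compound rates.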

Each of the two assumptions introduced earlier might be criticized because, indeed, the type of discrimination task involved in the critical blocks is different from that involved in the practice blocks. However, it is important to realize that assuming equal discrimination rates across practice and critical blocks by no means implies that the discrimination process is the same in all such blocks. In fact, possible differences between the two types of processes can still be captured by the different termination criteria K P , K C , K I , and if the inequality K P < K C < K I proves to be empirically valid, the difference K C − K P would provide a measure of the relative difficulty of discrimination in critical blocks, when compared with discrimination in the practice blocks.

An empirical application

Method

Participants

A total of 199 psychology students at the University of Padua participated in the study on a voluntary basis. Their mean age was 23.66 years (SD = 2.85), and 122 were female.

Materials and procedure

The participants were tested individually in a laboratory. First, they were presented with a Coca–Pepsi IAT, according to the structure represented in Table 2.

Table 2 Structure of the Coca–Pepsi IAT

The Coca–Pepsi IAT used the category labels Coca Cola, Pepsi Cola, good, and bad. Ten color brand pictures were used to represent the object categories Coca Cola and Pepsi Cola, and 16 words were used to represent the attribute categories good (glory, good, happiness, joy, laughing, love, peace, pleasure) and bad (annoying, bad, evil, failure, hate, horrible, pain, terrible). The stimuli were presented in the center of the computer screen in an alternating fashion, and the participants were asked to categorize them by pressing, as quickly and accurately as possible, the response key “Q” or “P,” respectively.

Then participants were presented with a demographic questionnaire and with two dichotomous questions that asked them to indicate the tastiest cola and the most attractive brand, between Coca Cola and Pepsi Cola. At the end, the experimenter invited the participants to choose between a free can of Coca Cola or Pepsi Cola. The two cans were placed on a table. The experimenter recorded the choice of the participants after they had left the laboratory.

Analysis procedure

Four vectors were obtained for each participant. The vectors specify the kind of block (practice, Pepsi–bad/Coca–good, Coca–bad/Pepsi–good), the category to which a stimulus belongs (C = Coca Cola, P = Pepsi Cola, G = good, B = bad), the accuracy of a response (1 = correct, 0 = incorrect), and the latency of the response in milliseconds. The length of the vectors was equal to the number of trials (n = 184). The four vectors represent the input of a MATLAB function for computing maximum likelihood estimates of the parameters of the DAM for every single participant. Software for estimating and testing the DAM is available upon request from the first author. Maximum likelihood estimation is sensitive to outliers and contaminants in the distributions of RTs. For this reason, following Klauer et al. (2007), trials whose latencies were outliers in the RT distribution were discarded according to Tukey’s criterion (see, e.g., Hoaglin, Mosteller, & Tukey, 1983). This led to the deletion of 6.92 % of all the responses.
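An outlier screening of this kind can be sketched as follows. The sketch assumes the common form of Tukey's criterion, in which latencies lying more than 1.5 interquartile ranges below the first quartile or above the third quartile are discarded; the multiplier, the quartile method, and the set of trials over which the fences are computed are assumptions of this example and are not specified in the text.

```python
import statistics

def tukey_filter(latencies, k=1.5):
    """Keep the latencies that lie within Tukey's fences
    [Q1 - k * IQR, Q3 + k * IQR]; k = 1.5 is an assumed multiplier."""
    q1, _, q3 = statistics.quantiles(latencies, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [x for x in latencies if low <= x <= high]

# Hypothetical latencies in milliseconds with one slow response.
latencies = [520, 540, 560, 580, 600, 620, 640, 3000]
kept = tukey_filter(latencies)
print(kept)
```

In this example, only the 3000-ms response falls outside the fences and is removed.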

The method for minimizing the negative log-likelihood function of the model was the BFGS optimization procedure (Broyden, 1970; Fletcher, 1970; Goldfarb, 1970; Shanno, 1970) with finite difference gradients. Similar to many other optimization methods, the BFGS is not guaranteed to converge to the global minimum of the negative log-likelihood function. The problem of local minima was tackled by applying the optimization procedure 100 times to the same observed data, each time starting from a different point of the parameter space. Among all different solutions, the one with the largest likelihood was retained. This reestimation procedure was applied to the data set of every single participant.
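The multistart strategy can be sketched on a toy problem. The code below is not the estimation routine used in the study: the objective is a hypothetical one-dimensional function with two local minima that stands in for a negative log-likelihood, and plain gradient descent with finite-difference gradients is used as a simple stand-in for BFGS. Only the restart logic mirrors the procedure described above.

```python
import random

def objective(x):
    """Toy stand-in for a negative log-likelihood with two local minima."""
    return (x * x - 1.0) ** 2 + 0.3 * x

def local_minimize(f, x0, step=0.01, iters=2000, h=1e-6):
    """Gradient descent with a central finite-difference gradient
    (a simple stand-in for BFGS, not BFGS itself)."""
    x = x0
    for _ in range(iters):
        grad = (f(x + h) - f(x - h)) / (2 * h)
        x -= step * grad
    return x

rng = random.Random(0)  # fixed seed so the sketch is reproducible
starts = [rng.uniform(-2.0, 2.0) for _ in range(100)]  # 100 starting points
solutions = [local_minimize(objective, x0) for x0 in starts]

# Retain the solution with the smallest objective value, that is,
# the largest likelihood in the estimation setting.
best_x = min(solutions, key=objective)
print(round(best_x, 3), round(objective(best_x), 3))
```

Runs started in the basin of the shallower minimum stop there, which is why retaining the best of many runs is needed to reach the deeper one.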

Another important issue that had to be solved in estimating parameters was the empirical identifiability of the model. Even when a model is identifiable in theory, the data at hand might not be adequately informative for computing reliable estimates of one or more parameters in the model. As far as IAT data are concerned, this is especially true for the wrong discrimination parameters of the DAM. In fact, the discrimination task is usually easy, and the data of some participants might not contain any wrong responses in some blocks of the IAT (typically, the practice blocks). An observed frequency of such responses, greater than zero, is a necessary condition for estimating the wrong discrimination rates.

Irrespective of the missing information in the data, a formal test of the empirical identifiability of the model involves the computation of the Hessian matrix of the model’s negative log-likelihood function at the point at which the latter attains its minimum (Gradshteyn & Ryzhik, 2000). If this matrix happens to be positive definite, the model is empirically identifiable. If not, one or more parameters are not identifiable. In this case, the optimization procedure still provides some estimate for those parameters, but the value obtained is meaningless and should thus be discarded. We note in passing that positive definiteness of the Hessian matrix also ensures that we are really on a minimum and not, for instance, on a saddle point or in a flat valley of the log-likelihood surface, far away from the minimum.

Empirical identification was tested for every single participant. In those cases where the Hessian matrix was not positive definite, the nonidentifiable parameters were not considered in the subsequent analyses. The procedure for detecting nonidentifiable parameters was as follows. The rank r of the Hessian matrix was computed. If r was less than the number m of the free parameters of the model, this was a symptom of a linear dependence among some of the rows (or, equivalently, columns) of the Hessian, indicating possible linear dependencies among the model’s parameters. It is possible to detect parameters that are involved in linear dependencies by computing, for example, the reduced row echelon form of the Hessian matrix (see, e.g., Stefanutti, Heller, Anselmi, & Robusto, 2012, for a similar approach in a different context).
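The two checks described above (positive definiteness and rank of the Hessian) can be sketched in a few lines. This is an illustrative implementation, not the code used in the study: positive definiteness is tested by attempting a Cholesky factorization, the rank is obtained by Gaussian elimination with partial pivoting, and the tolerance and the two example matrices are assumptions.

```python
def is_positive_definite(h, tol=1e-10):
    """Attempt a Cholesky factorization of the symmetric matrix h;
    it succeeds only if h is positive definite."""
    n = len(h)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = h[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if s <= tol:
                    return False
                low[i][j] = s ** 0.5
            else:
                low[i][j] = s / low[j][j]
    return True

def matrix_rank(h, tol=1e-10):
    """Rank of h by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in h]
    n_rows, n_cols = len(m), len(m[0])
    rank = 0
    for col in range(n_cols):
        if rank == n_rows:
            break
        pivot = max(range(rank, n_rows), key=lambda i: abs(m[i][col]))
        if abs(m[pivot][col]) <= tol:
            continue  # no usable pivot in this column
        m[rank], m[pivot] = m[pivot], m[rank]
        for i in range(rank + 1, n_rows):
            factor = m[i][col] / m[rank][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[rank])]
        rank += 1
    return rank

identifiable = [[2.0, 0.0], [0.0, 3.0]]  # hypothetical Hessian
dependent = [[1.0, 1.0], [1.0, 1.0]]     # two linearly dependent parameters

print(is_positive_definite(identifiable), matrix_rank(identifiable))
print(is_positive_definite(dependent), matrix_rank(dependent))
```

In the second matrix, the rank is smaller than the number of parameters, which signals a linear dependence of the kind discussed above.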

The model’s goodness of fit was tested by applying the method described in the Appendix.

Friedman’s test and Wilcoxon’s test for dependent samples were used for comparing the difficulty of the three kinds of blocks across the participants. We expected the practice blocks to be the easiest. In addition, given the implicit preference for Coca Cola relative to Pepsi Cola commonly observed in the literature (see, e.g., Maison, Greenwald, & Bruin, 2004; Sriram & Greenwald, 2009), we expected the Pepsi–bad/Coca–good blocks to be easier than the Coca–bad/Pepsi–good blocks. Friedman’s test was used for comparing the rates concerning correct and incorrect discrimination of each stimulus category. We expected the former to be greater than the latter.

Separate contrast measures, DISC, ASSO, and DIFF, were computed for the discrimination rates, association rates, and termination criteria, respectively. A DISC was computed for each stimulus category by taking the difference between the rate concerning the correct discrimination of the category and that concerning the incorrect discrimination (e.g., DISC C = λ CC − λ PC ). Positive values of the DISCs indicate that, on average, the stimuli provide more information, in the time unit, about their own category than about the contrasted one. An ASSO was computed for each stimulus category by contrasting its association rates. Positive values of ASSO C indicate that, on average, the Coca Cola stimuli provide more information, in the time unit, about the category good than about bad (ASSO C = λ GC − λ BC ). In other words, the Coca Cola–good association is stronger than the Coca Cola–bad association. Similarly, positive values of ASSO G indicate that the good–Coca Cola association is stronger than the good–Pepsi Cola association (ASSO G = λ CG − λ PG ). Interpretation of ASSO P and ASSO B is opposite to that of ASSO C and ASSO G (ASSO P = λ BP − λ GP ; ASSO B = λ PB − λ CB ). A DIFF was computed by contrasting the termination criteria of the critical blocks (DIFF = K Coca-Bad/Pepsi-Good − K Pepsi-Bad/Coca-Good ). A positive value of DIFF indicates that the Pepsi–bad/Coca–good blocks are easier than the Coca–bad/Pepsi–good blocks. A regression analysis was run, in which the nine contrast measures were the independent variables and the D score was the dependent variable. D is the most common measure of the IAT effect size. It involves dividing the difference in average response latency between the compatible and incompatible blocks by the standard deviation of latencies for all the critical blocks (for details, see D2 in the work by Greenwald et al., 2003). D expresses the implicit preference for one cola relative to the other. We expected it to be predicted by DIFF and the contrast measures concerning the association rates (the ASSOs). Since D does not express the difference between correct and incorrect discrimination, it should not be predicted by the DISC contrast measures.
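The nine contrast measures and a simplified D score can be sketched as follows. All parameter values, latencies, dictionary keys, and function names are hypothetical. The DISC formulas for the categories other than Coca Cola are written by analogy with the one given in the text, and the D computation follows only the verbal description above, so it omits the additional steps of the full D2 algorithm of Greenwald et al. (2003).

```python
import statistics

def contrasts(lam, k):
    """Contrast measures from the rates lam[(process, stimulus)] and
    the termination criteria k[block] of the two critical blocks."""
    return {
        "DISC_C": lam["C", "C"] - lam["P", "C"],
        "DISC_P": lam["P", "P"] - lam["C", "P"],
        "DISC_G": lam["G", "G"] - lam["B", "G"],
        "DISC_B": lam["B", "B"] - lam["G", "B"],
        "ASSO_C": lam["G", "C"] - lam["B", "C"],
        "ASSO_P": lam["B", "P"] - lam["G", "P"],
        "ASSO_G": lam["C", "G"] - lam["P", "G"],
        "ASSO_B": lam["P", "B"] - lam["C", "B"],
        "DIFF": k["CocaBad/PepsiGood"] - k["PepsiBad/CocaGood"],
    }

def simple_d(compatible_rts, incompatible_rts):
    """Simplified D: difference in mean latency divided by the standard
    deviation of all critical-block latencies (not the full D2 algorithm)."""
    pooled_sd = statistics.stdev(compatible_rts + incompatible_rts)
    return (statistics.mean(incompatible_rts)
            - statistics.mean(compatible_rts)) / pooled_sd

# Hypothetical parameter estimates for one participant.
lam = {
    ("C", "C"): 7.5, ("P", "C"): 3.0, ("G", "C"): 2.0, ("B", "C"): 1.5,
    ("C", "P"): 3.25, ("P", "P"): 7.0, ("G", "P"): 1.0, ("B", "P"): 1.5,
    ("C", "G"): 2.0, ("P", "G"): 1.0, ("G", "G"): 5.75, ("B", "G"): 2.0,
    ("C", "B"): 1.0, ("P", "B"): 1.5, ("G", "B"): 1.75, ("B", "B"): 5.5,
}
k = {"PepsiBad/CocaGood": 10, "CocaBad/PepsiGood": 12}

measures = contrasts(lam, k)
d = simple_d([500, 600], [700, 800])  # hypothetical latencies (ms)
print(measures, round(d, 2))
```

Here the Pepsi–bad/Coca–good blocks play the role of the compatible blocks, so that positive values of both DIFF and D point to an implicit preference for Coca Cola.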

The nine contrast measures were also the independent variables of three logistic regressions, in which taste preference, brand attractiveness, and choice of the cola were alternatively the dependent variable (0 denotes Pepsi, 1 denotes Coca Cola). We expected the three variables to be predicted by DIFF and the contrast measures ASSOs, but not by the DISCs. Three logistic regressions were also run in which the D score was the independent variable.

Results

The analysis presented in this section is restricted to the 161 out of 199 participants whose model was accepted (goodness-of-fit test p-value > .05), and 2.37 % of the λ parameters were removed from the analysis because they happened to be nonidentifiable. All of them concerned only incorrect discrimination (i.e., λ PC , λ CP , λ BG, and λ GB ). In the IAT, it is not unusual to observe participants who make only a few errors. When this happens, there might be insufficient information for stable estimates of the λ parameters concerning incorrect discrimination. Out of the 161 participants, 38 presented at least one nonidentifiable parameter.

Table 3 provides the summary statistics of parameter estimates. The first subscript of the λ parameters refers to the process, whereas the second refers to the stimulus category. The letters C, P, G, and B denote Coca Cola, Pepsi Cola, good, and bad, respectively. The subscript of the K parameters denotes the kind of block (practice, Pepsi–bad/Coca–good, Coca–bad/Pepsi–good). Friedman’s test for dependent samples showed that the blocks were of different difficulty (mean rank = 1.00, 2.37, and 2.63 for practice, Pepsi–bad/Coca–good, and Coca–bad/Pepsi–good blocks, respectively), χ 2(2) = 246.72, p < .001. Wilcoxon’s test highlighted that the Pepsi–bad/Coca–good blocks were more difficult than the practice blocks, Z = −11.01, p < .001, and less difficult than the Coca–bad/Pepsi–good blocks, Z = −3.77, p < .001. These results confirm our expectations that the practice blocks were the easiest and that the Pepsi–bad/Coca–good blocks were easier than the Coca–bad/Pepsi–good blocks.

Table 3 Summary statistics of parameter estimates

The D score was computed for each participant. A positive D means that the stimuli are categorized faster in the Pepsi–bad/Coca–good blocks than in the Coca–bad/Pepsi–good blocks, indicating an implicit preference for Coca Cola relative to Pepsi Cola. Such a preference, although very small, was observed on the overall sample (M = 0.07, SE = 0.03), t(160) = 2.35, p < .05.

For each stimulus category, Wilcoxon’s test showed that the rate concerning the correct discrimination was greater than that concerning the incorrect discrimination (λ CC = 7.49, λ PC = 3.00, Z = 10.37, p < .001; λ PP = 7.21, λ CP = 3.22, Z = 10.55, p < .001; λ GG = 5.76, λ BG = 1.99, Z = 10.27, p < .001; λ BB = 5.48, λ GB = 1.84, Z = 10.70, p < .001; the values are the mean ranks across participants). This result confirms our expectation that the correct discrimination is faster than the incorrect one. No significant differences were found among the mean ranks of the association rates across participants.

Let us now consider the results of the regression analyses. The analyses were performed on 123 participants for whom all parameters were identifiable. It is worth recalling that a positive value of DIFF indicates that the Pepsi–bad/Coca–good blocks are easier than the Coca–bad/Pepsi–good blocks. Positive values of the DISCs indicate that, on average, the stimuli provide more information, in the time unit, about their own category than about the contrasted one. Positive values of ASSO C indicate that the Coca Cola–good association is stronger than the Coca Cola–bad association, and positive values of ASSO G indicate that the good–Coca Cola association is stronger than the good–Pepsi Cola association. Interpretation of ASSO P and ASSO B is opposite to that of ASSO C and ASSO G. Confirming our expectation, the D score was predicted by DIFF and by all four contrast measures concerning the association rates, ASSO C, ASSO P, ASSO G, and ASSO B (Table 4). These five measures together accounted for 65 % of the variance of D (R² = .66; adjusted R² = .65), and none of the measures showed multicollinearity (tolerance ≥ 0.40).

Table 4 Regression of D score on the nine contrast measures

The choice of the cola was predicted by DIFF and ASSO P, and the taste preference was predicted by DIFF (Table 5). None of the contrast measures predicted the brand attractiveness. These results confirm our expectations only partially, indicating that DIFF is an important predictor of the choice of the cola and taste preference. All three variables were predicted by the D score (β = 2.26, p < .001 for choice of the cola; β = 2.37, p < .001 for taste preference; β = 1.87, p < .01 for brand attractiveness).

Table 5 Regression of cola choice, taste preference, and brand attractiveness on the nine contrast measures

Case studies

Figure 3 provides four fictitious cases of how association rates might highlight automatic associations that differ in nature and meaning. An “object-driven association” indicating an implicit preference for Coca Cola relative to Pepsi Cola is observed when λ GC > λ BC and λ BP > λ GP (Fig. 3a). This means that the Coca–good association is stronger than the Coca–bad association and the Pepsi–bad association is stronger than the Pepsi–good association. An “attribute-driven association” indicating an implicit preference for the same cola is observed when λ CG > λ PG and λ PB > λ CB (Fig. 3b). An “object-driven association” and an “attribute-driven association” indicating an implicit preference for Pepsi Cola relative to Coca Cola are observed when λ GP > λ BP and λ BC > λ GC (Fig. 3c) and when λ PG > λ CG and λ CB > λ PB (Fig. 3d), respectively.

Fig. 3 Fictitious cases of automatic associations. Squares represent stimuli, circles represent processes, and arrows represent association rates. C = Coca Cola, P = Pepsi Cola, G = good, B = bad. Unbroken arrows represent the strongest association rates
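The four fictitious patterns can be written as simple inequality checks on the association rates. The following sketch is only an illustration: the function name, the dictionary layout (two-letter keys such as "GC" standing for λ GC), and the numerical values are hypothetical, and raw inequalities ignore the uncertainty of the estimates.

```python
def classify_associations(rates):
    """Return the (kind, preferred cola) patterns satisfied by the rates.

    `rates` maps two-letter keys to association rates, e.g. "GC" -> lambda_GC.
    """
    patterns = []
    if rates["GC"] > rates["BC"] and rates["BP"] > rates["GP"]:
        patterns.append(("object-driven", "Coca Cola"))
    if rates["CG"] > rates["PG"] and rates["PB"] > rates["CB"]:
        patterns.append(("attribute-driven", "Coca Cola"))
    if rates["GP"] > rates["BP"] and rates["BC"] > rates["GC"]:
        patterns.append(("object-driven", "Pepsi Cola"))
    if rates["PG"] > rates["CG"] and rates["CB"] > rates["PB"]:
        patterns.append(("attribute-driven", "Pepsi Cola"))
    return patterns


# Hypothetical rates resembling the first fictitious case (object-driven, Coca Cola).
example = {
    "GC": 2.0, "BC": 0.5, "BP": 1.8, "GP": 0.4,  # rates of the object stimuli
    "CG": 1.0, "PG": 1.0, "PB": 1.0, "CB": 1.0,  # rates of the attribute stimuli
}
print(classify_associations(example))
```

With these values only the first condition holds, so the function returns a single object-driven pattern favoring Coca Cola.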

Figure 4 depicts the standardized association rates (obtained by dividing each association rate by its standard error) of 7 participants in our study. They are exemplary cases of the profiles of automatic association that have been observed in the present application. Unbroken arrows represent significant rates. The D scores of these participants are also provided. It is worth recalling that a positive D indicates an implicit preference for Coca Cola relative to Pepsi Cola, whereas a negative D indicates the opposite.

Fig. 4 Standardized association rates and D scores of 7 participants in the study. Squares represent stimuli, circles represent processes, and arrows represent association rates. C = Coca Cola, P = Pepsi Cola, G = good, B = bad. Unbroken arrows represent significant association rates (p < .05; Bonferroni correction)
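A minimal sketch of how standardized rates and a Bonferroni-corrected significance flag might be obtained is given below. It assumes a two-sided test based on the normal approximation and a number of comparisons equal to the number of rates passed to the function; neither assumption is stated in the text, and the function name, estimates, and standard errors are hypothetical.

```python
from statistics import NormalDist


def standardized_rates(estimates, std_errors, alpha=0.05):
    """Map each rate name to (z, significant) with a Bonferroni-corrected alpha."""
    corrected_alpha = alpha / len(estimates)  # Bonferroni correction
    results = {}
    for name, estimate in estimates.items():
        z = estimate / std_errors[name]  # standardized association rate
        p = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided p value (assumption)
        results[name] = (z, p < corrected_alpha)
    return results


# Hypothetical estimates and standard errors for one participant.
estimates = {"GC": 2.0, "BC": 0.5, "GP": 0.88, "BP": 1.6}
std_errors = {"GC": 0.4, "BC": 0.4, "GP": 0.4, "BP": 0.4}
flags = standardized_rates(estimates, std_errors)
print(flags["GC"][1], flags["GP"][1])
```

In this example the rate labeled "GP" has z = 2.2, which would pass an uncorrected .05 threshold but not the corrected one, showing how the correction reduces the number of arrows drawn as significant.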

An object-driven and an attribute-driven association indicating a preference for Coca Cola relative to Pepsi Cola are observed in participants a and b, respectively (Fig. 4a, b). Exemplars of object-driven and attribute-driven associations indicating a preference for the other cola could not be observed in the present sample. An association is observed in participant c, which can be regarded as a partial attribute-driven association because only the bad stimuli contribute to the measure (Fig. 4c). Interestingly, the D scores of participants b and c are similar in size but different in meaning. Participant c associates only the bad stimuli with Pepsi Cola, whereas participant b also associates the good stimuli with Coca Cola. The implicit preference for Coca Cola relative to Pepsi Cola observed in participant c expresses only the attribution of negative features to Pepsi Cola, whereas that observed in participant b also exhibits the attribution of positive features to Coca Cola. Other exemplars in which the D scores are similar in size but different in meaning could be observed by comparing participants d and e (Fig. 4d, e) and participants f and g (Fig. 4f, g). The implicit preference for Coca Cola relative to Pepsi Cola observed in participant d results from associating the Coca Cola stimuli with good, whereas that observed in participant e results from associating the Pepsi Cola stimuli with bad. The implicit preference for Pepsi Cola relative to Coca Cola observed in participant f results from associating the Coca Cola stimuli with bad, whereas that observed in participant g results from associating the Pepsi Cola stimuli with good.

Discussion

A formal model has been presented that decomposes the IAT effect into three process components: stimuli discrimination, automatic association, and termination criterion. The model has been applied to the responses provided to a Coca–Pepsi IAT.

Stimuli discrimination provides information about the functioning of the stimuli. The discrimination rates enable the researchers to explore whether the stimuli that have been chosen to represent a category can be easily recognized and correctly categorized. In our application, the pictures representing the categories Coca Cola and Pepsi Cola and the words representing the categories good and bad were highly recognizable.

Automatic association provides information about the association between objects and attributes. The association rates enable the researchers to distinguish between implicit measures that have the same direction but different meanings. In our study, we have been able to distinguish participants whose implicit preference for a certain cola resulted from a positive evaluation of that cola from those whose implicit preference resulted from a negative evaluation of the contrasting cola. This has strong potential in some research fields. In the investigation of implicit race attitudes, for example, researchers would be allowed to distinguish individuals with an actual implicit prejudice (e.g., Black–bad association in a white participant) from those who merely hold an implicit preference for their own group (e.g., White–good association in a white participant). Since these parameters express the average amount of information that the object (attribute) stimuli provide about the attribute (object) categories in the time unit, they have been denoted as association rates. However, it is worth noting that, depending on the application context, they might reflect effects that are not based on associations, but on a recoding of the object and attribute categories on the basis of shared features (e.g., salience, valence, shape, color; Greenwald, Nosek, Banaji, & Klauer, 2005; Rothermund & Wentura, 2004).

Termination criteria represent the amount of information that is needed before a response is given. They can be interpreted as either task difficulty or individual cautiousness.
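The role of a termination criterion can be illustrated with a generic two-counter Poisson race, the family of models from which the DAM is derived. The sketch below is not the DAM itself, which distinguishes discrimination and association processes: it assumes only two independent counters (one for the correct and one for the incorrect response) and uses the fact that the time a Poisson counter with rate r needs to reach K counts follows a gamma distribution with shape K and scale 1/r. The function name and the parameter values are hypothetical.

```python
import random


def simulate_race(rate_correct, rate_incorrect, criterion, n_trials=5000, seed=1):
    """Simulate a two-counter Poisson race; return (accuracy, mean finishing time)."""
    rng = random.Random(seed)
    n_correct = 0
    total_time = 0.0
    for _ in range(n_trials):
        # Time for each counter to accumulate `criterion` counts.
        t_correct = rng.gammavariate(criterion, 1.0 / rate_correct)
        t_incorrect = rng.gammavariate(criterion, 1.0 / rate_incorrect)
        # The counter that reaches the criterion first determines the response.
        n_correct += t_correct < t_incorrect
        total_time += min(t_correct, t_incorrect)
    return n_correct / n_trials, total_time / n_trials


# Hypothetical rates; the same stimuli under a low and a high termination criterion.
acc_low, time_low = simulate_race(6.0, 2.0, criterion=2)
acc_high, time_high = simulate_race(6.0, 2.0, criterion=6)
print(acc_low, time_low, acc_high, time_high)
```

With the rates held fixed, raising the criterion makes the simulated responses slower and more accurate, which is consistent with reading the criterion as either task difficulty or individual cautiousness.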

The model enables a fine-grained analysis of the IAT effect. It is interesting to note that in our Coca–Pepsi IAT, the D score mostly reflected the different difficulty in categorizing the stimuli in the critical blocks and the different strength of object–attribute associations. The different difficulty of the categorization tasks also predicted taste preference and the choice of a free can of cola.

There are some connections between the DAM and the DM. Since the DAM is an extension of the Poisson race model, these connections are essentially the same as those described by Van Zandt, Colonius, and Proctor (2000). Both of the models disentangle components of the IAT effect concerning the rate at which information about the stimulus is accumulated (drift in the DM, stimuli discrimination and automatic association in the DAM) and the amount of information that must be accumulated before a response is given (threshold in the DM, termination criterion in the DAM). However, there are several differences between the two models that might enable the DAM to be a good alternative to the DM. First, in the presented model, only termination criteria vary across blocks, whereas in the DM analysis, both drifts and thresholds vary across blocks. This might confound their relative contribution when compatible and incompatible blocks are compared. Second, the DAM allows estimating accumulation rates separately for objects and attributes by using the number of trials of a typical IAT. By contrast, a DM analysis separated for objects and attributes requires almost four times as many trials as usual (see Klauer et al., 2007, Study 1). Third, in the DM, the accumulation of contrasting information is perfectly negatively correlated, whereas, in the DAM, it is assumed to be independent. These two approaches are based on the assumptions of serial and parallel processing of the stimuli, respectively. Finally, the DAM derives from a model (the Poisson race model) that is formally simpler than the DM, and therefore, it is easier to investigate its mathematical properties.

Limits and future research

It is not atypical to observe participants who make no or few errors in one or more of the IAT blocks. When this happens, there might not be sufficient information for obtaining stable estimates of all model parameters. For example, in the DAM, with no wrong responses to the stimuli of a specific category, it would be impossible to estimate the λ parameter concerning their incorrect discrimination. Similarly, no or few wrong responses represent a limit for the quad model and the DM. An advantage of the DAM is that it considers separate processes for correct and incorrect responses. Hence, even if the estimates of the parameters concerning wrong responses are not reliable, this would not be the case for those concerning correct responses.

The aim of the empirical application presented in this study was to show how the DAM can be applied to the analysis of IAT data and what information it can provide about the response process. In this first application of the model, it was decided to present all participants with the IAT blocks in the same order for comparability reasons. The Pepsi–bad/Coca–good blocks always preceded the Coca–bad/Pepsi–good blocks. The fixed sequence of the critical blocks might have influenced the size of the IAT effect (see, e.g., Nosek, Greenwald, & Banaji, 2005). However, it should be considered that the aim of this study was to illustrate the usefulness and practical potential of the DAM, and not to draw conclusions about implicit preferences toward one cola over the other.

As far as the interpretation of the parameters of the DAM is concerned, it is worth underlining that they still need to be validated. As was already noted, the empirical application that we presented in this article is merely a demonstration of the potential usefulness of the DAM. For instance, the results we presented do not provide any clear evidence that the lambda parameters can be interpreted as association strengths. There are many other plausible interpretations of the association parameters of the DAM, and an exhaustive set of studies including manipulations, test–retest investigations, and known groups is needed to establish their validity.

One further limit is that the regression analyses presented in this study were performed on 123 participants, those for whom all parameters were identifiable. Future research will be devoted to developing procedures for reducing the proportion of nonidentifiable parameters.

Also, the model in its current specification does not allow the study of the effects of single stimuli but estimates only the effects of stimulus categories. Previous research has described that stimuli influence the IAT effect (see, e.g., Bluemke & Friese, 2006; Gast & Rothermund, 2010; Govan & Williams, 2004; Steffens & Plewe, 2001). A possibility for taking the effects of single stimuli into account might consist of extending the model into a hierarchical framework (see, e.g., Vandekerckhove, Tuerlinckx, & Lee, 2011).

The aim of future research will also be to improve the model fit. Response latencies of objects and attributes in the different blocks, as well as the amount of information concerning incorrect responses, will be considered. A possibility for enlarging the number of fitting models might consist of modeling intertrial variability to account for the differences between the stimuli of a certain category and the effects of passing time on responses to the trials (e.g., training to the task, fatigue). Other possibilities might be the incorporation of nondecision components (e.g., encoding stimuli, motor response, distractions) within the model, and allowing the termination criteria to vary between objects and attributes.

Conclusions

In this article, a formal model has been proposed that decomposes the IAT effect into three process components: stimuli discrimination, automatic association, and termination criterion. Discrimination regards the amount of information that an object (attribute) stimulus provides about the object (attribute) categories. Association regards the amount of information that an object (attribute) stimulus provides about the attribute (object) categories. The DAM has been tailored to fit the IAT, and it provides a number of desirable features. By means of an illustrative application, we explained why and how, once fully validated, the model might help to shed light on the meaning of the IAT effect, providing, for instance, estimates of many unique associations that are typically hidden behind every IAT score.