Associations between psychologists’ thinking styles and accuracy on a diagnostic classification task

The present study investigated whether individual differences between psychologists in thinking styles are associated with accuracy in diagnostic classification. We asked novice and experienced clinicians to classify two clinical cases of clients with two co-occurring psychological disorders. No significant difference in diagnostic accuracy was found between the two groups, but when the data from novices and experienced psychologists were combined, accuracy was found to be negatively associated with certain decision making strategies and with a higher self-assessed ability and preference for a rational thinking style. Our results underscore the idea that it might be fruitful to look for explanations of differences in the accuracy of diagnostic judgments in individual differences between psychologists (such as in thinking styles or decision making strategies used), rather than in experience level.

It has been suggested that consulting the Diagnostic and Statistical Manual of Mental Disorders (DSM-IV-TR; American Psychiatric Association 2000), which provides explicit and specific criteria for the diagnosis of the different disorders, is related to greater reliability, fewer biases and less under- and over-diagnosing than when classification is done without using such checklists (see Garb 1998). Yet diagnostic classification is not always done by comparing the client's symptoms to DSM-criteria. For instance, it has been found that experienced psychologists compare clients to prototypes (Garb 1996), that the construction of causal relations between symptoms may influence diagnostic classification (Kim and Ahn 2002) and that diagnosing disorders may involve the use of intuition (Srivastava and Grube 2009).
In previous research concerning the accuracy of psychologists' judgments and decisions, a comparison has often been made between novices and experienced psychologists. In many such studies no significant differences in accuracy are found between these groups (e.g., see Garb 1998). A recent meta-analysis by Spengler et al. (2009) suggests that there is a small but reliable effect (d = 0.15) in favour of experienced clinicians. As the authors point out, this implies that very large samples are needed to have enough power to find significant results, which may be especially difficult with this group. Regardless of this, given this small effect size the authors raise the question whether experience is the best predictor of judgment accuracy. They call for more research into individual differences between psychologists, aside from their experience level.
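To make the sample-size implication of such a small effect concrete, the following is a minimal sketch of a power calculation. It assumes a two-sided comparison of two independent group means, a significance level of 0.05, a desired power of 0.80 and the usual normal approximation; these settings and the function name `n_per_group` are illustrative assumptions, not values taken from Spengler et al. (2009).

```python
import math
from statistics import NormalDist


def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate participants needed per group to detect a standardized
    mean difference d (two-sided test, normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value of the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    return math.ceil(2 * (z_alpha + z_power) ** 2 / d ** 2)


# Effect size reported in the meta-analysis (d = 0.15); alpha and power are assumed.
print(n_per_group(0.15))  # 698
```

Under these assumptions roughly 700 participants per group would be needed, which illustrates why studies with a few dozen clinicians rarely find a significant experience effect.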

The present study
The present study takes a preliminary step in answering this call, by investigating whether individual differences in thinking styles between psychologists are associated with accuracy of diagnostic classification judgments. In the judgment and decision literature a distinction is often made between thinking processes that are implicit, rapid and automatic and those that are more likely to be slow, consciously monitored and deliberately controlled (e.g., Evans 2008). Epstein (e.g., 2010) uses the terms "experiential/intuitive" and "rational/analytical" to refer to this general difference in the way people process information. These two information processing styles are independent, and are assumed to operate in parallel and to be interactive. Together they contribute to behaviour, with their relative contributions varying depending on the situation and the person (e.g., Epstein 2010). Individual differences in self-assessed ability and preference to engage in these thinking styles ("experientiality" and "rationality" as measured by the Rational-Experiential Inventory, REI; Pacini and Epstein 1999) have been found to be associated with different personality characteristics and actual judgments and decisions.
We reasoned that it would be interesting to investigate individual differences in these thinking styles in relation to the accuracy of psychologists' diagnostic classifications. Rationality has been found to be positively associated with higher working memory capacity and better syllogistic reasoning and negatively with biases (Fletcher et al. 2011), which could imply that rationality is positively associated with diagnostic reasoning as well. On the other hand, results of studies by Corbin et al. (2010) and Delaney and Sahakyan (2007) indicate that participants with high working memory capacities, who are, by association, rational (Fletcher et al. 2011), are influenced by contextual information to a relatively large extent. Contextual information may not always be relevant for diagnostic classification and may sometimes even be distracting. Higher rationality would then, perhaps rather counter-intuitively, be related to lower diagnostic accuracy.
As for the other type of thinking process, experientiality, the converse applies. It has been found to be positively associated with biases and negatively with syllogistic reasoning (Fletcher et al. 2011). By extension this may lead to the assumption that high experientiality will be negatively associated with diagnostic reasoning and would then be related to lower diagnostic classification accuracy. On the other hand, it has been suggested that good judgments do not always require complex cognition and that heuristic processing can sometimes outperform complex cognitive processing (Gigerenzer and Gaissmaier 2011; Marewski et al. 2010). Rapid heuristic, experiential thinking would then, again maybe counter-intuitively, lead to higher diagnostic accuracy.
We expect no significant difference in performance between novices and experienced psychologists at the group level (cf. Garb 1998; Spengler et al. 2009). However, individual differences between psychologists in their rational and experiential thinking styles may be associated with diagnostic classification accuracy in either of the directions outlined above. Rationality may be associated with lower (more distraction) or higher (better reasoning) diagnostic accuracy, and the same applies to experientiality, which may also be associated with lower (more biased) or higher (heuristic) accuracy. Other thinking strategies that psychologists may use to diagnose clients have been mentioned above and have been identified in previous literature (cf. Elstein and Schwarz 2002). We included a questionnaire developed specifically for this study to gauge these strategies and to test their association with judgment accuracy.

Procedure
Participants were invited to participate via an e-mail, which contained a link to the online experiment. They first answered demographic questions, after which they filled out the Rational-Experiential Inventory (REI; Pacini and Epstein 1999). Next, they were asked to provide diagnostic classifications for two case-descriptions of clients with co-occurring disorders by choosing two out of 20 disorders for each client. These options were presented on screen in alphabetical order and were kept constant for the two cases (see Appendix, column 1; note that the English translation of the disorders results in a non-alphabetical order).
Participants were told they were allowed to consult the DSM (American Psychiatric Association 2000) and that they could no longer change their decisions after pressing the "finished"-button. The two case-descriptions were presented (in an order randomized over participants) on separate pages, which included a dropdown menu for the choice options. A photo of a person's upper body and face (with a closed mouth), chosen to be appropriate in age and gender for the clients of the case-descriptions, accompanied the case-description and was presented again later with the Diagnostic Decision Making questions (DDM-questions; see Materials) to aid memory. Decision making time was measured using Inquisit 3 web-edition software (Inquisit 3.0.4.0 2010), from the moment the case-description was presented on the screen until the participants pressed the "finished"-button.
Synthese (2012) 189:119-130
After the diagnostic classification task, participants were presented with the DDM-questions. Finally, they were asked how motivated they had been during the diagnostic task (to be indicated on a scale from 1 (completely not) to 100 (completely)) and how many correct classifications they thought they had provided (confidence).

Participants
A total of 25 psychologists (M age = 52.08; SD = 7.66; age range 39-63; 15 female) formed the group "experienced psychologists". They had an average of 20.64 (SD = 8.23; range 10-38) years of experience in diagnosing psychological disorders and 84% worked in private practice. Most indicated that they worked from a combination of theoretical orientations, of which cognitive-behavioural (72%), mindfulness (32%), behavioural (32%), psychodynamic (32%), client centered (32%), and psychoanalytical (12%) were most often indicated. A total of 21 clinical psychology master students (M age = 23.76; SD = 2.53; age range 21-30; 20 female) with no more than 1 year of experience formed the group "novices". Psychologists did not receive compensation for their participation. Students could enter a raffle in which 10 gift certificates (worth 20 euro each) were given away, by typing in their e-mail address at the end of the experiment.

Materials
We used the REI (Pacini and Epstein 1999), which measures the two different thinking styles. Both scales consist of 20 items (e.g., "I like to rely on my intuitive impressions") which participants rate on a 5-point scale ranging from definitely false to definitely true. We used a Dutch translation of the REI which has a satisfactory reliability (Witteman et al. 2009). Cronbach's alphas in the present study were 0.83 for rationality and 0.87 for experientiality.
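As an illustration of the reliability coefficient reported above, the following is a minimal sketch of how Cronbach's alpha can be computed from item-level ratings. The ratings in `hypothetical_ratings` are invented for the example (five participants, four items) and the function name is an assumption; the actual REI data are not reproduced here.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of participants, each a list of item ratings.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores)
    """
    k = len(rows[0])                                   # number of items
    items = list(zip(*rows))                           # transpose: one tuple per item
    item_variance_sum = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical 5-point ratings, NOT data from the study.
hypothetical_ratings = [
    [4, 5, 4, 3],
    [2, 2, 3, 2],
    [5, 4, 5, 4],
    [3, 3, 2, 3],
    [1, 2, 1, 2],
]
print(round(cronbach_alpha(hypothetical_ratings), 2))
```

The sketch uses sample variances (n - 1 in the denominator) and assumes that negatively worded items have already been reverse-scored.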
We presented participants with two case-descriptions of clients with co-occurring psychological disorders, which were taken from the DSM IV-casebook (Frances and Ross 2001) and translated into Dutch. Case 1 ("A late onset of symptoms") presents a client ("Mrs. S.") with a panic disorder with agoraphobia and a depressive disorder not otherwise specified. Case 2 ("A tempestuous life") presents a client ("Ms. C.") with a borderline personality disorder and a depressive disorder not otherwise specified (this client also has a histrionic personality disorder, but this was not one of the options participants could choose in the present study).
To try to capture how participants performed the diagnostic classification task, Diagnostic Decision Making questions (DDM-questions) were created, which were based on possible strategies psychologists could have used. Questions related to disorders diagnosed (e.g., "With how many diagnoses did you look at the diagnostic criteria in the DSM?") or to the clients of the case descriptions (e.g., "With how many clients did you immediately know which diagnosis was appropriate, without knowing how you knew?"). Participants were asked to indicate to how many diagnoses (0, 1, 2, 3 or 4) or with how many clients (0, 1 or 2) the particular question was applicable. See Table 3, column 1 for the DDM-questions and additional introductory information that was presented to the participants.

Experience level
Novices and experienced psychologists did not differ significantly in their diagnostic classification accuracy (Mann-Whitney's U = 235, z = −0.732 ns), decision making time (U = 250, z = −0.276 ns), or motivation (U = 232, z = −0.673 ns). Novices were however significantly less confident than experienced psychologists about their classification accuracy, U = 174, z = −2.164, p = 0.032. See Table 1 for means and standard deviations of these variables.
Novices and experienced psychologists did not differ significantly in their thinking styles, either in their rationality (Mann-Whitney's U = 185, z = −1.712 ns) or in their experientiality (t (44) = 1.461 ns). Novices and experienced psychologists differed significantly in only one decision making strategy: how much they thought about clients seen or read about in the past (DDM-question 5, see Table 3). Experienced psychologists reported such thoughts more often, U = 137.500, z = −2.930, p = 0.004.
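The group comparisons above rely on the Mann-Whitney test. The following is a minimal sketch of how U and its z approximation can be computed, assuming midranks for tied scores and the standard tie-corrected normal approximation without continuity correction. The function names and the two score lists are hypothetical; the exact settings of the software used in the study are not stated in the text, so small numerical differences are to be expected.

```python
import math
from collections import Counter
from statistics import NormalDist


def midranks(values):
    """Map each distinct value to its average rank and return tie counts."""
    counts = Counter(values)
    ranks, start = {}, 1
    for value in sorted(counts):
        t = counts[value]
        ranks[value] = start + (t - 1) / 2  # mean of ranks start..start+t-1
        start += t
    return ranks, counts


def mann_whitney(x, y):
    """Return (U, z, two-sided p) using the tie-corrected normal approximation.

    Assumes the pooled scores are not all identical (otherwise sigma is 0).
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    ranks, counts = midranks(list(x) + list(y))
    rank_sum_x = sum(ranks[v] for v in x)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    u = min(u_x, n1 * n2 - u_x)             # report the smaller of the two U values
    tie_term = sum(t ** 3 - t for t in counts.values()) / (n * (n - 1))
    sigma = math.sqrt(n1 * n2 / 12 * ((n + 1) - tie_term))
    z = (u - n1 * n2 / 2) / sigma           # z <= 0 because u is the smaller U
    p = min(1.0, 2 * NormalDist().cdf(z))
    return u, z, p


# Hypothetical numbers of correct classifications (0-4), NOT data from the study.
novices = [2, 2, 1, 3, 2, 2]
experienced = [2, 3, 2, 1, 2, 4]
print(mann_whitney(novices, experienced))
```

With scores that can only take the values 0 to 4, ties are frequent, which is why the tie correction in the standard deviation matters for the resulting z value.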

Diagnostic accuracy groups
To investigate whether judgment accuracy was related to individual differences between clinicians aside from experience level, the data of all participants were combined and three diagnostic accuracy groups were formed: participants with below average performance (0-1 correct classifications; n = 10), those with average performance (2 correct classifications; n = 31), and those with above average performance (3-4 correct classifications; n = 5) (see Appendix for the classifications these groups provided). The groups did not differ significantly in their proportion of novices and experienced psychologists (χ²(2) = 1.50 ns) nor in decision making time (Kruskal-Wallis H (2) = 2.50 ns), motivation (F = 0.872 ns), or confidence (H (2) = 0.51 ns) (see Table 2). Jonckheere's test revealed significant trends for DDM-questions 6 and 7, indicating that the more correct classifications participants gave, the less they had thoughts about a "prototypical" client (J = 167, z = −2.22, p = 0.023) and the less they formed a general picture or came to an interpretation or explanation of the client's complaints (J = 162.50, z = −2.57, p = 0.012) (see Table 3, columns 1-4).
Results showed that the three groups did not differ significantly in their experientiality scores (F = 0.104 ns) but did in their rationality scores (F = 4.356, p = 0.019). Participants with below average performance had a nearly significantly higher rationality than those with average performance (t (39) = 1.986, p = 0.054) and a significantly higher rationality than those with above average performance (t (13) = 3.369, p = 0.005). There were no significant differences in rationality between the latter two groups (t (34) = 1.706 ns).
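The trend analyses in this section use Jonckheere's test for ordered groups. The following is a minimal sketch of the J statistic and its z approximation, assuming that tied pairs count as one half and that the variance formula without tie correction is used. Because the DDM-answers contain many ties, statistical software that applies a tie correction will give somewhat different z values. The function name and the answers in `below`, `average` and `above` are hypothetical.

```python
import math
from itertools import combinations


def jonckheere(groups):
    """Return (J, z) for samples listed in the hypothesised order.

    J counts, over every pair of an earlier and a later group, how often the
    later observation is larger (ties count 0.5). The variance used here
    ignores ties, which is an approximation for coarse answer scales.
    """
    j = 0.0
    for earlier, later in combinations(groups, 2):
        for x in earlier:
            for y in later:
                j += 1.0 if x < y else 0.5 if x == y else 0.0
    sizes = [len(g) for g in groups]
    n = sum(sizes)
    mean = (n ** 2 - sum(s ** 2 for s in sizes)) / 4
    var = (n ** 2 * (2 * n + 3) - sum(s ** 2 * (2 * s + 3) for s in sizes)) / 72
    return j, (j - mean) / math.sqrt(var)


# Hypothetical answers (0-2) to one DDM-question per accuracy group, NOT study data.
below = [2, 2, 1, 2]
average = [1, 2, 1, 0, 1]
above = [0, 1, 0]
print(jonckheere([below, average, above]))
```

A negative z indicates that answers tend to decrease from the below average to the above average group, which is the direction reported for DDM-questions 6 and 7.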

Alternative analyses with application of a looser criterion
For the case "Mrs. S.", 80.4% of the participants correctly diagnosed "panic disorder with agoraphobia" and for the case "Ms. C.", 95.7% of the participants correctly diagnosed "borderline disorder". They had more difficulty correctly diagnosing the "second disorder", which in both cases was depressive disorder not otherwise specified. Because of this difference it was decided to also score according to a looser criterion by counting three of the options that closely resemble this (in this case, major depressive disorder single episode, major depressive disorder recurrent, and dysthymia) as also correct. This resulted in the following diagnostic accuracy groups-looser criterion: 0-1 correct (n = 7), 2 correct (n = 22), and 3-4 correct (n = 17) (see Table 4).
Note to Table 3: Questions 1-4 relate to the number of disorders diagnosed, hence the range of possible answers of 0-4. Questions 5-9 relate to the number of clients, hence the range of possible answers of 0-2 for those questions. In between the questions, introductory information is presented (in italics) as it was presented to participants.
All results presented before are also found with this looser criterion. That is, no significant difference in experientiality (F = 0.121 ns) but a significant difference in rationality (F = 7.600, p = 0.001) and significant trends with regard to DDM-questions 6 and 7 were found. Participants with below average performance are now also found to have a significantly higher rationality than those with average performance (t (21.89) = 2.723, p = 0.012) and these in turn now also have a higher rationality than those with above average performance (t (37) = 2.543, p = 0.015). Furthermore, a significant trend with DDM-question 8 is now also found (J = 414.00, z = 2.038, p = 0.042), indicating that the more correct classifications participants gave, the more they immediately knew which diagnosis was appropriate without knowing how they knew. See Table 3, columns 5-7, for means and standard deviations for the DDM-questions, presented separately for the three diagnostic accuracy groups-looser criterion.
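The difference between the strict and the looser scoring criterion described in this section can be sketched as follows. The disorder labels, the names `CORRECT` and `LOOSE_EQUIVALENTS`, and the rule that at most one looser alternative per case is credited are assumptions made for the illustration; the exact wording of the 20 response options is given in the Appendix table and is not reproduced here.

```python
DEPRESSIVE_NOS = "depressive disorder NOS"

# Correct classifications per case (labels are hypothetical spellings).
CORRECT = {
    "Mrs. S.": {"panic disorder with agoraphobia", DEPRESSIVE_NOS},
    "Ms. C.": {"borderline personality disorder", DEPRESSIVE_NOS},
}

# Options counted as correct for the second disorder under the looser criterion.
LOOSE_EQUIVALENTS = {
    "major depressive disorder, single episode",
    "major depressive disorder, recurrent",
    "dysthymia",
}


def score_case(case, chosen, loose=False):
    """Number of correct classifications (0-2) for one case."""
    chosen = set(chosen)
    hits = len(chosen & CORRECT[case])
    # Assumption: a looser alternative is credited once, and only if the
    # exact depressive diagnosis was not chosen.
    if loose and DEPRESSIVE_NOS not in chosen and chosen & LOOSE_EQUIVALENTS:
        hits += 1
    return hits


def total_correct(responses, loose=False):
    """Sum of correct classifications over both cases (0-4)."""
    return sum(score_case(case, chosen, loose) for case, chosen in responses.items())


def accuracy_group(n_correct):
    """Assign the accuracy group used in the analyses."""
    if n_correct <= 1:
        return "below average"
    if n_correct == 2:
        return "average"
    return "above average"


# Hypothetical participant, NOT data from the study.
responses = {
    "Mrs. S.": ["panic disorder with agoraphobia", "dysthymia"],
    "Ms. C.": ["borderline personality disorder", "social phobia"],
}
print(accuracy_group(total_correct(responses)))
print(accuracy_group(total_correct(responses, loose=True)))
```

In this hypothetical example the participant moves from the average group (2 correct) to the above average group (3 correct) once dysthymia is accepted as the second disorder, which mirrors how the group sizes shift under the looser criterion.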

Conclusion and discussion
Differences in decision and judgment accuracy between novice and experienced psychologists have often been studied, while research into individual differences between psychologists aside from their experience level has received little attention to date. The present study investigated whether individual differences between psychologists can be found to be associated with the accuracy of their judgments. We looked at possible associations between rational and experiential thinking styles and diagnostic accuracy regardless of experience level, and we investigated whether specific decision making strategies were associated with accuracy.
No significant difference was found between novices and experienced psychologists in the accuracy of their diagnostic classifications, which was as expected (cf. Garb 1998; Spengler et al. 2009). But when the data of all participants were combined and groups were formed based not on experience but on scores of diagnostic classification accuracy (below average, average, above average), significant associations between accuracy and individual differences between psychologists were found. These groups did not differ significantly in their proportion of novices and experienced psychologists, decision making time, motivation, confidence or experientiality. They were however found to differ in their rationality: The higher psychologists' rationality the worse their diagnostic classification accuracy was. Next to this, it was found that the more psychologists thought about a prototypical client and the more they formed a general picture or came to an interpretation or explanation of the client's complaints, the worse their diagnostic accuracy.
It could be speculated that those high in rationality were influenced more by the contextual information in the case-descriptions (cf. Corbin et al. 2010; Delaney and Sahakyan 2007) and focused on the "wrong" pieces of information or weighed the information sub-optimally. In this context it might be interesting to note that those who provided incorrect diagnoses did so by choosing one of relatively few alternative diagnoses (see Appendix). Some of these diagnoses can be seen as reflecting parts of the information given in the case-descriptions.
Although rationality did not correlate significantly with thinking about a prototypical client (DDM-question 6; r_s = 0.22 ns) or forming a general picture or coming to an interpretation or explanation of the client's complaints (DDM-question 7; r_s = 0.11 ns), it can still be speculated that our significant findings share a common characteristic. In doing this, it might be useful to make a distinction between "rational, logical" thinking and "abstract, conceptual" thinking. The rational favourability sub-scale of Epstein's REI has been found to load positively on both of these types of rational thinking (Pretz and Totz 2007) and given our results this could raise the question which of the two (or both) is related to diagnostic accuracy. When speculating that "abstract, conceptual" thinking is responsible for the negative association with diagnostic accuracy, it could be reasoned that this shares a common characteristic with thinking about a prototypical client and forming a general picture or coming to an interpretation or explanation of the client's complaints. These may all involve thinking about things "beyond" the concrete information presented. It could be interesting to try to manipulate this type of thinking process by having participants provide explanations of the client's symptoms ("thinking beyond the information") before asking them for a diagnostic classification, and to compare this to participants who are asked to focus explicitly on extracting symptoms from the information ("sticking to the concrete information").
A limitation of the present study could be that conducting it via the internet may have led to a selection bias. Also, the separate statistical analyses performed in the present study have not been corrected for an inflation of the Type I error rate (cf. Nakagawa 2004). Finally, the conclusion would have been stronger if we had used more than two cases. When looking at the cases separately, a significant difference in rationality between the diagnostic accuracy groups and the diagnostic accuracy groups-looser criterion is found for "Mrs. S.". For "Ms. C." the means and mean ranks are in a similar direction but there are no significant differences.
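To illustrate what an inflation of the error rate across separate analyses means, the following is a minimal sketch of the familywise error rate for k tests. It assumes that the tests are independent and each use a significance level of 0.05; the value k = 10 is chosen only for illustration and is not the number of tests performed in this study.

```python
def familywise_error_rate(k, alpha=0.05):
    """Probability of at least one false positive across k independent tests."""
    return 1 - (1 - alpha) ** k


def bonferroni_alpha(k, alpha=0.05):
    """Per-test significance level that keeps the familywise rate at or below alpha."""
    return alpha / k


# k = 10 is a hypothetical number of tests.
print(round(familywise_error_rate(10), 3))  # 0.401
print(bonferroni_alpha(10))
```

Under these assumptions, ten uncorrected tests carry roughly a 40% chance of at least one spurious significant result. A Bonferroni correction is one simple remedy, although it reduces power, which is the trade-off discussed by Nakagawa (2004).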
The results of the present study should be treated with caution given the above limitations and the fact that they do not allow for any causal statements. Replication and extension of this type of research is needed to independently corroborate our findings. Our results can best be seen as underscoring the idea that investigating associations between accuracy of judgments and individual differences between psychologists aside from experience level might be fruitful.

Appendix
See Table 5.

Note. Each column with percentages represents the actual diagnostic classifications participants provided for one of the two cases, "Mrs. S." and "Ms. C.". They are depicted for the diagnostic accuracy groups separately, hence 3 × 2 columns with percentages (three diagnostic accuracy groups and two cases). The percentages per column add up to 200% (or nearly 200% due to rounding), 100% for each of the two disorders to be indicated per case. A dash ("-") in a cell stands for 0%.
a Accurate diagnostic classifications for case description of Mrs. S.
b Accurate diagnostic classifications for case description of Ms. C.

Open Access: This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.