To varying degrees, people tend to report their symptoms inaccurately, i.e., report more or fewer symptoms than they actually have. When such inaccuracies become large, there is cause for concern. A heavily distorted symptom presentation may compromise diagnoses and treatment plans and may bias referral letters to clinicians or expert witness opinions. Still, self-reported symptoms are often the primary source of patient information in clinical work, or in professional reviews of legal documents or archival material (Carneiro et al., 2019; Rosen, 2006; Waite & Geddes, 2006; Wisdom et al., 2014). Therefore, it is important to consider the possibility that the patient exaggerates or minimizes symptoms. Validity tests have been constructed to aid psychologists in detecting symptom distortions and should ideally be part of every test battery (e.g., Bush et al., 2005; Chafetz et al., 2015; Heilbronner et al., 2009; Institute of Medicine, 2015; Sweet & Guidotti Breting, 2013).

Some validity tests aim to identify endorsement of bizarre symptoms or overendorsement of common symptoms (i.e., symptom overreporting) on self-report or rating measures of symptoms. These validity tests are referred to as symptom validity tests (SVTs). Other validity tests intend to detect underreporting, and include those that index supernormality (Cima et al., 2003, 2008) or socially desirable responding (i.e., SDR; e.g., Paulhus, 1988). Supernormality refers to denial of common, “everyday” symptoms, whereas SDR concerns denial of deviations from the social norm, which is arguably broader than just denial of symptoms. Supernormality and (blatant manifestations of) SDR are both seen as manifestations of “faking good” behavior (e.g., Bensch et al., 2019a, b; De Page & Merckelbach, 2021). Yet, although supernormality and SDR are conceptual cousins, the extant literature on supernormality is scarce, whereas the literature on SDR is abundant. For the purpose of the present review, we considered both stand-alone measures of supernormality and SDR as measures of one particular type of symptom distortion, namely underreporting.

Stand-alone validity tests are entirely or primarily dedicated to detecting non-credible presentations of symptoms or impairments, whereas embedded validity scales are part of broader measures that serve a clinical purpose but contain a validity check (e.g., on over- or underreporting). Stand-alone tests are generally considered to be superior to embedded scales in terms of discriminant validity and are less often associated with interpretational problems (for a discussion, see Erdodi & Lichtenstein, 2017). With these considerations in mind, we narrowed the focus of our scoping review to stand-alone validity tests. Where relevant, however, we will also discuss converging evidence from embedded measures and from performance validity tests (PVTs) that tap into an exaggerated presentation of cognitive impairments.

Validity tests remain silent as to “why” patients over- or underreport symptoms (Bass & Wade, 2019). Symptom overreporting is often interpreted as a sign of malingering (i.e., intentional overreporting motivated by external incentives; Martin et al., 2015; Thompson et al., 2018). The Diagnostic and Statistical Manual of Mental Disorders (5th ed.; DSM-5; American Psychiatric Association [APA], 2013) has a limited perspective on this matter. As is true for the DSM-IV-TR, the DSM-5 stresses malingering as an antecedent of symptom distortions, although its entries for histrionic personality disorder and factitious disorder also allude to overreporting. In doing so, the DSM-5 fails to recognize more articulated views on symptom distortions (e.g., see Berry & Nelson, 2010; Niesten et al., 2015; Otto, 2008; Rogers, 1990). Malingering is just one pathway to symptom overreporting, and other pathways might exist, such as acquiescence, careless or inattentive responding, demand characteristics, and misinformation effects (Merckelbach et al., 2019). Moreover, while the DSM-5 briefly mentions anosognosia (i.e., lack of illness awareness), its one-sided focus on overreporting overlooks, and distracts from, the phenomenon of underreporting.

Clarifying the antecedents of symptom over- and underreporting — other than malingering or faking good, respectively — is important for the following reason: a narrow view on overreporting and underreporting would exclusively link these phenomena to incentives (e.g., compensation money; a favorable outcome in a child custody dispute) and a conceptual framing in terms of deliberate attempts to deceive others. Alternatively, some authors have argued that certain personality traits may foster a distorted symptom presentation. Ideas about the link between traits and symptom distortions can be traced back to the early papers of Eysenck, who speculated that psychoticism, antisocial features, and perhaps extraversion might contribute to overreporting (e.g., Eysenck & Gudjonsson, 1989; but see Young et al., 2016).

What is the empirical support for the idea that certain personality traits are associated with distorted symptom presentation? The current review addresses this question. If traits serve as powerful drivers of overreporting and underreporting, our conceptualization of distorted symptom presentation would have to be broadened to include, for example, the extent to which a hyperbolic style or a defensive attitude of the patient fuels such symptom distortions. To the extent that there are replicable, solid, and substantial links between personality traits and symptom distortion, such findings would provide important caveats and clues for clinicians who are faced with the task of interpreting overreporting or underreporting on validity tests.

What is well established is that incentives play a prominent role in distorted symptom presentation. That is, studies consistently find higher failure rates on validity tests with increasing context-related incentives. For example, Mittenberg et al. (2002) estimated the prevalence of symptom exaggeration to hover around 10% in low-stake contexts such as regular medical cases, but up to 30% in high-stake contexts such as in forensic and civil legal cases (see also Young, 2015). One could argue that the non-trivial rate of ± 10% of invalid responding in low-stake contexts (Dandachi-Fitzgerald et al., 2011, 2017; Mittenberg et al., 2002) reflects the contribution of traits to symptom distortions. Hence, we reviewed the literature to determine whether there is solid evidence linking certain traits to distorted symptom presentation, specifically in low-stake contexts.

In what follows, we will provide a scoping review on empirical and conceptual links between traits or personality pathology/disorders, on the one hand, and raised scores on stand-alone validity tests on the other hand. In doing so, we took the DSM-5 section III: alternative DSM-5 model for personality disorders (AMPD) as a starting point. Importantly, elevated scores on validity measures do not necessarily imply that cut points have been crossed and that research subjects fail a validity test. Thus, our review is not restricted to studies that employed a categorical interpretation (i.e., pass versus failure) of validity test outcomes.

Before turning to the results of our review, there is a preliminary issue that deserves consideration. This issue has to do with the interpretational problems that may arise when validity test scores are linked to self-report indices of traits. Heightened scores on validity measures cast doubt on the accuracy of self-reported traits, thereby restricting the confidence that one can place in any interpretation of the data (see Merten et al., 2007). Despite this challenge to our review, we anticipate that some consistent patterns may emerge. First, the majority of articles included in our review focused on self-reported traits and more subtle distortions on validity tests (i.e., elevated scores). In such samples, typically only a minority of persons obtain scores beyond categorical cut-offs that serve as a red flag for obvious symptom distortion (i.e., validity test failure), and that would raise concerns of relevant distortion of self-reported traits. Second, an often overlooked aspect of validity tests is that in many of these measures, the items mostly allude to bizarre symptoms (e.g., in typical SVTs), or mundane experiences (as in measures of supernormality): the type of items that some people might experience as provocative, threatening, ridiculous, or boring. It might well be that such a reactive attitude (i.e., “the negative subject”; Christensen, 1977; Miron & Brehm, 2006) is not only more pronounced in persons with certain traits (e.g., neuroticism, suspiciousness, narcissism), but also in itself fosters overreporting or underreporting of symptoms. Viewed this way, it is perfectly legitimate to raise the question whether certain traits are related to symptom overreporting or underreporting on self-report validity measures.

Method

We conducted a scoping review (Grant & Booth, 2009) to investigate overall results, trends, limitations, and gaps in the literature on traits and distorted symptom presentation. We opted for this qualitative approach, because we had reason to assume that the extant literature would be highly diverse in terms of samples, measures, and incentives. For example, both Niesten et al. (2015) and Van Impelen et al. (2017) concluded in their reviews on antisocial features and distorted symptom presentation that the links between these constructs were highly variable and context dependent. Thus, we assumed that such diverse, perhaps even incoherent patterns might also hold true for the links between other traits and distorted symptom presentations.

On May 5, 2020, the first author (DH) ran a search in APA PsycINFO (EBSCO) and PubMed to identify empirical studies that directly tapped into measures of overreporting and/or underreporting, as well as diagnoses or measures of DSM-5 section II or III personality disorders or traits. A Boolean title/abstract search of the format “X AND Y” was used; alternative terms were specified with the OR function. More specifically, the search comprised typical response bias equivalents, such as (dis)simulation, distortion, response style, symptom validity, over-/underreporting, malingering, exaggeration, feigning, faking, desirable responding, self-deception/-deceptive enhancement, impression management, and/or commonly used “stand-alone validity tests” such as the Structured Interview of Reported Symptoms, Structured Inventory of Malingered Symptomatology, Miller Forensic Assessment of Symptoms Test, Self-Report Symptom Inventory, Supernormality Scale(-Revised), Marlowe-Crowne Social Desirability Scale, and Balanced Inventory of Desirable Responding, and their acronyms (for an introduction to a special issue on self-report symptom validity measures, see Giromini et al., 2022). These terms were combined with variations of “trait” and “personality disorder” terms, such as the ten DSM-5 section II categorical personality disorders (PD, i.e., schizoid, schizotypal, paranoid, antisocial, borderline, histrionic, narcissistic, avoidant, dependent, and obsessive–compulsive PD), the 25 DSM-5 section III AMPD trait facets and their variations, and other conceptually similar or related traits and terms known to the authors: alexithymia, dissociation, fantasy proneness, narcissism, hysteria, paranoia, and psychopathy.
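To illustrate the “X AND Y” search format described above, the following minimal sketch assembles a Boolean query from two term lists, joining alternatives with OR. The term lists are a small subset taken from the description above; the function names, the quoting of multi-word terms, and the exact output syntax are assumptions for illustration and do not reproduce the actual search strings or any database-specific field tags.

```python
def or_block(terms):
    """Join alternative terms with OR; quote multi-word terms (an assumed convention)."""
    quoted = [f'"{t}"' if " " in t else t for t in terms]
    return "(" + " OR ".join(quoted) + ")"


def build_query(response_bias_terms, trait_terms):
    """Combine the two OR blocks in the 'X AND Y' format."""
    return f"{or_block(response_bias_terms)} AND {or_block(trait_terms)}"


# Illustrative subsets only; the full term lists are described in the text.
response_bias_terms = ["malingering", "feigning", "symptom validity", "impression management"]
trait_terms = ["alexithymia", "dissociation", "fantasy proneness", "narcissism"]

query = build_query(response_bias_terms, trait_terms)
print(query)
```

In practice, such a string would still need to be restricted to title/abstract fields using the syntax of the respective database.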

The search was limited to studies in English peer-reviewed journals between 1979, the year that the oldest measure in relevant articles — the Narcissistic Personality Inventory (NPI; Raskin & Hall, 1979) — was published, and May 5, 2020, the date of search. All retrieved records were screened such that those records with abstracts that covered our search terms were selected for full-text retrieval and closer inspection. Studies that were inaccessible (e.g., unpublished dissertations, conference papers) were excluded, as were non-empirical articles or chapters. Next, full-text records were coded on relevant parameters (i.e., sample and setting, trait/disorder, relevant measures, and outcomes comparisons). Studies that examined whether antisocial personality disorder and/or psychopathy are linked to heightened scores or failures on stand-alone (and/or embedded) validity tests were excluded because these studies have been extensively summarized and discussed in previous reviews (e.g., Niesten et al., 2015; Van Impelen et al., 2017).

The scoping review is divided into a section on overreporting and a section on underreporting. We focus only on those traits that have been linked to symptom distortion in the literature identified by our scoping review, or by indirect evidence in articles known to us. For each section, we cluster traits based on their conceptual similarity. For each discussed trait, our evaluation of the evidence, where available, is structured in the following way: (a) conceptual definitions; (b) direct empirical evidence provided by studies that emerged from our scoping review, linking the trait to validity test outcomes; (c) indirect empirical evidence from literature known to the authors, involving conceptually similar traits or disorders, embedded validity scales, or PVTs as indicators of impairment exaggeration; (d) theoretical or conceptual articles that provide tentative explanations for such links based on empirical evidence; and (e) prima facie tentative plausibility of links to symptom distortion, based on core features of the respective traits (e.g., cognitions, emotions, behavior). In evaluating the strength of correlational data under (b), we followed the benchmarks of Cohen (1988) and considered rs ≥ 0.50 as strong, rs from 0.30 to 0.50 as moderate, and rs from 0.10 to 0.30 as weak.
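The benchmarks for correlational strength stated above can be written as a small helper. This is a minimal sketch rather than part of the review's method: the function name is hypothetical, applying the thresholds to the absolute value of r (so that inverse correlations receive the same labels) is an assumption, and so is the "negligible" label for values below 0.10, which the benchmarks above do not name.

```python
def classify_correlation(r: float) -> str:
    """Label a correlation using the thresholds given in the text (after Cohen, 1988)."""
    size = abs(r)  # assumption: direction is ignored when judging strength
    if size >= 0.50:
        return "strong"
    if size >= 0.30:
        return "moderate"
    if size >= 0.10:
        return "weak"
    return "negligible"  # assumed label; not defined in the text


# Coefficients reported elsewhere in this review, used here as examples.
for r in (0.23, -0.31, 0.56):
    print(r, classify_correlation(r))
```

Under these assumptions, r = 0.23 is labeled weak, r = −0.31 moderate, and r = 0.56 strong.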

Results

Our scoping review identified 52 articles (k = 55 studies) that linked self-reported traits to outcomes on stand-alone validity tests (i.e., stand-alone SVTs, or measures of SDR and/or supernormality). As to overreporting, 12 articles encompassing 15 studies were found addressing depression/depressivity (k = 3), alexithymia (k = 2), apathy (k = 1), dissociation/cognitive and perceptual dysregulation (k = 7), and/or fantasy proneness/unusual beliefs and experiences (k = 2). As to underreporting, 40 articles encompassing 41 studies were found addressing alexithymia (k = 11), dissociation (k = 7), and/or narcissism (k = 23).

Overreporting

Studies in this domain have mainly focused on correlations between self-report measures of traits, for example, the Dissociative Experiences Scale (DES; Bernstein & Putnam, 1986) in the case of dissociativity, and SVTs consisting of a self-report checklist of bizarre symptoms as a measure of symptom overreporting, such as the Structured Inventory of Malingered Symptomatology (SIMS; Smith & Burger, 1997). Some researchers also relied on a group-comparison approach, typically based on scores above and below validity test cut-offs. Table 1 presents an overview of correlational studies on traits and overreporting; Table 2 summarizes group-comparison studies. Note that these and the following tables only include studies that relied on stand-alone (rather than embedded) validity tests.

Table 1 Traits and overreporting on stand-alone SVTs: correlational studies
Table 2 Traits and overreporting on stand-alone SVTs: group-comparison studies

Depressivity

Trait depressivity refers to pervasive and/or persistent feelings of being down, miserable, or hopeless; feelings of inferior self-worth, shame or guilt; and/or pessimism or suicidal thoughts or behavior (APA, 2013; Krueger et al., 2012). Our screening of the literature did not find relevant articles on trait depressivity per se, but did retrieve (a non-exhaustive list of) studies on depression and symptom distortions, on which we report below.

Two articles (i.e., Merckelbach & Smith, 2003; Merten et al., 2019a) directly linked self-reported depression on the Beck Depression Inventory (BDI; Beck et al., 1961) to overreporting on SVTs and noted strong associations: Merckelbach and Smith (2003) found in undergraduates (N = 182) that BDI scores correlated with SIMS scores. Yet, differences in depression scores were not evident when comparisons were based on groups attaining SIMS scores below or above the cut-off. Relying on compensation-seeking psychosomatic rehabilitation inpatients (N = 537), Merten et al. (2019b) found scores on the SIMS and the Self-Report Symptom Inventory (SRSI) — a recently developed measure of overreporting (Merten et al., 2016) — to be strongly related to BDI-II scores (Beck et al., 1996). Specifically, compensation-seeking psychosomatic inpatients with elevated BDI-II scores (i.e., > 40) more often failed on the SVTs than those who attained lower BDI-II scores.

Some studies focused on embedded validity scales or PVTs as indicators of impairment exaggeration. For example, a meta-analysis found that genuine major depressive disorder was linked to validity scale failures (i.e., F-r, FBS-r, and RBS; Ms > 80 T; Sharf et al., 2017) on the Minnesota Multiphasic Personality Inventory-2-Restructured Form (MMPI-2-RF; Ben-Porath & Tellegen, 2008). However, studies that relied on PVTs in inpatients (Rees et al., 2001), community-dwelling elders (Ashendorf et al., 2004), disability-seeking outpatients (Yanez et al., 2006), and claimants in personal injury litigation (Stevens et al., 2008) did not find associations between underperformance and BDI(-II) scores.

Indirect support for a link between depressivity and symptom distortion comes from studies showing that people who endorse high levels of negative emotions tend to report more psychological and physical symptoms, regardless of the actual severity of their condition (Costa & McCrae, 1987; but see Friedman et al., 2010). Along similar lines, studies observed that some patients diagnosed with a mood disorder tend to attain higher symptom ratings on the BDI than their clinicians on the observer-rated Hamilton Depression Rating Scale (HDRS; Hamilton, 1980). For example, Stanley and Wilson (2006) observed this discrepancy in patients with major depressive disorder and comorbid borderline personality disorder, but not in those without this comorbidity. Duberstein and Heisel (2007) noted that endorsement of high neuroticism levels is associated with symptom overreporting, whereas other researchers contend that it is self-reported depressive symptoms rather than general neuroticism that is related to inflated reports of physical symptoms (Howren et al., 2009).

Suls and Howren (2012) provided a theoretical rationale for this relationship. According to these authors, depression is accompanied by a negative recall bias (e.g., a heightened accessibility of negative memories), which would lead to inflated self-reports of past symptoms, whereas anxiety would be associated with an attentional bias that escalates self-reports of momentary symptoms (see also Robinson & Clore, 2002).

Alexithymia, Restricted Affectivity, Anhedonia, and Apathy

Alexithymia refers to the inability to identify and articulate internal sensations, coupled with an external orientation (i.e., deficits in emotion processing). A common self-report measure of alexithymia is the Toronto Alexithymia Scale-20 (TAS-20; Bagby et al., 1994). Two studies addressed TAS-20 scores and symptom overreporting on stand-alone SVTs: Brady et al. (2017) found in US veterans diagnosed with posttraumatic stress disorder (PTSD; N = 75) that TAS-20 scores were moderately correlated with failure on the Miller Forensic Assessment of Symptoms Test (M-FAST; Miller, 2001). Similarly, Merckelbach et al. (2018) noted in forensic outpatients (n = 40) and non-forensic participants (n = 40) strong associations between TAS-20 and SIMS scores.

Indirect support for a link between alexithymia and symptom distortion is provided by research that related poor interoceptive accuracy to overreporting of somatic symptoms (e.g., Byrne & Ditto, 2005; Grynberg & Pollatos, 2015; Herbert et al., 2011), although one study did not observe such a relationship (Fairclough & Goodwin, 2007). De Gucht and Heiser (2003) reviewed 16 studies on alexithymia (indexed by TAS-20 or versions thereof) and symptom reporting and concluded that endorsement of alexithymia is significantly, albeit weakly (r = 0.23) associated with inflated self-reports of somatic symptoms. Similar findings were reported in more recent studies relying on patients (Porcelli et al., 2013) and undergraduate students (Bogaerts et al., 2015; Wearden et al., 2005). These studies converged on the preliminary hypothesis that in alexithymia, normal arousal and distress might be mislabeled as highly intense symptoms (Grynberg et al., 2012).

The AMPD traits restricted affectivity and anhedonia are conceptual neighbors of alexithymia (e.g., see Badura, 2003; Gooding & Tallent, 2003). Restricted affectivity is defined as constricted emotional experience and responsivity (i.e., emotional numbing), whereas anhedonia is a lack of enjoyment and engagement in experiences (APA, 2013; Krueger et al., 2012). Kashdan et al. (2007) investigated whether self-reported emotional numbing and anhedonia as indexed by the MMPI-2 are linked to symptom distortion, using the Fp scale as an embedded indicator of overreporting (Butcher et al., 2001). They observed that PTSD veterans who overreported symptoms (Fp > 8) on the MMPI-2 (n = 30) scored higher on MMPI-2 indices of emotional numbing and anhedonia than those who did not engage in overreporting (n = 197), with effect sizes being small to moderate. Dandachi-Fitzgerald et al. (2020) noted in a sample of neurological patients (N = 138) that apathy as measured by the self-report Apathy Evaluation Scale (AES; Marin et al., 1991) was associated with heightened scores on the SIMS and lowered performance on a PVT (i.e., the Test of Memory Malingering; TOMM; Tombaugh, 1997; r = −0.31), albeit with small to moderate effect sizes.

Dissociativity

Dissociative symptoms constitute a heterogeneous class of experiences that may range from minor cognitive lapses such as daydreaming, to disabling symptoms such as derealization and depersonalization (Condon & Lynn, 2014; Giesbrecht et al., 2008). In the DSM-5 AMPD, the conceptual twin of dissociation is trait cognitive and perceptual dysregulation, defined as odd or unusual thought processes including dissociative experiences, mixed sleep–wake states, and/or thought-control experiences (APA, 2013; Krueger et al., 2012). Typically, dissociativity is assessed with self-report measures, mostly the DES, although other measures exist (Merckelbach et al., 2017). A recent meta-analysis shows a link between self-reported dissociativity and alexithymia (e.g., r = 0.56 in clinical populations; Reyno et al., 2020). DES scores are particularly related to the TAS-20 subscale “difficulty in identifying feelings” (e.g., rs = 0.46–0.52; Elzinga et al., 2002; Evren et al., 2008).

Merckelbach et al. (2017) reviewed studies in which dissociative symptoms or diagnoses (e.g., dissociative identity disorder [DID]) were associated with overreporting on embedded validity scales and/or stand-alone SVTs. These authors identified six relevant correlational studies (four included stand-alone SVTs) and ten group-comparison studies (i.e., based on test cut-offs; four included stand-alone SVTs), pertaining to various student and/or patient samples (e.g., PTSD; DID). The correlational studies that involved stand-alone SVTs (i.e., Giesbrecht & Merckelbach, 2006; Kunst et al., 2011; Merckelbach et al., 2015; Van der Heide & Merckelbach, 2016) overall found moderate to strong links between endorsement of dissociativity and overreporting. A similar picture emerged from correlational studies that involved embedded validity tests or PVTs. Likewise, a group-comparison study in a veteran sample (N = 124; Constans et al., 2014) found that those who failed stand-alone SVTs all were diagnosed with PTSD and scored significantly higher on a self-report measure of dissociation. Other group-comparison studies including SVTs or validity scales in small patient samples (PTSD and/or DID, Ns = 19–37, i.e., Brand et al., 2006; Rogers et al., 2009, 2011) found that sizeable proportions (i.e., 25–37%) of patients, especially those with DID, failed on such validity tests.

From a theoretical stance, a potential pathway from depressivity through recall bias and then symptom distortion possesses a certain prima facie plausibility. Much the same is true for alexithymia, poor interoceptive monitoring, and escalating symptom reports. We did not come across any theoretical rationale for how the broad and diverse category of dissociativity might contribute to distorted symptom presentation, except the notion that this overlap is the product of two prominent concomitants of dissociation, namely alexithymia (cf. supra) and fantasy proneness (Merckelbach et al., 2017), to which we will turn now.

Fantasy Proneness and Cognitive Dysregulation

Fantasy proneness refers to a strong preference for vivid imagery and make-believe experiences and activities (Elzinga et al., 2002; Merckelbach et al., 1999; Rauschenberger & Lynn, 1995). A common self-report measure of fantasy proneness is the Creative Experiences Questionnaire (CEQ; Merckelbach et al., 2001). A recent meta-analysis showed that self-reported dissociative symptoms and CEQ scores are strongly correlated (r = 0.52; Merckelbach et al., 2021). Fantasy proneness comes close to the DSM-5 AMPD trait unusual beliefs and experiences, which is described as beliefs of unusual abilities (e.g., mind-reading, telekinesis), thought-action fusion, and unusual (e.g., hallucination-like) experiences of reality (APA, 2013; Krueger et al., 2012).

Only two studies have directly examined the link between CEQ scores and symptom overreporting on a dedicated, stand-alone SVT. One study found in a sample of compensation-seeking victims of interpersonal violence (N = 125) that CEQ scores were moderately correlated with symptom overreporting on the SIMS and that CEQ scores negated the link between dissociation and PTSD symptoms in this sample (Kunst et al., 2011). Similarly, another study observed in undergraduates (N = 182) that CEQ scores were moderately associated with overreporting tendencies on the SIMS (Merckelbach & Smith, 2003).

Other evidence for an association between self-reported fantasy proneness and symptom distortion comes from an analogue simulation study in which undergraduate students (N = 648) were instructed to feign traumatic stress symptoms. Specifically, those who endorsed high levels of fantasy proneness more often responded atypically on the Trauma Symptom Inventory (TSI; Briere, 1995) and overall had inflated symptom scores as compared with those low on self-reported fantasy proneness (Peace & Masliuk, 2011). Relatedly, Merckelbach and van de Ven (2001) found that endorsement of fantasy proneness predicts hallucinatory reports during white noise exposure in students.

From a theoretical perspective, an association between fantasy proneness and distorted symptom presentation makes sense. It might reflect fantasy prone people’s preference for exploring the limits of reality and convention (i.e., counterfactual thinking; see Bacon et al., 2013), which engenders low thresholds when completing self-reports that include eccentric experiences and symptoms. Alternatively, the driving force might be the overlap between fantasy proneness and schizotypal features and the cognitive dysregulation implicated in that, which promotes inattentive or careless responding when filling out self-report instruments that list symptoms (Merckelbach et al., 2017).

Studies on patients with schizophrenia spectrum disorders indirectly support a relationship between cognitive dysregulation and symptom overreporting. That is, such diagnoses in patients are associated with failure on SVTs (e.g., Van Impelen et al., 2014), and on embedded validity indicators or PVTs (e.g., Peters et al., 2013; Gorissen et al., 2005; Van der Heide & Merckelbach, 2016; Van der Heide et al., 2017, 2019; but see Schroeder & Marshall, 2011). Several authors have suggested that deficits in reality monitoring, illness insight, and cognitive functions may drive such failures on validity tests (Radaelli et al., 2013; Shad et al., 2006; Schaefer et al., 2013; but see Stevens et al., 2014).

Underreporting

Empirical studies in the domain of traits and underreporting have largely focused on correlations between self-report measures of traits, for example, the NPI and/or the Pathological Narcissism Inventory (PNI; Pincus et al., 2009) in the case of narcissism, and self-report measures of SDR, such as the Marlowe-Crowne Social Desirability Scale (MCSDS; Crowne & Marlowe, 1960) and the Balanced Inventory of Desirable Responding (BIDR; Paulhus, 1988). Both the MCSDS and the BIDR contain items that allude to symptoms (e.g., It’s hard for me to shut off a disturbing thought), but most of their items gauge overly optimistic self-presentation in a number of domains (e.g., prosocial behavior, ambition). Table 3 presents an overview of such studies on traits and dedicated measures of underreporting, as identified by our scoping review.

Table 3 Traits and underreporting on stand-alone validity measures

Alexithymia

Eleven studies addressed potential links between self-reports of trait alexithymia and social desirability. Most were conducted in student or community samples and involved the TAS-20 and self-report measures of social desirability. A majority observed small to moderate correlations in the inverse direction — i.e., heightened alexithymia was linked to lower self-reported SDR — whereas some studies found no significant links. Of note is a study of Linden et al. (1996) in undergraduates (N = 80) that, unlike other studies pertaining to alexithymia, used a group-comparison (i.e., based on low, moderate, and high TAS-20 scores) rather than a correlational approach to the issue. The authors found no significant differences across these groups on the BIDR. The only study that examined the relationship between self-reported alexithymia and SDR in a patient group (i.e., in veterans with schizophrenia or schizoaffective disorder; N = 65) is also the only study to find positive (and moderate) correlations with social desirability (i.e., as indexed by the MCSDS; Fogley et al., 2014).

Taken together, these studies found endorsement of alexithymia to be associated with lower self-reported social desirability and/or found no significant associations between these constructs. As to a possible theoretical rationale, Messina et al. (2010) pointed to an inverse association between TAS-20, particularly the “difficulty identifying feelings” subscale of the TAS-20, and the “Lie Scale” of the Eysenck Personality Questionnaire (EPQ; Eysenck & Eysenck, 1985). Interestingly, this link disappeared when controlling for self-reported neuroticism on the EPQ. Whereas previous authors suggested that individuals with relatively poorer emotional differentiation (i.e., alexithymia) might be less capable or interested in social desirability (particularly its self-deception aspect; Fukunishi, 1994), Messina et al. (2010) suggested that such a lack of interest might be more primarily driven by mild diffuse emotional distress (i.e., neuroticism).

Dissociativity

Relying on a correlational approach, seven studies investigated the connection between self-reported dissociation and social desirability. Most of them involved university (Ns = 28–633) or high school students (N = 93; Callahan et al., 2003). With one exception, none of the studies found that higher endorsement of dissociativity is accompanied by heightened social desirability scores (Beere et al., 1996; Evans et al., 2019; Hyman & Billings, 1998), and some even noted an inverse link, with self-reported dissociativity being related to lower social desirability (Callahan et al., 2003; Elzinga et al., 2002). The exception is a study of mothers of children aged 6–11 (N = 93) who were referred to Youth Protection Services (Collin-Vézina et al., 2005). Here, endorsement of dissociation was positively and moderately correlated with self-reported social desirability. Only a few (n = 6) respondents achieved DES scores above the clinical cut-off. The authors suggested that, overall, participants might have underreported on the DES given that they were interviewed following substantiated charges of child abuse or neglect.

Grandiosity

People high on trait grandiosity are self-centered and condescending, and tend to experience feelings of superiority and entitlement, often ascribed to self-presumed outstanding qualities and/or achievements (APA, 2013; Krueger et al., 2012). Grandiosity is one of two distinct dimensions of narcissism, the other being vulnerable narcissism, which includes hypersensitivity, introversion, shame, and inhibition of grandiose desires (e.g., see Cain et al., 2008; Miller et al., 2011; Wink, 1991).

Twenty-three empirical studies examined links between stand-alone self-report measures of narcissism and SDR. These studies were all conducted in student and/or community samples, with the exception of studies (in part) in myocardial infarction patients (n = 30; Fukunishi et al., 1995), an incarcerated sample (n = 703; Sleep et al., 2017), terrorism or work/car accident survivors (N = 152; Levi & Bachar, 2019), and adult psychiatric patients (N = 147; De Page & Merckelbach, 2021). Below, we catalogue the findings by dividing these studies into two time periods: early articles published before 1998, which primarily examined links between the NPI and the MCSDS, and later articles that more often included several self-report measures of narcissism (e.g., grandiose and vulnerable dimensions; the PNI) and social desirability (e.g., self-deceptive enhancement [SDE] and impression management [IM]; the BIDR).

Overall, early studies relied on undergraduate samples (Ns = 85–221) and found little support for the idea that narcissistic features as self-reported on the NPI are linked to heightened social desirability as indexed on the MCSDS. Specifically, some studies tended to find no link (Auerbach, 1984; Watson et al., 1986) between narcissism and social desirability, whereas other studies found that the Entitlement/Exploitativeness subscale of the NPI (Watson & Morris, 1991; Watson et al., 1984, 1986) and/or total NPI scores (Fukunishi et al., 1995; Watson & Morris, 1991; Watson et al., 1984) were linked to lower social desirability. Furthermore, studies that included several measures of narcissism found mixed results. Using composite scores of narcissism measures, Raskin et al. (1991) replicated in undergraduate samples (Ns = 60–300) the inverse (i.e., negative) and moderate link between self-reported narcissism and SDR, but found a positive relationship between narcissism and self-deception (rs = 0.30–0.63). Hibbard (1992), in turn, found in university students (N = 701) null and inverse associations between various measures of narcissism and the MCSDS. Taken together, early studies found endorsement of narcissism either not to be associated with social desirability or to be associated with lowered SDR tendencies.

More recent studies also produced rather inconsistent results. In the only studies that examined an incarcerated sample (Sleep et al., 2017) or a patient sample (De Page & Merckelbach, 2021), small positive links emerged between (some subscales of) narcissism and underreporting. Specifically, the former study examined in a correctional subsample (n = 703) the connection between the NPI and underreporting scales of the MMPI-2 (i.e., the L-r and K-r scales). The link between the NPI and underreporting varied depending on the NPI scale: positive for the Leadership/Authority scale (rs = 0.13); null results for the Grandiose Exhibitionism scale (rs = 0.02 and 0.03); and inverse links for the Entitlement/Exploitativeness scale (rs = −0.16). Furthermore, in two additional subsamples of undergraduates (ns = 228 and 482), across several self-report measures of narcissism and embedded validity indices of underreporting, inconsistent links (rs = −0.34 to 0.24) emerged for (grandiose) narcissism, whereas vulnerable narcissism was related to less underreporting (rs = −0.52 to −0.27). De Page and Merckelbach (2021) relied on adult patients and looked into the association between self-reported supernormality (i.e., denying common symptoms) and grandiose narcissism as measured by the NPI. These authors observed a small and positive correlation between these two constructs. Similarly, other more recent work (Fernie et al., 2016; Manley et al., 2018; Paulhus, 1998) found positive links between (specific subscales of) narcissism and SDR.

However, other recent studies replicated earlier findings suggesting that endorsement of narcissism is related to lowered self-reported SDR (Braun et al., 2016; Jones et al., 2016; Levi & Bachar, 2019; Sedikides et al., 2004; Wu et al., 2019; Yu, 2018). Still others observed no links between endorsement of narcissism and SDR (Barelds & Dijkstra, 2010; Lyvers et al., 2019). Tellingly, studies that employed multiple self-report measures of narcissism often yielded mixed findings (Barry et al., 2016; Gamache et al., 2018; Manley et al., 2018; Zeigler-Hill & Wallace, 2011) across various indices of narcissism and desirable responding.

Taken together, studies on self-reported narcissism and social desirability produced many null results and, when they did find correlations, these were typically small, non-replicable, and inconsistent. From a theoretical point of view, both a positive and a negative connection make sense a priori. On the one hand, grandiose narcissists might self-enhance by presenting themselves favorably on agentic and egoistic values (e.g., intelligence, dominance, and assertiveness; Campbell et al., 2002), regardless of context, stakes, and incentives (Hart et al., 2019; Maaß & Ziegler, 2017). On the other hand, (grandiose) narcissists might not value social acceptance. As things stand, the empirical literature favors neither the first nor the second hypothesis.

Attention Seeking

In the DSM-5 AMPD, attention seeking refers to behavior focused on attracting notice, attention, or admiration from others (APA, 2013; Krueger et al., 2012). Historically, the concept has been associated with hysteria. One prominent idea about hysteria is that it typically involves overreporting of physical symptoms, but underreporting of psychological symptoms. One-third of hysteria patients are thought to display the phenomenon of la belle indifference: a careless attitude and lack of emotional distress about the somatic symptoms they present, which otherwise typically cause grave concern in patients with the actual medical condition (Heubrock & Petermann, 1998; Stephens & Kamp, 1962). We found no studies that directly tested the connection between attention seeking/hysteria and stand-alone measures of underreporting, but there is some indirect evidence that bears on the issue. Germane to this is a study (Ornduff et al., 1988) that relied on the MMPI-2 profile related to hysteria — i.e., high elevations of the Hy scale. The authors divided the Hy scale into items that reflect “bodily concern” and “psychological denial” and observed that true somatic conditions were linked to moderate Hy scores and endorsement of “bodily concern” items, whereas somatic patients with a psychiatric history attained high Hy scores by prominently endorsing the “psychological denial” items. This pattern fits with the idea that neuroticism is related to overreporting of somatic symptoms (e.g., Rosmalen et al., 2007).

Suspiciousness and Withdrawal

Trait suspiciousness is related to presuming and ascribing ill-intent and harm to others, such as being mistreated, used, or persecuted. Thus, people high on suspiciousness are distrustful, alert, and sensitive in interpersonal contact. Similarly, people high on trait withdrawal have a preference to keep to themselves. This often manifests itself in reserved, avoidant, or disinterested behavior towards social contact (APA, 2013; Krueger et al., 2012). There is no empirical literature on underreporting and suspiciousness or withdrawal, but an interesting lead can be found in the extant MMPI literature on self-deceptive underreporting (Bagby & Marshall, 2004). That is, subtle denial reflected in high scores on the K-scale (Correction scale; Meehl & Hathaway, 1946) may in part be related to self-reported cynicism, (interpersonal) mistrust, and introversion, whereas the defensiveness reflected in high scores on the S-scale (Superlative Self-Presentation; Butcher & Han, 1995) may in part be associated with self-reported hypersensitivity and suspicion on the MMPI (Nichols, 2001).

Theoretical considerations also suggest self-deceptive underreporting in paranoia. Paranoid individuals are thoroughly — and self-deceptively — convinced of external threats, such that they tend to be resistant to alternative explanations that stress internal biases and monitoring errors instead. Thus, it might well be the case that paranoid or withdrawn individuals are disinclined to share sensitive information about their symptoms, out of a fear that disclosure might reinforce external agency over them (e.g., power; negative social feedback). Along these lines, a relevant question is whether forewarning and coaching (e.g., Baer & Wetter, 1997) may make paranoid people particularly reluctant to endorse symptoms.

Discussion

Our scoping review examined whether personality traits are related to symptom distortion on stand-alone validity tests (i.e., heightened scores and/or failures) in the form of overreporting or underreporting, the latter broadly defined so as to include both denial of deviations from social norms (i.e., SDR) and denial of symptoms (i.e., supernormality). For overreporting, the literature shows strong links with depression and alexithymia, moderate links with dissociation and fantasy proneness, and weak links with apathy and (based on embedded validity scales) restricted affectivity and anhedonia. For underreporting, studies found small to moderate and difficult to replicate links with alexithymia, dissociation, and narcissism. More generally, the overwhelming majority of correlations between self-reported traits and overreporting or underreporting were in the small-to-moderate range, indicating modest effect sizes at best. This state of affairs is best viewed in light of the methodological limitations of the studies in this field, which give little reason to conclude that there exists robust evidence for traits as major causal antecedents of distorted symptom reports.

First, the modest connections between self-reported traits and validity test scores preclude any strong statements about the directionality underlying these connections. The problem here is, of course, the accuracy of self-reported traits (see also, Merten et al., 2007). This problem may be less pronounced in studies in which a majority of participants attained elevated scores on validity measures that were not sufficiently deviant to qualify as failures (i.e., surpassing an empirically established cut-off point). The distinction between scores that are slightly elevated versus those indicative of failure on a validity test is crucial, because subtle symptom distortion would not preclude clinicians from interpreting other psychometric data from the same person (e.g., self-reports of traits), whereas validity test failure would. Yet, papers often do not differentiate between subtle versus profound symptom distortion. For example, it might well be the case that traits such as depressivity or alexithymia promote mild forms of invalid symptom reporting (e.g., below cut-off), but do not serve as precursors of massive distortions (e.g., beyond the cut-off). Future studies could include separate analyses for individuals with elevated validity test scores and those who exhibit clear validity test failure. Of course, the issue touches upon a more fundamental question, namely whether symptom distortion is a dimensional or a categorical (i.e., taxon-like) phenomenon (e.g., Walters et al., 2008).

Second, the literature on traits and symptom distortion is scattered and mostly focused on single traits and either symptom overreporting or underreporting. What is missing is an overarching and well-articulated theory that explains why and how various traits might be related to symptom distortion. What comes closest to this ideal is Suls and Howren’s (2012) framework that specifies how anxiety and depression fuel attentional bias and recall bias, respectively, thereby promoting momentary and retrospective escalation of symptom reports.

Third, almost without exception, the empirical literature in this domain relied on cross-sectional set-ups and so the data are predominantly correlational in nature, precluding any strong statements with regard to causality. It is curious that so few studies have adopted a test–retest procedure: demonstrating temporal stability would be a first and essential step for research endeavors that want to clarify trait contribution to symptom overreporting or underreporting. That such an approach is, in principle, feasible is illustrated by the study of An et al. (2012), who administered PVTs to a small student sample (N = 36) on two occasions and found that participants who failed at one point in time also tended to fail at a later time. Of course, this is only tentative evidence for a trait-like component, because there might be other, more parsimonious explanations for this temporal stability. For example, inattentive or random responding might spuriously create temporal stability and even inflate correlations between trait measures and indices of overreporting or underreporting (see also Bowling et al., 2016).

A fourth methodological consideration concerns the small sample sizes in this research domain. In order to obtain stable estimates of medium correlations of r = 0.3, sample sizes of N ≥ 212 are recommended, although strong or very large correlations are likely to be stable in relatively smaller samples (Schönbrodt & Perugini, 2013). Based on our scoping review, it is evident that many of the included studies do not meet this recommendation. Relatedly, structural equation modelling approaches in which the merits of several causal interpretations can be evaluated are conspicuously absent, precisely because such approaches require substantial sample sizes. It is obvious which rival interpretations would need to be tested in this field. For example, robust associations between depression and symptom overreporting might reflect depression leading to overreporting due to recall bias (Suls & Howren, 2012), they might reflect overreporting leading to inflated depression scores (Merten et al., 2019a), or they might reflect inattentive/careless responding leading to both symptom overreporting and escalated depression scores (Bowling et al., 2016).
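To make the sample-size concern concrete, the following minimal sketch computes approximate 95% confidence intervals for an observed correlation of r = 0.30 at several sample sizes. It relies on the standard Fisher z-transformation, which is not the simulation-based procedure used by Schönbrodt and Perugini (2013); it is offered only as a familiar, simpler way to show how imprecise a medium correlation is in small samples. The function name `fisher_ci` and the chosen sample sizes (taken from samples mentioned in this review) are illustrative assumptions.

```python
import math


def fisher_ci(r, n, z_crit=1.959964):
    """Approximate 95% CI for a Pearson correlation via Fisher's z-transform.

    Illustrative helper (hypothetical name); assumes bivariate normality.
    """
    if n <= 3:
        raise ValueError("n must be greater than 3")
    z = math.atanh(r)                # Fisher z-transform of r
    se = 1.0 / math.sqrt(n - 3)      # standard error of z
    lower = math.tanh(z - z_crit * se)
    upper = math.tanh(z + z_crit * se)
    return lower, upper


# Illustrative sample sizes drawn from studies discussed in the text
for n in (36, 93, 212, 703):
    lower, upper = fisher_ci(0.30, n)
    print(f"N = {n:>3}: 95% CI for r = 0.30 is [{lower:.2f}, {upper:.2f}]")
```

Under these assumptions, the interval for N = 36 includes zero, whereas the interval for N = 212 does not, which is in line with the point that small samples cannot pin down a correlation of medium size.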

In sum, there is no convincing evidence that traits are such powerful antecedents of symptom distortion that they may contribute to a complete failure on a validity test. As things stand, certain traits can best be viewed as modest concomitants of symptom overreporting and underreporting, concomitants whose causal status is presently unknown. Authors who failed to find clinically meaningful and statistically relevant associations between traits and symptom distortion echo our somewhat pessimistic summary of the findings in this field, leading some of them to conclude that symptom distortion is primarily a situationally specific phenomenon (e.g., Young et al., 2016).

Several limitations apply to our scoping review. First, we did not perform a systematic review or meta-analysis. The restrictions in number and methodology of empirical studies in this domain make a quantitative review impossible. As said, studies typically relied on cross-sectional designs and remained silent on confounders such as trait-context interactions and carry-over effects (see also Niesten et al., 2015; Rogers, 1990; Walters, 1988). Moreover, they typically did not adhere to a “multi-method approach” of including multiple validity tests, such that a “two-failure rule” could be applied to identify distorted symptom presentation (Victor et al., 2009). Second, we defined personality in the context of “pathological” traits. Other approaches would have been conceivable, including those focused on trait domains rather than facets (APA, 2013), models on maladaptive self-functioning and other-functioning (e.g., Livesley, 2003, 2006), and models of “normal” personality such as the Five Factor Model (FFM; e.g., McCrae & John, 1992). Indeed, symptom distortion might not be a function of personality pathology per se; rather, normal personality functioning might contribute to symptom distortions in the lower range of the distribution.

Third, our review was perhaps naïve to assume that traits function independently of context (e.g., student populations, bona fide or contaminated clinical or forensic patient samples, criminal offenders). Arguably, incentives inherent to these contexts may foment symptom distortion (e.g., Mittenberg et al., 2002; Walters, 1988; Young, 2015) and their interaction with traits largely remains an open question. What we do know is that the effect of incentives is not more profound in those with antisocial traits or disorders, individuals who may be generally indifferent about impressions in the absence of incentives (for reviews, see Niesten et al., 2015; Van Impelen et al., 2017). Moreover, administering lengthy and boring tests to people may evoke in some of them careless/inattentive responding, which in itself is linked to certain traits (e.g., low conscientiousness; Bowling et al., 2016). Another clue is that people with heightened anxiety in interpersonal contact are prone to attentional bias, which may involve inflated interpretations of momentary symptoms (Suls & Howren, 2012). We suspect that their coping (e.g., approach or avoidant behavior) and symptom reporting (e.g., overreporting versus underreporting) may well depend on the extent to which clinicians are seen as a signal of safety (e.g., support, alleviation) or danger (e.g., untrustworthy) in that very moment. Future research may wish to focus on such trait-context interactions.

Conclusions

Symptom overreporting and underreporting occur on a regular basis. Thus, identification of their antecedents might inform assessment and treatment. Are there good reasons to suspect that personality traits are associated with distorted symptom presentation? The current review tried to address that question. Our scoping review of the empirical literature shows that there is currently little reason to believe that certain traits are conducive to symptom distortion. To take this research domain one step further, the focus on single traits such as the AMPD trait facets should be abandoned. Further progress depends on conceptual housecleaning, on the one hand, and broad inductive studies, on the other hand.

As to the first priority, the transdiagnostic literature on broad dimensions of psychopathology, including the domain level of traits — such as neuroticism/negative emotionality (N/NE) — might inspire more articulated theories on how traits, biases, and distorted symptom reports are related to each other. One recent idea is that the experience of symptoms involves both a sensory-perceptual component and an affective-motivational component. In individuals high on N/NE, the perceptual detail input would be poor, whereas the affective component would be strong, making these individuals vulnerable to symptom reports not related to physiological dysfunction (Van den Bergh et al., 2021). A similar theory is that N/NE is related to a negative interpretation bias in the form of negative disambiguation, i.e., interpretation of ambiguous signals in a catastrophic way. Prospective studies provide evidence for this line of reasoning (e.g., Engelhard & van den Hout, 2007).

As to the second priority, it would be informative to conduct cross-sectional studies that include broad measures of multiple traits and symptom distortions (e.g., overreporting and underreporting) in large samples with low or high incentives. Conducting factor analysis on such data would provide insight into the most important trait correlates of distorted symptom presentation and how they might interact with incentives. Until such studies are done, there is little ground for the claim that people who present with distorted symptom reports do so because they possess certain personality traits.