Socially Desirable Responding on Self-Reports
Whenever individual differences are measured with self-reports, concerns arise over response biases, that is, habitual tendencies to respond to questions based on item properties such as keying direction and the desirability of the response options. Such tendencies may interfere with the ability of self-reports to capture the intended individual differences. Validity scales are available to measure such response biases as acquiescent responding, extreme responding, and random responding. But for various reasons, the greatest concern has been voiced over individual differences in socially desirable responding (SDR), that is, stylistic differences in the tendency to present oneself in a positive light.
SDR may occur as a response style, that is, a general tendency to give desirable answers on all self-reports. This consistent behavior may or may not have implications for broader individual difference variables (see below). Alternatively, SDR may appear as a response set, that is, a temporary motivation to appear positive. For example, applicants for the same job may differ in how desperate they are, and hence in how motivated they are to appear positive: Some may have been unemployed for 6 months whereas others may already have another job.
Problem with Confounding
The central concern is that SDR may act as a confound in a variety of self-reports. Individuals who are responding desirably on an SDR scale are likely to be responding desirably on other measures included in the same administration package. If a self-report variable is found to correlate with SDR, then two interpretations become available: The correlation may reflect genuine content, or it may reflect a desirability-driven style. If the researcher cannot distinguish between a personality variable (content) and SDR (a style), then the whole assessment endeavor seems compromised. Indeed, some researchers have been known to abandon a self-report measure at the appearance of this threat to validity.
In self-reports of agreeableness, for example, it may be difficult to determine whether a high score is capturing agreeableness or the tendency to give desirable answers. If the latter predominates, one must interpret the personality scores quite differently. Indeed, the implication is that high scorers are fakers – a quality far different from their claim to possess desirable personalities (Graziano and Tobin 2002).
On the other hand, many assessment specialists argue that concerns over SDR are overblown. Personality researchers, for example, point out that the validity of many instruments (Big Five measures, for example) is well-established – despite substantial correlations with SDR scales. Moreover, the validity of those instruments changes little when SDR is controlled.
Evolution of Measurement
Concerns over SDR were raised as soon as personality scales began to appear. The importance and complexity of this notion were highlighted during the 1940s with the inclusion of validity scales in the MMPI. Indeed, its scoring system automatically used SDR scales to correct scores on psychopathology scales (Meehl and Hathaway 1946). Even in early versions, Hans Eysenck included a Lie scale in his influential personality inventories (e.g., Eysenck and Eysenck 1975). The most extreme allegations about SDR appeared during the 1950s, when Allen Edwards (1957) and others went so far as to allege that the variance in self-reports of personality and psychopathology was almost entirely based on respondents’ differential concern with social desirability.
It was not until the 1960s that the Marlowe-Crowne scale was developed and popularized by Douglas Crowne and David Marlowe (1960). Its wide acceptance was based on the fact that its construct validity was supported by a substantial body of research (Crowne and Marlowe 1964). However, the fact that Marlowe-Crowne scores did not converge with scores on other SDR scales raised much confusion (Wiggins 1973). Eventually, consensus was reached when replicable structural analyses settled on two broad factors. Paulhus (1984) interpreted the two factors as self-deception, an unconscious self-favorability, and impression management, the intentional distortion of self-descriptions. The corresponding subscales of the Balanced Inventory of Desirable Responding (Paulhus 1991) have become the standard method for separating these two forms of SDR.
Content Versus Style
The value of SDR measures rests on the answer to a pivotal question: Can SDR measures distinguish content (true personality) from style (self-report bias)? Although some respondents score high on SDR because they are exaggerating their positive traits, other respondents may score high because they are honestly reporting possession of those positive traits (McCrae and Costa 1983). The former interpretation has led many researchers to assume that correlations with SDR scales invalidate personality measures. After all, it would be perilous for an employer to select personnel who show positive personality scores if those scores actually indicate a tendency to embellish.
Note that there is little evidence that SDR scales tap distinct personality variables. SDR factors do not emerge in factor analyses of individual personality inventories, no matter how comprehensive. When they do appear, it is at the meta-analytic level, in two-factor summaries of personality (Paulhus and Trapnell 2008). That pattern indicates that SDR operates at a broader level than common personality variables such as those composing the MMPI, Big Five, or 16PF.
Control of SDR
The impact of SDR can be minimized before it can occur by appropriate item design: Examples include neutral wording of item statements and use of forced-choice format where the desirability of the two responses is pre-equated. Where possible, test administration emphasizing confidentiality and anonymity can also reduce SDR. Methods that attempt to control SDR after it occurs are not recommended: Especially inappropriate is partialing SDR out of a personality variable. Indeed, several lines of research have shown that attempts to remove SDR from personality measures do not improve (and may actually reduce) the validity of these constructs: Metaphorically, this removal of overlap may “throw out the baby with the bathwater.”
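The "baby with the bathwater" point can be illustrated with a toy simulation. The sketch below is not drawn from any of the cited studies: all variable names, loadings, and noise levels are hypothetical, and it assumes that the SDR scale carries some true trait variance (content) rather than pure style. Under that assumption, removing SDR from a self-report score also removes valid variance, so the correlation with an external criterion drops.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def residualize(y, x):
    """Return the part of y not linearly predictable from x (OLS residuals)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return [(b - my) - slope * (a - mx) for a, b in zip(x, y)]


def simulate_partialing(n=5000, seed=0):
    """Compare criterion validity before and after partialing SDR.

    Hypothetical data-generating assumptions (not from the source):
    the self-report, the SDR scale, and the criterion all load on
    the same true trait, each with independent noise.
    """
    rng = random.Random(seed)
    trait = [rng.gauss(0, 1) for _ in range(n)]
    self_report = [t + rng.gauss(0, 0.5) for t in trait]
    sdr_scale = [0.6 * t + rng.gauss(0, 0.8) for t in trait]  # content-laden SDR
    criterion = [t + rng.gauss(0, 1) for t in trait]

    r_raw = pearson(self_report, criterion)
    # Semipartial correlation: SDR is removed from the predictor only.
    r_partialed = pearson(residualize(self_report, sdr_scale), criterion)
    return r_raw, r_partialed


if __name__ == "__main__":
    r_raw, r_partialed = simulate_partialing()
    print(f"validity, raw self-report:   {r_raw:.2f}")
    print(f"validity, SDR partialed out: {r_partialed:.2f}")
```

If the SDR scale were simulated as pure style, unrelated to the trait, partialing would leave validity essentially unchanged; the loss shown here follows entirely from the assumed overlap between SDR scores and true trait variance.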
The notion of using a within-subject design to capture SDR has been around for some time. Modern multivariate techniques now provide an alternative to crude partialing: The role of SDR in self-reports can be modeled in structural equation models using a hybrid of within- and between-subject analyses (Ziegler and Buehner 2009).
To avoid the inevitable confound of evaluation and personality content, several alternative measures have operationalized SDR with objective indicators. Among these is Ronald Holden’s laboratory method of comparing response times under faking and honest instructions (e.g., Holden and Kroner 1992). People tend to respond more slowly when told to fake a response in the opposite direction to their preference. The scientific advantage of this technique is the concrete nature of reaction times. The downside is the impracticality of collecting response times in most assessment situations.
Another behavioral method is Paulhus’s overclaiming technique, where respondents are given the opportunity to rate their familiarity with a variety of items, some of which do not exist (e.g., Paulhus et al. 2003). The tendency to claim familiarity with these foils can be considered a concrete indicator of SDR. By applying signal detection methods to the familiarity ratings, the overclaiming method permits the simultaneous scoring of accuracy and bias. It has been successfully applied to such domains as educational assessment, consumer knowledge, and cross-cultural differences. Because overclaiming responses are concrete behaviors, this method is not open to the same criticism as standard social desirability scales. Moreover, the measure can be included in questionnaire packages and scored without any external criterion.
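A minimal sketch of how signal detection scoring could separate accuracy from bias in overclaiming data follows. It uses the textbook indices d' (sensitivity) and c (criterion) and is not necessarily the scoring used by Paulhus et al. (2003). The function name, the 0 to 4 rating scale, the cutoff for counting a rating as a familiarity claim, and the log-linear correction for extreme rates are all assumptions made for illustration.

```python
from statistics import NormalDist


def overclaiming_scores(real_ratings, foil_ratings, threshold=1):
    """Return (d_prime, criterion) from familiarity ratings.

    real_ratings: ratings given to items that actually exist
    foil_ratings: ratings given to nonexistent items (foils)
    threshold:    assumed cutoff; a rating >= threshold counts as
                  a claim of familiarity
    """
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF

    hits = sum(r >= threshold for r in real_ratings)
    false_alarms = sum(r >= threshold for r in foil_ratings)

    # Log-linear correction (assumed here) keeps rates away from 0 and 1,
    # where the z-transform would be undefined.
    hit_rate = (hits + 0.5) / (len(real_ratings) + 1)
    fa_rate = (false_alarms + 0.5) / (len(foil_ratings) + 1)

    d_prime = z(hit_rate) - z(fa_rate)            # accuracy
    criterion = -0.5 * (z(hit_rate) + z(fa_rate))  # bias; lower = more claiming
    return d_prime, criterion


if __name__ == "__main__":
    # Hypothetical respondents rating 8 real items and 4 foils.
    careful = overclaiming_scores([3, 2, 0, 4, 1, 0, 2, 3], [0, 0, 0, 1])
    overclaimer = overclaiming_scores([3, 2, 1, 4, 1, 2, 2, 3], [2, 1, 0, 3])
    print("careful     d'=%.2f  c=%.2f" % careful)
    print("overclaimer d'=%.2f  c=%.2f" % overclaimer)
```

In this formulation, a respondent who claims familiarity with many foils receives a lower (more liberal) criterion value, whereas accuracy depends on how well claims discriminate real items from foils. Both indices come from the same set of ratings, which is why no external criterion is needed.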
Instead of the original interpretation (self-deception and impression management), evidence has accumulated that the two large SDR factors differ with respect to content (Paulhus 2002). The distinction maps onto the two fundamental personality constellations commonly labeled agency and communion: Agency refers to achievement striving and differentiating oneself from others, whereas communion refers to an integration with and concern for others (Bakan 1966). Holden and Fekken (1989) labeled the two factors Self Capability and Interpersonal Sensitivity, virtual synonyms for agency and communion. Their corresponding evaluative biases have been labeled egoistic versus moralistic (Vecchione and Alessandri 2013). Corresponding measures of agentic and communal impression management are now available (Blasberg et al. 2014).
Despite 60 years of research, many researchers remain concerned about the impact of SDR on self-report measures of personality and psychopathology. Its implications are especially important in the fields of personnel selection and clinical diagnosis. The choice to interpret self-report scores as indicating personality rather than SDR can have far-reaching consequences. One example is the dramatically different interpretations of the Impression Management (IM) scale found in two recent studies: Whereas Uziel (2014) found evidence for prosocial attributes, Davis et al. (2012) found antisocial correlates of high IM scores. The diversity of current perspectives is exemplified in the edited volume by Ziegler et al. (2012).
It is important to note that most SDR scales were designed to capture only positive elements of impression management. The standard selection methodology (i.e., fake good) depends solely on upward distortion. Hence items with low base-rates under honest response conditions are more likely to be selected (Wiggins 1959). Low scores on SDR scales are assumed to indicate a respondent free of bias. A qualitatively different type of validity scale is required to tap negative response biases such as malingering (see Rogers et al. 1991).
A continuing concern with self-report measures is a response bias called socially desirable responding (SDR), that is, differences in people’s tendency to exaggerate the positivity of their characteristics. Of the roster of response biases, SDR has drawn the most attention because it confounds the interpretation of self-reported personality, psychopathology, attitudes, values, etc. A variety of techniques have been developed to address these concerns. Recommended are those designed to minimize SDR before it can occur. Post hoc attempts to control SDR, for example, partialing SDR from scores on other individual difference variables, should be discouraged.
Instead, correlations with personality scales should be viewed as informational rather than evidence of contamination. Because SDR measures differ in their emphasis on agentic desirability versus communal desirability, the pattern of correlations can be informative with regard to evaluative implications of a self-report variable. Rather than evidence for corrupted measurement, correlations with SDR may actually help clarify the psychological processes underlying self-reports.
- Bakan, D. (1966). The duality of human existence: Isolation and communion in Western man. Boston: Beacon Press.
- Crowne, D. P., & Marlowe, D. (1964). The approval motive. New York: Wiley.
- Edwards, A. L. (1957). The social desirability variable in personality assessment and research. New York: Dryden Press.
- Eysenck, H. J., & Eysenck, S. B. G. (1975). Manual of the Eysenck Personality Questionnaire (EPQ). London: Stoughton Educational.
- Paulhus, D. L. (2002). Socially desirable responding: The evolution of a construct. In H. I. Braun, D. N. Jackson, & D. E. Wiley (Eds.), The role of constructs in psychological and educational measurement (pp. 49–69). Mahwah: Erlbaum.
- Paulhus, D. L., & Trapnell, P. D. (2008). Self-presentation: An agency-communion framework. In O. P. John, R. W. Robins, & L. A. Pervin (Eds.), Handbook of personality psychology (pp. 492–517). New York: Guilford.
- Wiggins, J. S. (1959). Interrelationships among MMPI measures of dissimulation under standard and social desirability instructions. Journal of Consulting Psychology, 23, 419–427.
- Wiggins, J. S. (1973). Personality and prediction: Principles of personality assessment. Reading, MA: Addison-Wesley.
- Ziegler, M., MacCann, C., & Roberts, R. D. (2012). New perspectives on faking in personality assessment. New York: Oxford University Press.