Careless Responding to Reverse-Worded Items: Implications for Confirmatory Factor Analysis

Original Article

Many self-report measures include some items worded in the direction opposite to that of other items. These so-called reverse-worded (RW) items can reduce the reliability and validity of a scale, and frequently form a separate method factor that does not appear to be substantively meaningful. One possible explanation for factors defined by RW items is respondent carelessness. The purpose of the present study was to evaluate whether relatively few careless responders to RW items can influence confirmatory-factor-analysis model fit enough that researchers would likely reject a one-factor model for a unidimensional scale. Results based on simulations indicated that if at least about 10% of participants respond to RW items carelessly, researchers are likely to reject a one-factor model for a unidimensional scale.
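The mechanism described above can be illustrated with a minimal sketch. It is not the study's simulation design: the continuous indicators, the equal loadings of 0.7, and the choice to model carelessness by sign-flipping a respondent's RW item scores are all assumptions made for illustration, and the function names are hypothetical. The sketch fits no confirmatory factor model; it only shows how a small share of careless responders can weaken the correlations between RW and regular items while leaving correlations among the RW items intact, which is the pattern a one-factor model would struggle to reproduce.

```python
import numpy as np


def simulate_scale(n=5000, n_items=10, n_rw=5, careless_rate=0.10,
                   loading=0.7, seed=0):
    """Simulate reverse-scored item scores from a one-factor model.

    Assumptions (illustrative only): standardized continuous items,
    equal loadings, and careless respondents whose scores on the RW
    items are sign-flipped after reverse scoring.
    """
    rng = np.random.default_rng(seed)
    theta = rng.standard_normal(n)                  # latent trait
    noise = rng.standard_normal((n, n_items))       # unique parts
    x = loading * theta[:, None] + np.sqrt(1 - loading**2) * noise

    careless = rng.random(n) < careless_rate        # who is careless
    rw = np.arange(n_items) >= n_items - n_rw       # last n_rw items are RW
    x[np.ix_(careless, rw)] *= -1                   # flip their RW scores
    return x, rw


def mean_block_correlations(x, rw):
    """Return mean correlations within RW items and between RW and regular items."""
    r = np.corrcoef(x, rowvar=False)
    within = r[np.ix_(rw, rw)]
    off_diag = ~np.eye(within.shape[0], dtype=bool)  # drop the unit diagonal
    cross = r[np.ix_(rw, ~rw)]
    return within[off_diag].mean(), cross.mean()


if __name__ == "__main__":
    for rate in (0.0, 0.10):
        x, rw = simulate_scale(careless_rate=rate)
        within_rw, cross = mean_block_correlations(x, rw)
        print(f"careless={rate:.2f}  within-RW r={within_rw:.3f}  cross r={cross:.3f}")
```

Under these assumptions the expected cross-block correlation shrinks by a factor of (1 - 2p) for a careless proportion p, while the within-RW correlation is unchanged, so the RW items cluster more tightly with each other than with the remaining items.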


Keywords: reverse-scored; reverse-worded; item wording; factor analysis; careless responding


Copyright information

© Springer Science+Business Media, Inc. 2006

Authors and Affiliations

  1. Psychology Department, Washington University in St. Louis, St. Louis, USA
