
Judgmental relativism as a validity threat to standardized psychiatric rating scales

Journal of Psychopathology and Behavioral Assessment

Abstract

Judgmental relativism is a threat to the replicability and validity of measures of client behavior from direct rating scales whenever raters are exposed to different levels of client functioning, since the internal standards, or anchor points, used to judge dimensional continua may vary on the basis of prior experience. Traditional interrater reliability indexes fail to identify such effects. The influence of judgmental relativism on summated ratings from the Nurses' Observational Scale for Inpatient Evaluation (NOSIE-30) for 1040 adult mentally ill clients was examined with clinical staff raters from 24 treatment units in which the Time-Sample Behavioral Checklist (TSBC) provided full-week objective measures of actual client functioning via hourly direct observational coding (DOC). Regression analyses found that the same level of objective performance received higher or lower ratings across treatment units, depending on the raters' exposure to client groups that differed in level of functioning. Analyses of rating errors found that clients with better levels of functioning relative to others within treatment units were rated even higher than performance warranted. The operation of halo and contrast effects is explored and guidelines are provided for determining when judgmental relativism may produce or nullify “significant differences.” DOC assessments should be used instead of retrospective ratings to support most decisions in residential settings. Specific recommendations for the application of rating scales and improving data quality are provided.
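The between-unit and within-unit effects described in the abstract can be illustrated with a toy model. The sketch below is not the authors' regression analysis: the function `contrast_rating`, the weight `k`, and all scores are hypothetical, and the model simply assumes that a rater's anchor point is the mean functioning of the clients on that rater's own unit.

```python
from statistics import mean


def contrast_rating(objective, unit_scores, k=0.5):
    """Toy rater model (an assumption, not the study's method).

    The rater's anchor is the mean objective score on the unit.
    The rating is pushed away from that anchor in proportion to k,
    which mimics a contrast effect.
    """
    anchor = mean(unit_scores)
    return objective + k * (objective - anchor)


# Hypothetical objective functioning scores for two treatment units.
low_unit = [30.0, 40.0, 50.0]
high_unit = [70.0, 80.0, 90.0]

# The same objective performance judged by raters on each unit.
same_performance = 60.0
rating_in_low_unit = contrast_rating(same_performance, low_unit)
rating_in_high_unit = contrast_rating(same_performance, high_unit)

# Rating error (rating minus objective score) for clients within one unit.
errors_low_unit = [contrast_rating(s, low_unit) - s for s in low_unit]

print(rating_in_low_unit, rating_in_high_unit)
print(errors_low_unit)
```

Under these assumptions, identical performance is rated higher on the lower-functioning unit than on the higher-functioning one, and within a unit the rating error grows with a client's standing relative to the unit mean. An interrater reliability index computed within a single unit would not reveal either pattern, because raters on the same unit share the same anchor.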



Additional information

This study was the basis of a master's thesis at the University of Houston by Betty E. Rich under the direction of Gordon L. Paul and Marco J. Mariotto. Richard M. Rozelle, to whom appreciation is expressed for helpful comments, served on the examination committee. This study was partially supported by grants to Gordon L. Paul from the National Institute of Mental Health, Public Health Service (MH-15353; MH-25464); the Illinois Department of Mental Health and Developmental Disabilities; the Joyce Foundation; the MacArthur Foundation; the Owsley Foundation; the Cullen Foundation; and the Center for Public Policy, University of Houston.


Cite this article

Rich, B.E., Paul, G.L. & Mariotto, M.J. Judgmental relativism as a validity threat to standardized psychiatric rating scales. J Psychopathol Behav Assess 10, 241–257 (1988). https://doi.org/10.1007/BF00962548
