Reliability and expected loss: A unifying principle
We provide a unified theoretical basis on which measures of data reliability may be derived or evaluated, for both quantitative and qualitative data. This approach evaluates reliability as the “proportional reduction in loss” (PRL) that is attained in a sample by an optimal estimator. The resulting measure lies between 0 and 1, is linearly related to expected loss, and provides a direct way of contrasting the measured reliability in the sample with the least reliable and most reliable data-generating cases. The PRL measure is a generalization of many of the commonly used reliability measures.
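One way to read the properties stated above (bounded by 0 and 1, linear in expected loss, anchored at the least and most reliable cases) is as a rescaled expected loss. The form below is only a sketch consistent with that description, not a formula quoted from the paper; the symbols $E[L]$, $E[L_{\text{worst}}]$ and $E[L_{\text{best}}]$ are assumed names for the expected loss in the sample, in the least reliable case and in the most reliable case.

```latex
\mathrm{PRL}
  = \frac{E[L_{\text{worst}}] - E[L]}
         {E[L_{\text{worst}}] - E[L_{\text{best}}]},
\qquad
E[L_{\text{best}}] \le E[L] \le E[L_{\text{worst}}]
```

Under these assumptions the ratio is 0 when the sample is no better than the least reliable case, 1 when it matches the most reliable case, and decreases linearly as expected loss grows.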
We show how the quantitative measures from generalizability theory can be derived as PRL measures (including Cronbach's alpha and measures proposed by Winer). For categorical data, we develop a new measure for the general case in which each of N judges assigns a subject to one of K categories and show that it is equivalent to a measure proposed by Perreault and Leigh for the case where N is 2.
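As a concrete illustration of two of the measures named above, the following sketch computes Cronbach's alpha for quantitative scores and the Perreault and Leigh index for two judges. Both use the commonly cited textbook forms of these coefficients rather than the PRL derivation itself, and the function names, argument layout and example data are hypothetical.

```python
import math
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha from rows of subject scores.

    scores: list of rows, one per subject, each holding k item (or judge)
    scores. Uses sample variances (n - 1 denominator) throughout.
    """
    k = len(scores[0])
    # Variance of each item (column) across subjects.
    item_var_sum = sum(variance(col) for col in zip(*scores))
    # Variance of the per-subject total score.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_var_sum / total_var)


def perreault_leigh_index(judge_a, judge_b, n_categories):
    """Perreault and Leigh reliability index for two judges, K categories.

    Commonly cited form: sqrt((F_o/N - 1/K) * K/(K - 1)), set to 0 when
    the observed agreement rate F_o/N does not exceed 1/K.
    """
    n = len(judge_a)
    agreement = sum(a == b for a, b in zip(judge_a, judge_b)) / n
    chance = 1 / n_categories
    if agreement <= chance:
        return 0.0
    return math.sqrt((agreement - chance) * n_categories / (n_categories - 1))


# Hypothetical data: two judges coding eight subjects into three categories.
judge_a = [0, 1, 2, 0, 1, 2, 0, 1]
judge_b = [0, 1, 2, 0, 1, 2, 1, 0]
index = perreault_leigh_index(judge_a, judge_b, n_categories=3)

# Hypothetical data: three subjects scored identically on two items.
alpha = cronbach_alpha([[1, 1], [2, 2], [3, 3]])
```

Both functions return 1 for perfectly consistent data, which matches the upper bound described for the PRL measure.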
Key words: alpha, kappa, agreement, intercoder reliability, decision rule, generalizability theory, test theory
- Brennan, R. L. (1983). Elements of generalizability theory. Iowa City: American College Testing Program.
- Brennan, R. L., & Prediger, D. J. (1981). Coefficient kappa: Some uses, misuses, and alternatives. Educational and Psychological Measurement, 41, 687–699.
- Cohen, J. (1960). A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 20, 37–46.
- Cohen, J. (1968). Weighted kappa: Nominal scale agreement with provision for scaled disagreement or partial credit. Psychological Bulletin, 70, 213–220.
- Conger, A. J. (1980). Integration and generalization of kappas for multiple raters. Psychological Bulletin, 88, 322–328.
- Costner, H. L. (1965). Criteria for measures of association. American Sociological Review, 30, 341–353.
- Cronbach, L. J. (1951). Coefficient alpha and the internal structure of tests. Psychometrika, 16, 297–334.
- Cronbach, L. J., Gleser, G. C., Nanda, H., & Rajaratnam, N. (1972). The dependability of behavioral measurements: Theory of generalizability for scores and profiles. New York: John Wiley & Sons.
- Hildebrand, D. K., Laing, J. D., & Rosenthal, H. (1977). Prediction analysis of cross classifications. New York: John Wiley & Sons.
- Hughes, M. A., & Garrett, D. E. (1990). Intercoder reliability estimation approaches in marketing: A generalizability theory framework for quantitative data. Journal of Marketing Research, 27, 185–195.
- Kassarjian, H. H. (1977). Content analysis in consumer research. Journal of Consumer Research, 4, 8–18.
- Krippendorff, K. (1980). Content analysis: An introduction to its methodology. Beverly Hills, CA: Sage Publications.
- Peter, J. P. (1977). Reliability, generalizability, and consumer behavior. In W. D. Perreault (Ed.), Advances in consumer research (Vol. 4, pp. 394–400). Atlanta: Association for Consumer Research.
- Perreault, W. D., Jr., & Leigh, L. E. (1989). Reliability of nominal data based on qualitative judgments. Journal of Marketing Research, 26, 135–148.
- Schouten, H. J. A. (1982). Measuring pairwise agreement among many observers, II: Some improvements and additions. Biometrical Journal, 24, 431–435.
- Schouten, H. J. A. (1986). Nominal scale agreement among observers. Psychometrika, 51, 453–466.
- Winer, B. J. (1971). Statistical principles in experimental design. New York: McGraw-Hill.