Reliability and Validity in Expert Judgment

  • Fergus Bolger
  • George Wright


As the world of human affairs becomes increasingly complex, our reliance upon expert judgment grows correspondingly. Technological, economic, legal, and political developments, to name but a few, place ever-larger information-processing demands upon us, thereby forcing specialization. A single person can no longer be a master of his or her whole field and, consequently, knowledge becomes distributed among a number of specialist experts.


Probability Estimate; Subjective Probability; Expert Judgment; Probability Assessment; Brier Score





Copyright information

© Plenum Press, New York 1992

Authors and Affiliations

  • Fergus Bolger, Department of Psychology, University College London, London, England
  • George Wright, Strathclyde Graduate Business School, Glasgow, Scotland
