Abstract
Who is good at prediction? Addressing this question is key to recruiting and cultivating accurate crowds and effectively aggregating their judgments. Recent research on superforecasting has demonstrated the importance of individual, persistent skill in crowd prediction. This chapter takes stock of skill identification measures in probability estimation tasks and complements the review with original analyses that compare such measures directly within the same dataset. We classify all measures into five broad categories: (1) accuracy-related measures, such as proper scores, model-based estimates of accuracy, and excess volatility scores; (2) intersubjective measures, including proxy, surrogate, and similarity scores; (3) forecasting behaviors, including activity, belief updating, extremity, coherence, and linguistic properties of rationales; (4) dispositional measures of fluid intelligence, cognitive reflection, numeracy, personality, and thinking styles; and (5) measures of expertise, including demonstrated knowledge, confidence calibration, and biographical and self-rated expertise. Among non-accuracy-related measures, we report a median correlation with accuracy of r = 0.20. In the absence of accuracy data, we find that intersubjective and behavioral measures are most strongly correlated with forecasting accuracy. These results hold in a LASSO machine-learning model with automated variable selection. Two focal applications provide context for these assessments: long-term, existential risk prediction and corporate forecasting tournaments.
Notes
- 1.
We refer to measures correlating with skill as predictors or correlates. To avoid confusion, we refer to individuals engaged in forecasting tasks as forecasters.
- 2.
Normalization does not by itself account for question difficulty; it only transforms the distribution. When used as criterion variables, normalized scores are therefore standardized: \( {SMNB}_{f,q}=\frac{{MNB}_{f,q}-{\overline{MNB}}_q}{SD\left({MNB}_{f,q}\right)} \).
- 3.
The authors were members of the SAGE team in the Hybrid Forecasting Competition. Linguistic properties of rationales were among the features used in the aggregation weighting algorithms. The SAGE team achieved the highest accuracy in 2020, the last season of the tournament.
- 4.
We do not offer complete coverage of intersubjective measures, including surrogate scores and similarity measures, but given our current results, further empirical investigation seems worthwhile.
- 5.
We have notified Epstein of this; in response, he shared plans to revise the sentence in future editions of Range.
- 6.
Readers who have been exposed to research on forecaster skill identification through general media or popular science outlets may find some of our findings surprising. For example, a recent, admittedly non-scientific poll of 30 Twitter users conducted by one of us (Atanasov) revealed that a plurality (40%) of respondents thought actively open-minded thinking was more strongly correlated with accuracy than update magnitude, fluid intelligence, or subject-matter knowledge scores. Fewer than 20% correctly guessed that the closest correlate of accuracy was update magnitude.
References
Arthur, W., Jr., Tubre, T. C., Paul, D. S., & Sanchez-Ku, M. L. (1999). College-sample psychometric and normative data on a short form of the raven advanced progressive matrices test. Journal of Psychoeducational Assessment, 17(4), 354–361.
Aspinall, W. (2010). A route to more tractable expert advice. Nature, 463(7279), 294–295.
Atanasov, P., Rescober, P., Stone, E., Servan-Schreiber, E., Tetlock, P., Ungar, L., & Mellers, B. (2017). Distilling the wisdom of crowds: Prediction markets vs. prediction polls. Management Science, 63(3), 691–706.
Atanasov, P., Diamantaras, A., MacPherson, A., Vinarov, E., Benjamin, D. M., Shrier, I., Paul, F., Dirnagl, U., & Kimmelman, J. (2020a). Wisdom of the expert crowd prediction of response for 3 neurology randomized trials. Neurology, 95(5), e488–e498.
Atanasov, P., Witkowski, J., Ungar, L., Mellers, B., & Tetlock, P. (2020b). Small steps to accuracy: Incremental belief updaters are better forecasters. Organizational Behavior and Human Decision Processes, 160, 19–35.
Atanasov, P., Joseph, R., Feijoo, F., Marshall, M., & Siddiqui, S. (2022a). Human forest vs. random forest in time-sensitive Covid-19 clinical trial prediction. Working Paper.
Atanasov, P., Witkowski, J., Mellers, B., & Tetlock, P. (2022b) Crowdsourced prediction systems: Markets, polls, and elite forecasters. Working Paper.
Augenblick, N., & Rabin, M. (2021). Belief movement, uncertainty reduction, and rational updating. The Quarterly Journal of Economics, 136(2), 933–985.
Bandalos, D. L. (2018). Measurement theory and applications for the social sciences. Guilford Publications.
Baron, J. (2000). Thinking and deciding. Cambridge University Press.
Baron, J., Scott, S., Fincher, K., & Metz, S. E. (2015). Why does the cognitive reflection test (sometimes) predict utilitarian moral judgment (and other things)? Journal of Applied Research in Memory and Cognition, 4(3), 265–284.
Barrick, M. R., & Mount, M. K. (1991). The big five personality dimensions and job performance: A meta-analysis. Personnel Psychology, 44(1), 1–26.
Beard, S., Rowe, T., & Fox, J. (2020). An analysis and evaluation of methods currently used to quantify the likelihood of existential hazards. Futures, 115, 102469.
Benjamin, D., Mandel, D. R., & Kimmelman, J. (2017). Can cancer researchers accurately judge whether preclinical reports will reproduce? PLoS Biology, 15(6), e2002212.
Bennett, S., & Steyvers, M. (2022). Leveraging metacognitive ability to improve crowd accuracy via impossible questions. Decision, 9(1), 60–73.
Bland, J. M., & Altman, D. G. (2011). Correlation in restricted ranges of data. BMJ: British Medical Journal, 342.
Blattberg, R. C., & Hoch, S. J. (1990). Database models and managerial intuition: 50% model + 50% manager. Management Science, 36(8), 887–899.
Bo, Y. E., Budescu, D. V., Lewis, C., Tetlock, P. E., & Mellers, B. (2017). An IRT forecasting model: Linking proper scoring rules to item response theory. Judgment & Decision Making, 12(2), 90–103.
Bors, D. A., & Stokes, T. L. (1998). Raven’s advanced progressive matrices: Norms for first-year university students and the development of a short form. Educational and Psychological Measurement, 58(3), 382–398.
Brier, G. W. (1950). Verification of forecasts expressed in terms of probability. Monthly Weather Review, 78(1), 1–3.
Broomell, S. B., & Budescu, D. V. (2009). Why are experts correlated? Decomposing correlations between judges. Psychometrika, 74(3), 531–553.
Bruine de Bruin, W., Parker, A. M., & Fischhoff, B. (2007). Individual differences in adult decision-making competence. Journal of Personality and Social Psychology, 92(5), 938–956.
Budescu, D. V., Weinberg, S., & Wallsten, T. S. (1988). Decisions based on numerically and verbally expressed uncertainties. Journal of Experimental Psychology: Human Perception and Performance, 14(2), 281–294.
Budescu, D. V., & Chen, E. (2015). Identifying expertise to extract the wisdom of crowds. Management Science, 61(2), 267–280.
Budescu, D. V., Himmelstein, M., & Ho, E. (2021, October). Boosting the wisdom of crowds with social forecasts and coherence measures. Presented at the annual meeting of the Society of Multivariate Experimental Psychology (SMEP).
Burgman, M. A., McBride, M., Ashton, R., Speirs-Bridge, A., Flander, L., Wintle, B., Fidler, F., Rumpff, L., & Twardy, C. (2011). Expert status and performance. PLoS One, 6(7), e22998.
Cacioppo, J. T., & Petty, R. E. (1982). The need for cognition. Journal of Personality and Social Psychology, 42(1), 116–131.
Chang, W., Atanasov, P., Patil, S., Mellers, B., & Tetlock, P. (2017). Accountability and adaptive performance: The long-term view. Judgment and Decision making, 12(6), 610–626.
Chen, E., Budescu, D. V., Lakshmikanth, S. K., Mellers, B. A., & Tetlock, P. E. (2016). Validating the contribution-weighted model: Robustness and cost-benefit analyses. Decision Analysis, 13(2), 128–152.
Cokely, E. T., Galesic, M., Schulz, E., Ghazal, S., & Garcia-Retamero, R. (2012). Measuring risk literacy: The Berlin numeracy test. Judgment and Decision making, 7(1), 25–47.
Collins, R. N., Mandel, D. R., Karvetski, C. W., Wu, C. M., & Nelson, J. D. (2021). The wisdom of the coherent: Improving correspondence with coherence-weighted aggregation. Preprint available at PsyArXiv. Retrieved from https://psyarxiv.com/fmnty/
Collins, R., Mandel, D., & Budescu, D. (2022). Performance-weighted aggregation: Ferreting out wisdom within the crowd. In M. Seifert (Ed.), Judgment in predictive analytics. Springer [Reference to be updated with page numbers].
Cooke, R. (1991). Experts in uncertainty: Opinion and subjective probability in science. Oxford University Press.
Costa, P. T., Jr., & McCrae, R. R. (2008). The revised neo personality inventory (NEO-PI-R). Sage.
Cowgill, B., & Zitzewitz, E. (2015). Corporate prediction markets: Evidence from Google, Ford, and Firm X. The Review of Economic Studies, 82(4), 1309–1341.
Dana, J., Atanasov, P., Tetlock, P., & Mellers, B. (2019). Are markets more accurate than polls? The surprising informational value of “just asking”. Judgment and Decision making, 14(2), 135–147.
Davis-Stober, C. P., Budescu, D. V., Dana, J., & Broomell, S. B. (2014). When is a crowd wise? Decision, 1(2), 79–101.
Dieckmann, N. F., Gregory, R., Peters, E., & Hartman, R. (2017). Seeing what you want to see: How imprecise uncertainty ranges enhance motivated reasoning. Risk Analysis, 37(3), 471–486.
Embretson, S. E., & Reise, S. P. (2013). Item response theory. Psychology Press.
Epstein, D. (2019). Range: How generalists triumph in a specialized world. Pan Macmillan.
Fan, Y., Budescu, D. V., Mandel, D., & Himmelstein, M. (2019). Improving accuracy by coherence weighting of direct and ratio probability judgments. Decision Analysis, 16, 197–217.
Frederick, S. (2005). Cognitive reflection and decision making. Journal of Economic Perspectives, 19(4), 25–42.
Galton, F. (1907). Vox populi (the wisdom of crowds). Nature, 75(7), 450–451.
Gneiting, T., & Raftery, A. E. (2007). Strictly proper scoring rules, prediction, and estimation. Journal of the American Statistical Association, 102(477), 359–378.
Goldstein, D. G., McAfee, R. P., & Suri, S. (2014, June). The wisdom of smaller, smarter crowds. In Proceedings of the Fifteenth ACM Conference on Economics and Computation (pp. 471–488).
Good, I. J. (1952). Rational decisions. Journal of the Royal Statistical Society: Series B (Methodological), 14(1), 107–114.
Hanea, A. D., Wilkinson, D., McBride, M., Lyon, A., van Ravenzwaaij, D., Singleton Thorn, F., Gray, C., Mandel, D. R., Willcox, A., Gould, E., Smith, E., Mody, F., Bush, M., Fidler, F., Fraser, H., & Wintle, B. (2021). Mathematically aggregating experts’ predictions of possible futures. PLoS One, 16(9), e0256919. https://doi.org/10.1371/journal.pone.0256919
Haran, U., Ritov, I., & Mellers, B. A. (2013). The role of actively open-minded thinking in information acquisition, accuracy, and calibration. Judgment and Decision making, 8(3), 188–201.
Hastie, T., Qian, J., & Tay, K. (2021). An introduction to glmnet. CRAN R Repository.
Himmelstein, M., Atanasov, P., & Budescu, D. V. (2021). Forecasting forecaster accuracy: Contributions of past performance and individual differences. Judgment & Decision Making, 16(2), 323–362.
Himmelstein, M., Budescu, D. V., & Han, Y. (2023a). The wisdom of timely crowds. In M. Seifert (Ed.), Judgment in predictive analytics. Springer.
Himmelstein, M., Budescu, D. V., & Ho, E. (2023b). The wisdom of many in few: Finding individuals who are as wise as the crowd. Journal of Experimental Psychology: General. Advance online publication.
Ho, E. H. (2020, June). Developing and validating a method of coherence-based judgment aggregation. Unpublished PhD Dissertation. Fordham University, Bronx NY.
Horowitz, M., Stewart, B. M., Tingley, D., Bishop, M., Resnick Samotin, L., Roberts, M., Chang, W., Mellers, B., & Tetlock, P. (2019). What makes foreign policy teams tick: Explaining variation in group performance at geopolitical forecasting. The Journal of Politics, 81(4), 1388–1404.
Joseph, R., & Atanasov, P. (2019). Predictive training and accuracy: Self-selection and causal factors. Working Paper, Presented at Collective Intelligence 2019.
Karger, E., Monrad, J., Mellers, B., & Tetlock, P. (2021). Reciprocal scoring: A method for forecasting unanswerable questions. Retrieved from SSRN
Karger, J., Atanasov, P., & Tetlock, P. (2022). Improving judgments of existential risk: Better forecasts, questions, explanations, policies. SSRN Working Paper.
Karvetski, C. W., Olson, K. C., Mandel, D. R., & Twardy, C. R. (2013). Probabilistic coherence weighting for optimizing expert forecasts. Decision Analysis, 10(4), 305–326.
Karvetski, C. W., Meinel, C., Maxwell, D. T., Lu, Y., Mellers, B. A., & Tetlock, P. E. (2021). What do forecasting rationales reveal about thinking patterns of top geopolitical forecasters? International Journal of Forecasting, 38(2), 688–704.
Kurvers, R. H., Herzog, S. M., Hertwig, R., Krause, J., Moussaid, M., Argenziano, G., Zalaudek, I., Carney, P. A., & Wolf, M. (2019). How to detect high-performing individuals and groups: Decision similarity predicts accuracy. Science Advances, 5(11), eaaw9011.
Lipkus, I. M., Samsa, G., & Rimer, B. K. (2001). General performance on a numeracy scale among highly educated samples. Medical Decision Making, 21(1), 37–44.
Liu, Y., Wang, J., & Chen, Y. (2020, July). Surrogate scoring rules. In Proceedings of the 21st ACM Conference on Economics and Computation (pp. 853–871).
Mannes, A. E., Soll, J. B., & Larrick, R. P. (2014). The wisdom of select crowds. Journal of Personality and Social Psychology, 107(2), 276.
Matzen, L. E., Benz, Z. O., Dixon, K. R., Posey, J., Kroger, J. K., & Speed, A. E. (2010). Recreating Raven’s: Software for systematically generating large numbers of Raven-like matrix problems with normed properties. Behavior Research Methods, 42(2), 525–541.
Mauksch, S., Heiko, A., & Gordon, T. J. (2020). Who is an expert for foresight? A review of identification methods. Technological Forecasting and Social Change, 154, 119982.
McAndrew, T., Cambeiro, J., & Besiroglu, T. (2022). Aggregating human judgment probabilistic predictions of the safety, efficacy, and timing of a COVID-19 vaccine. Vaccine, 40(15), 2331–2341.
Mellers, B., Ungar, L., Baron, J., Ramos, J., Gurcay, B., Fincher, K., Scott, S. E., Moore, D., Atanasov, P., Swift, S. A., Murray, T., Stone, E., & Tetlock, P. E. (2014). Psychological strategies for winning a geopolitical forecasting tournament. Psychological Science, 25(5), 1106–1115.
Mellers, B., Stone, E., Atanasov, P., Rohrbaugh, N., Metz, S. E., Ungar, L., Bishop, M. M., Horowitz, M., Merkle, E., & Tetlock, P. (2015a). The psychology of intelligence analysis: Drivers of prediction accuracy in world politics. Journal of Experimental Psychology: Applied, 21(1), 1.
Mellers, B., Stone, E., Murray, T., Minster, A., Rohrbaugh, N., Bishop, M., Chen, E., Baker, J., Hou, Y., Horowitz, M., Ungar, L., & Tetlock, P. (2015b). Identifying and cultivating superforecasters as a method of improving probabilistic predictions. Perspectives on Psychological Science, 10(3), 267–281.
Mellers, B. A., Baker, J. D., Chen, E., Mandel, D. R., & Tetlock, P. E. (2017). How generalizable is good judgment? A multitask, multi-benchmark study. Judgment and Decision making, 12(4), 369–381.
Merkle, E. C., Steyvers, M., Mellers, B., & Tetlock, P. E. (2016). Item response models of probability judgments: Application to a geopolitical forecasting tournament. Decision, 3(1), 1–19.
Milkman, K. L., Gandhi, L., Patel, M. S., Graci, H. N., Gromet, D. M., Ho, H., Kay, J. S., Lee, T. W., Rothschild, J., Bogard, J. E., Brody, I., Chabris, C. F., & Chang, E. (2022). A 680,000-person megastudy of nudges to encourage vaccination in pharmacies. Proceedings of the National Academy of Sciences, 119(6), e2115126119.
Miller, N., Resnick, P., & Zeckhauser, R. (2005). Eliciting informative feedback: The peer-prediction method. Management Science, 51(9), 1359–1373.
Morstatter, F., Galstyan, A., Satyukov, G., Benjamin, D., Abeliuk, A., Mirtaheri, M., et al. (2019). SAGE: A hybrid geopolitical event forecasting system. IJCAI, 1, 6557–6559.
Murphy, A. H., & Winkler, R. L. (1987). A general framework for forecast verification. Monthly Weather Review, 115(7), 1330–1338.
Palley, A. B., & Soll, J. B. (2019). Extracting the wisdom of crowds when information is shared. Management Science, 65(5), 2291–2309.
Peters, E., Västfjäll, D., Slovic, P., Mertz, C. K., Mazzocco, K., & Dickert, S. (2006). Numeracy and decision making. Psychological Science, 17(5), 407–413.
Predd, J. B., Osherson, D. N., Kulkarni, S. R., & Poor, H. V. (2008). Aggregating probabilistic forecasts from incoherent and abstaining experts. Decision Analysis, 5(4), 177–189.
Prelec, D. (2004). A Bayesian truth serum for subjective data. Science, 306(5695), 462–466.
Samejima, F. (1969). Estimation of latent ability using a response pattern of graded scores. Psychometrika Monograph, 34, 1–97.
Seifert, M., Siemsen, E., Hadida, A. L., & Eisingerich, A. B. (2015). Effective judgmental forecasting in the context of fashion products. Journal of Operations Management, 36, 33–45.
Sell, T. K., Warmbrod, K. L., Watson, C., Trotochaud, M., Martin, E., Ravi, S. J., Balick, M., & Servan-Schreiber, E. (2021). Using prediction polling to harness collective intelligence for disease forecasting. BMC Public Health, 21(1), 1–9.
Shipley, W. C., Gruber, C. P., Martin, T. A., & Klein, A. M. (2009). Shipley-2 manual. Western Psychological Services.
Stanovich, K. E., & West, R. F. (1997). Reasoning independently of prior belief and individual differences in actively open-minded thinking. Journal of Educational Psychology, 89(2), 342–357.
Stewart, T. R., Roebber, P. J., & Bosart, L. F. (1997). The importance of the task in analyzing expert judgment. Organizational Behavior and Human Decision Processes, 69(3), 205–219.
Suedfeld, P., & Tetlock, P. (1977). Integrative complexity of communications in international crises. Journal of Conflict Resolution, 21(1), 169–184.
Tannenbaum, D., Fox, C. R., & Ülkümen, G. (2017). Judgment extremity and accuracy under epistemic vs. aleatory uncertainty. Management Science, 63(2), 497–518.
Tetlock, P. E. (2005). Expert political judgment. Princeton University Press.
Tetlock, P. E., & Gardner, D. (2016). Superforecasting: The art and science of prediction. Random House.
Toplak, M. E., West, R. F., & Stanovich, K. E. (2014). Assessing miserly information processing: An expansion of the cognitive reflection test. Thinking & Reasoning, 20(2), 147–168.
Tsai, J., & Kirlik, A. (2012). Coherence and correspondence competence: Implications for elicitation and aggregation of probabilistic forecasts of world events. In Proceedings of the Human Factors and Ergonomics Society Annual Meeting (Vol. 56, pp. 313–317). Sage.
Wallsten, T. S., Budescu, D. V., & Zwick, R. (1993). Comparing the calibration and coherence of numerical and verbal probability judgments. Management Science, 39(2), 176–190.
Webster, D. M., & Kruglanski, A. W. (1994). Individual differences in need for cognitive closure. Journal of Personality and Social Psychology, 67(6), 1049–1062.
Witkowski, J., & Parkes, D. (2012). A robust bayesian truth serum for small populations. Proceedings of the AAAI Conference on Artificial Intelligence, 26(1), 1492–1498.
Witkowski, J., Atanasov, P., Ungar, L., & Krause, A. (2017). Proper proxy scoring rules. In Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence (AAAI-17).
Zong, S., Ritter, A., & Hovy, E. (2020). Measuring forecasting skill from text. arXiv preprint arXiv:2006.07425.
Acknowledgments
We thank Matthias Seifert, David Budescu, David Mandel, Stefan Herzog and Philip Tetlock for helpful suggestions. All remaining errors are our own. No project-specific funding was used for the completion of this chapter.
Appendix: Methodological Details of Selected Predictors
1.1 Item Response Theory Models
In forecasting, one such confounder is the timing of forecasts. In forecasting tournaments, forecasters make many forecasts about the same problems at various time points. Those who forecast problems closer to their resolution date have an accuracy advantage, which may be important to account for when assessing their talent level (for more detail, see Himmelstein et al., this volume). IRT models can be extended so that their diagnostic properties change relative to the time point at which a forecaster makes their forecast. One such model is given below (Himmelstein et al., 2021; Merkle et al., 2016).
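In sketch form (using our own notation, patterned on the parameter description below rather than the exact published parameterization), the model relates a suitably transformed accuracy score \( S_{f,q,d} \) of forecaster \( f \) on question \( q \) at forecast date \( d \) to a time-varying item difficulty:

\( S_{f,q,d}=\lambda_q\left[\theta_f-b_q\left(t_{f,q,d}\right)\right]+\varepsilon_{f,q,d}, \qquad b_q\left(t_{f,q,d}\right)=b_{0,q}+\left(b_{1,q}-b_{0,q}\right)e^{-b_2\,t_{f,q,d}} \)

where \( \varepsilon_{f,q,d} \) is a residual term.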
The three b parameters represent how an item's difficulty changes as time passes: \( b_{0,q} \) represents an item's maximum difficulty (as time to resolution goes to infinity), \( b_{1,q} \) an item's minimum difficulty (immediately prior to resolution), and \( b_2 \) the shape of the curve between \( b_{0,q} \) and \( b_{1,q} \) as a function of how much time remains in the question at the time of the forecast (\( t_{f,q,d} \)). The other two parameters represent how well an item discriminates between forecasters of different skill levels (\( \lambda_q \)) and how skilled the individual forecasters are (\( \theta_f \)). As the estimate of forecaster skill, talent spotters will typically be most interested in the \( \theta_f \) parameter, which is conventionally scaled to a standard normal distribution, \( \theta_f\sim N(0,1) \), with a score of 0 indicating an average forecaster, −1 a forecaster 1 SD below average, and 1 a forecaster 1 SD above average.
One potential problem with this model is that, in some cases, the distribution of Brier scores is not well behaved. This typically occurs in datasets with many binary questions, where the Brier score is a direct function of the probability assigned to the correct option. In such cases, the distribution of Brier scores can be multimodal, because forecasters tend to enter many extreme and round-number probability estimates, such as 0, .5, and 1 (Bo et al., 2017; Budescu et al., 1988; Merkle et al., 2016; Wallsten et al., 1993). To accommodate such multimodal distributions, one option is to discretize the distribution of Brier scores into bins and reconfigure the model as an ordinal response model. Such models, such as the graded response model (Samejima, 1969), have a long history in the IRT literature.
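As a generic illustration (a sketch of the Samejima-style model, not the exact specification used in the forecasting applications), a graded response model gives the probability that forecaster \( f \)'s binned score \( Y_{f,q} \) on question \( q \) reaches at least category \( k \) as

\( P\left(Y_{f,q}\ge k\mid\theta_f\right)=\frac{1}{1+\exp\left[-\lambda_q\left(\theta_f-b_{q,k}\right)\right]} \)

with ordered category thresholds \( b_{q,1}<b_{q,2}<\cdots \) taking the place of a single difficulty parameter.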
Merkle et al. (2016) and Bo et al. (2017) describe examples of ordinal IRT models for forecasting judgment. However, Merkle et al. (2016) found that the continuous and ordinal versions of the model produced highly correlated estimates of forecaster ability (r = .87), and that disagreements were concentrated among poorly performing forecasters (who tend to make large errors) rather than among high-performing ones.
1.2 Contribution Scores
To obtain contribution scores for individual forecasters, it is necessary to first define some aggregation method for all of their judgments on each question. The simplest and most common form of aggregation is the mean of all forecasters' probabilities for each event associated with a forecasting problem. The aggregate probability (AP) for each of the c events associated with a forecasting question, across all forecasters, would be
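(in a sketch of the notation: \( F \) forecasters, \( p_{f,q,c} \) the probability forecaster \( f \) assigns to event \( c \) of question \( q \), and \( o_{q,c}\in\{0,1\} \) the outcome indicator):

\( {AP}_{q,c}=\frac{1}{F}\sum_{f=1}^{F}p_{f,q,c} \)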
And the aggregate Brier score (AB) would then be
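(summing over the \( C_q \) events of question \( q \), in the same sketch notation):

\( {AB}_{q}=\sum_{c=1}^{C_q}\left({AP}_{q,c}-o_{q,c}\right)^2 \)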
Based on this aggregation approach, defining the contribution of individual forecasters to the aggregate is algebraically straightforward. We can define \( {AP}_{q,c,-f} \) as the aggregate probability with an individual forecaster's judgment removed as
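\( {AP}_{q,c,-f}=\frac{1}{F-1}\sum_{f'\ne f}p_{f',q,c} \)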
And the aggregate Brier score with an individual forecaster’s judgment removed as
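\( {AB}_{q,-f}=\sum_{c=1}^{C_q}\left({AP}_{q,c,-f}-o_{q,c}\right)^2 \)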
Finally, we define a forecaster’s average contribution to the accuracy of the aggregate crowd forecasts as
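\( C_f=\frac{1}{Q_f}\sum_{q=1}^{Q_f}\left({AB}_{q,-f}-{AB}_{q}\right) \)

where \( Q_f \) denotes the number of questions forecaster \( f \) answered (the averaging set is our assumption in this sketch). Positive values indicate that the crowd's Brier score worsens when forecaster \( f \) is removed.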
\( C_f \) is a representation of how much information a forecaster brings to the table, on average, that is both unique and beneficial. It is possible that a forecaster ranked very highly on individual accuracy might be ranked lower in terms of their contribution, because their forecasts tended to be very similar to the forecasts of others, and so they did less to move the needle when averaged into the crowd.
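To make the computation concrete, a minimal Python sketch follows. The function name, array layout, and the assumption that every forecaster answers every question are ours; real tournament data are sparse and would require masking or per-question crowds.

```python
import numpy as np

def contribution_scores(probs, outcomes):
    """Leave-one-out contribution scores, sketching the logic described above
    (cf. Budescu & Chen, 2015); names and array layout are illustrative.

    probs:    array of shape (F, Q, C) - forecaster f's probability for
              event c of question q (assumes every forecaster answers
              every question).
    outcomes: array of shape (Q, C) - 1 for the event that occurred,
              0 otherwise.
    Returns an array of F contribution scores C_f.
    """
    F = probs.shape[0]
    # Aggregate probability and aggregate Brier score with everyone included
    ap = probs.mean(axis=0)                      # shape (Q, C)
    ab = ((ap - outcomes) ** 2).sum(axis=1)      # shape (Q,)
    contributions = np.empty(F)
    for f in range(F):
        # Aggregate with forecaster f removed
        ap_minus_f = (probs.sum(axis=0) - probs[f]) / (F - 1)
        ab_minus_f = ((ap_minus_f - outcomes) ** 2).sum(axis=1)
        # Positive C_f: the crowd's Brier score is worse without forecaster f
        contributions[f] = (ab_minus_f - ab).mean()
    return contributions

# Toy example: 3 forecasters, 2 binary questions (events ordered yes/no)
probs = np.array([
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.6, 0.4], [0.5, 0.5]],
    [[0.7, 0.3], [0.4, 0.6]],
])
outcomes = np.array([[1, 0], [0, 1]])
print(contribution_scores(probs, outcomes))
```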
Both weighting members of the crowd by their average contribution scores and selecting positive or high-performing contributors have been shown to improve the aggregate crowd judgment (Budescu & Chen, 2015; Chen et al., 2016). The approach is especially appealing because it can be made dynamic, updating contribution scores for each member of the crowd as more information about their performance becomes available; it requires relatively little information about past performance to reliably identify high-performing contributors; and it is cost-effective, in that it can select a relatively small group of high-performing contributors whose aggregate judgment matches or exceeds that of much larger crowds in accuracy (Chen et al., 2016).
Contribution assessment was initially designed with a particular goal in mind: improving the aggregate wisdom of the crowd (Budescu & Chen, 2015; Chen et al., 2016). One might view this as a slightly narrower goal than pure talent spotting. It is clearly an effective tool for maximizing crowd wisdom, but is it a valid tool for assessing expertise? The answer appears to be yes. Chen et al. (2016) not only studied contribution scores as an aggregation tool but also tested how well they select forecasters known to have a skill advantage conferred by manipulations that benefit performance, such as explicit training and interactive collaboration.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this chapter
Atanasov, P., Himmelstein, M. (2023). Talent Spotting in Crowd Prediction. In: Seifert, M. (eds) Judgment in Predictive Analytics. International Series in Operations Research & Management Science, vol 343. Springer, Cham. https://doi.org/10.1007/978-3-031-30085-1_6