Abstract
Aggregating information across multiple testimonies may improve crime reconstructions. However, different aggregation methods are available, and research on which method is best suited for aggregating multiple observations is lacking. Furthermore, little is known about how variance in the accuracy of individual testimonies impacts the performance of competing aggregation procedures. We investigated whether aggregation-based crime reconstructions involving multiple individual testimonies are superior to individual testimonies and whether this superiority varies as a function of the number of witnesses and the degree of heterogeneity in witnesses’ ability to accurately report their observations. Moreover, we examined whether heterogeneity in competence levels differentially affected the relative accuracy of two aggregation procedures: a simple majority rule, which ignores individual differences, and the more complex general Condorcet model (Romney et al., 1986; Batchelder & Romney, 1988), which takes into account differences in competence between individuals. A total of 121 participants viewed a simulated crime and subsequently answered 128 true/false questions about the crime. We experimentally generated groups of witnesses with homogeneous or heterogeneous competences. Both the majority rule and the general Condorcet model provided more accurate reconstructions of the observed crime than individual testimonies. The superiority of aggregated crime reconstructions over individual testimonies increased with the number of witnesses. Crime reconstructions were most accurate when competences were heterogeneous and aggregation was based on the general Condorcet model. We argue that formal aggregation should be considered more often when eyewitness testimonies have to be assessed and that the general Condorcet model provides a good framework for such aggregations.
Notes
In addition to the true/false responses, participants rated their confidence with respect to each response. This was done for an unrelated study that is not part of the present article.
Parameter estimates were based on 11,000 iterations, of which the first 1000 iterations were used as burn-ins and therefore discarded.
In estimating the statistical power, we assumed an odds ratio of 3 and a proportion of discordant pairs of .55. The odds ratio is determined by the ratio of the two cells in the 2 × 2 table in which the aggregation methods did not perform equally well.
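The power assumptions in the preceding note can be illustrated with a rough sketch. It assumes an exact two-sided McNemar (binomial) test at α = .05, which is not necessarily the procedure the authors ran in G*Power; the function names and the `n_pairs` values are hypothetical. An odds ratio of 3 between the two discordant cells means that a discordant pair favors the better method with probability 3/(1 + 3) = .75.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n Bernoulli(p) trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def exact_mcnemar_rejects(k, d, alpha=0.05):
    """Two-sided exact binomial test of k out of d discordant pairs against p = .5."""
    if d == 0:
        return False  # no discordant pairs, nothing to test
    tail = sum(binom_pmf(j, d, 0.5) for j in range(min(k, d - k) + 1))
    return min(1.0, 2 * tail) <= alpha


def mcnemar_power(n_pairs, odds_ratio=3.0, prop_discordant=0.55, alpha=0.05):
    """Power of the exact McNemar test, averaged over the number of discordant pairs."""
    p_favor = odds_ratio / (1 + odds_ratio)  # share of discordant pairs in the larger cell
    power = 0.0
    for d in range(n_pairs + 1):
        p_d = binom_pmf(d, n_pairs, prop_discordant)
        p_reject = sum(
            binom_pmf(k, d, p_favor)
            for k in range(d + 1)
            if exact_mcnemar_rejects(k, d, alpha)
        )
        power += p_d * p_reject
    return power


# Hypothetical numbers of paired observations, chosen only for illustration.
for n in (20, 128):
    print(n, round(mcnemar_power(n), 3))
```

Under these assumptions, power rises steeply with the number of paired observations, because only the discordant pairs carry information about which aggregation method performs better.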
To determine whether all model parameters were needed to explain the observed data, we computed the badness-of-fit Deviance Information Criterion (DIC; cf. Karabatsos & Batchelder, 2003) for the GCM. In both conditions, the most complex variant of the GCM showed the best trade-off between model fit and the number of parameters and was, therefore, used in all analyses.
The GCM further considers differences in guessing bias and item difficulty. However, because these parameters were not important for present purposes, we do not discuss them any further.
References
Allwood, C. M., Ask, K., & Granhag, P. A. (2005). The cognitive interview: Effects on the realism in witnesses’ confidence in their free recall. Psychology, Crime and Law, 11(2), 183–198.
Anders, R., Oravecz, Z., & Batchelder, W. H. (2014). Cultural consensus theory for continuous responses: A latent appraisal model for information pooling. Journal of Mathematical Psychology, 61, 1–13.
Armstrong, J. S. (2004). Combining forecasts. In J. S. Armstrong (Ed.), Principles of forecasting. A handbook for researchers and practitioners (pp. 417–439). Boston: Kluwer.
Aßfalg, A., & Erdfelder, E. (2012). CAML—maximum likelihood consensus analysis. Behavior Research Methods, 44(1), 189–201.
Batchelder, W. H., Kumbasar, E., & Boyd, J. P. (1997). Consensus analysis of three-way social network data. Journal of Mathematical Sociology, 22(1), 29–58.
Batchelder, W. H., & Romney, A. K. (1986). The statistical analysis of a general Condorcet model for dichotomous choice situations. In B. Grofman & G. Owen (Eds.), Information pooling and group decision making (pp. 103–112). Greenwich: JAI.
Batchelder, W. H., & Romney, A. K. (1988). Test theory without an answer key. Psychometrika, 53(1), 71–92.
Batchelder, W. H., & Romney, A. K. (1989). New results in test theory without an answer key. In E. E. Roskam (Ed.), Mathematical psychology in progress (pp. 229–248). Berlin: Springer.
Bernstein, D. M., & Loftus, E. F. (2009). How to tell if a particular memory is true or false. Perspectives on Psychological Science, 4(4), 370–374.
Boland, P. J. (1989). Majority systems and the Condorcet Jury Theorem. The Statistician, 38(3), 181–189.
Bredenkamp, J., & Erdfelder, E. (1996). Methoden der Gedächtnispsychologie [Methods of the psychology of memory]. In D. Albert & K.-H. Stapf (Eds.), Gedächtnis (Enzyklopädie der Psychologie, Themenbereich C, Serie II, Band 4, S. 1–94) [Memory (Encyclopedia of Psychology, Topics C, Series II, Issue 4, pp. 1–94)]. Göttingen: Hogrefe.
Brigham, J. C., & Bothwell, R. K. (1983). The ability of prospective jurors to estimate the accuracy of eyewitness identifications. Law and Human Behavior, 7(1), 19–30.
Clark, S. E., & Wells, G. L. (2008). On the diagnosticity of multiple-witness identifications. Law and Human Behavior, 32(5), 406–422.
Clemen, R. T. (1989). Combining forecasts: A review and annotated bibliography. International Journal of Forecasting, 5(4), 559–583.
Crowther, C. S., Batchelder, W. H., & Hu, X. (1995). A measurement-theoretic analysis of the fuzzy logic model of perception. Psychological Review, 102(2), 396–408.
Davis-Stober, C., Budescu, D., Dana, J., & Broomell, S. (2014). When is a crowd wise? Decision, 1(2), 1–4.
Estlund, D. M. (1994). Opinion leaders, independence, and Condorcet’s Jury Theorem. Theory and Decision, 36(2), 131–162.
Faul, F., Erdfelder, E., Lang, A.-G., & Buchner, A. (2007). G*Power 3: A flexible statistical power analysis program for the social, behavioral, and biomedical sciences. Behavior Research Methods, 39(2), 175–191.
Fisher, R. P., Geiselman, R. E., & Raymond, D. S. (1987). Critical analysis of police interview techniques. Journal of Police Science and Administration, 15(3), 177–185.
Fisher, R. P., Vrij, A., & Leins, D. A. (2013). Does testimonial inconsistency indicate memory inaccuracy and deception? Beliefs, empirical research, and theory. In B. S. Cooper, D. Griesel, & M. Ternes (Eds.), Applied issues in investigative interviewing, eyewitness memory, and credibility assessment (pp. 173–189). New York: Springer.
Frenda, S. J., Nichols, R. M., & Loftus, E. F. (2011). Current issues and advances in misinformation research. Current Directions in Psychological Science, 20(1), 20–23.
Gabbert, F., Memon, A., & Allan, K. (2003). Memory conformity: Can eyewitnesses influence each other’s memories for an event? Applied Cognitive Psychology, 17(5), 533–543.
Gabbert, F., Memon, A., & Wright, D. B. (2006). Memory conformity: Disentangling the steps toward influence during a discussion. Psychonomic Bulletin and Review, 13(3), 480–485.
Galton, F. (1907). Vox populi. Nature, 75(1949), 450–451.
Greenberg, M. S., Westcott, D. R., & Bailey, S. E. (1998). When believing is seeing: The effect of scripts on eyewitness memory. Law and Human Behavior, 22(6), 685–694.
Grofman, B., Owen, G., & Feld, S. (1983). Thirteen theorems in search of the truth. Theory and Decision, 15(3), 261–278.
Gruneberg, M. M., & Sykes, R. B. (1993). The generalisability of confidence-accuracy studies in eyewitnessing. Memory, 1(3), 185–189.
Holst, V. F., & Pezdek, K. (1992). Scripts for typical crimes and their effects on memory for eyewitness testimony. Applied Cognitive Psychology, 6(7), 573–587.
Kanazawa, S. (1998). A brief note on a further refinement of the Condorcet Jury Theorem for heterogeneous groups. Mathematical Social Sciences, 35(1), 69–73.
Karabatsos, G., & Batchelder, W. H. (2003). Markov chain estimation for test theory without an answer key. Psychometrika, 68(3), 373–389.
Kazmann, R. G. (1973). Democratic organization: A preliminary mathematical model. Public Choice, 16(1), 17–26.
Koriat, A. (2012). When are two heads better than one and why? Science, 336(6079), 360–362.
Krause, J., Ruxton, G. D., & Krause, S. (2010). Swarm intelligence in animals and humans. Trends in Ecology and Evolution, 25(1), 28–34.
Ladha, K. K. (1992). The Condorcet Jury Theorem, free speech, and correlated votes. American Journal of Political Science, 36(3), 617–634.
Lindsay, D. S., Nilsen, E., & Read, J. D. (2000). Witnessing-condition heterogeneity and witnesses’ versus investigators’ confidence in the accuracy of witnesses’ identification decisions. Law and Human Behavior, 24(6), 685–697.
Loftus, E. F. (1975). Leading questions and the eyewitness report. Cognitive Psychology, 7(4), 560–572.
Loftus, E. F. (1996). Eyewitness testimony. Cambridge: Harvard University Press.
Meade, M. L., & Roediger, H. L. (2002). Explorations in the social contagion of memory. Memory and Cognition, 30(7), 995–1009.
Oravecz, Z., Vandekerckhove, J., & Batchelder, W. H. (2014). Bayesian cultural consensus theory. Field Methods, 26(3), 207–222.
Paterson, H. M., & Kemp, R. I. (2006). Co-witness talk: A survey of eyewitness discussion. Psychology, Crime and Law, 12(2), 181–191.
Peterson, C., & Grant, M. (2001). Forced-choice: Are forensic interviewers asking the right questions? Canadian Journal of Behavioural Science (Revue Canadienne Des Sciences Du Comportement), 33(2), 118–127.
President’s Commission on the Assassination of President Kennedy. (1964). Report of the President’s Commission on the Assassination of President Kennedy. Washington, DC: U.S. Government Printing Office. Retrieved from http://www.archives.gov/research/jfk/warren-commission-report/. Accessed 25 Oct 2016.
R Development Core Team. (2016). The R-project for statistical computing. Retrieved from http://www.r-project.org/. Accessed 25 Oct 2016.
Read, J. D., Lindsay, D. S., & Nicholls, T. (1998). The relation between confidence and accuracy in eyewitness identification studies: Is the conclusion changing? In C. P. Thompson, D. J. Herrmann, J. D. Read, & D. Bruce (Eds.), Eyewitness memory: Theoretical and applied perspectives (pp. 107–130). Mahwah: Lawrence Erlbaum Associates Publishers.
Roberts, W. T., & Higham, P. A. (2002). Selecting accurate statements from the cognitive interview using confidence ratings. Journal of Experimental Psychology: Applied, 8(1), 33–43.
Romney, A. K. (1999). Consensus as a statistical model. Current Anthropology, 40(S1), 103–115.
Romney, A. K., & Batchelder, W. H. (1999). Cultural consensus theory. In R. A. Wilson & F. C. Keil (Eds.), The MIT encyclopedia of the cognitive sciences (pp. 208–209). Cambridge: MIT Press.
Romney, A. K., Batchelder, W. H., & Weller, S. C. (1987). Recent applications of cultural consensus theory. American Behavioral Scientist, 31(2), 163–177.
Romney, A. K., Weller, S. C., & Batchelder, W. H. (1986). Culture as consensus: A theory of culture and informant accuracy. American Anthropologist, 88(2), 313–338.
Sanders, G. S., & Warnick, D. H. (1982). Evaluating identification evidence from multiple eyewitnesses. Journal of Applied Social Psychology, 12(3), 182–192.
Scheck, B., Neufeld, P., & Dwyer, J. (2000). Actual innocence: Five days to execution and other dispatches from the wrongly convicted. New York: Doubleday.
Schmechel, R. S., O’Toole, T. P., Easterly, C., & Loftus, E. F. (2006). Beyond the ken? Testing jurors’ understanding of eyewitness reliability evidence. Jurimetrics, 46(2), 177–214.
Sharman, S. J., & Powell, M. B. (2012). A comparison of adult witnesses’ suggestibility across various types of leading questions. Applied Cognitive Psychology, 26(1), 48–53.
Shaw, J. S., Garven, S., & Wood, J. M. (1997). Co-witness information can have immediate effects on eyewitness memory reports. Law and Human Behavior, 21(5), 503–523.
Simons, D. J., & Chabris, C. F. (2011). What people believe about how memory works: A representative survey of the US population. PLoS One, doi:10.1371/journal.pone.0022757.
Simons, D. J., & Chabris, C. F. (2012). Common (mis)beliefs about memory: A replication and comparison of telephone and Mechanical Turk survey methods. PLoS One, doi:10.1371/journal.pone.0051876.
Skagerberg, E. M., & Wright, D. B. (2008). The prevalence of co-witnesses and co-witness discussions in real eyewitnesses. Psychology, Crime and Law, 14(6), 513–521.
Snodgrass, J. G., & Corwin, J. (1988). Pragmatics of measuring recognition memory: Applications to dementia and amnesia. Journal of Experimental Psychology: General, 117(1), 34–50.
Surowiecki, J. (2004). The wisdom of crowds. New York: Doubleday.
Troyer, A. K., & Craik, F. I. (2000). The effect of divided attention on memory for items and their context. Canadian Journal of Experimental Psychology (Revue Canadienne de Psychologie Expérimentale), 54(3), 161–171.
Vredeveldt, A., Hildebrandt, A., & van Koppen, P. J. (2015). Acknowledge, repeat, rephrase, elaborate: Witnesses can help each other remember more. Memory, doi:10.1080/09658211.2015.1042884.
Vredeveldt, A., Hitch, G. J., & Baddeley, A. D. (2011). Eye closure helps memory by reducing cognitive load and enhancing visualisation. Memory and Cognition, 39(7), 1253–1263.
Vredeveldt, A., & Sauer, J. D. (2015). Effects of eye-closure on confidence-accuracy relations in eyewitness testimony. Journal of Applied Research in Memory and Cognition, 4(1), 51–58.
Waubert de Puiseau, B., Aßfalg, A., Erdfelder, E., & Bernstein, D. M. (2012). Extracting the truth from conflicting eyewitness reports: A formal modeling approach. Journal of Experimental Psychology: Applied, 18(4), 390–403.
Weller, S. C. (1987). Shared knowledge, intracultural variation, and knowledge aggregation. American Behavioral Scientist, 31(2), 178–193.
Weller, S. C. (2007). Cultural Consensus Theory: Applications and frequently asked questions. Field Methods, 19(4), 339–368.
Wells, G. L., Memon, A., & Penrod, S. D. (2006). Eyewitness evidence. Improving its probative value. Psychological Science in the Public Interest, 7(2), 45–75.
Ethics declarations
All procedures performed in the study reported in this manuscript were in accordance with the 1964 Helsinki declaration and its later amendments. Informed consent was obtained from all individual participants included in the study. The authors declare that they have no conflict of interest.
Additional information
B. Waubert de Puiseau and S. Greving contributed equally to this work.
Appendix: Formalization of the general Condorcet model
In a witness recognition experiment, \(N\) witnesses first observe a crime and then make recognition judgments about \(M\) statements regarding their observations. In the 2-HTM, responses are modelled as a function of a witness’s competence, \(D_i\), \(i \in \{1, \ldots, N\}\), and the witness’s tendency to guess that a statement is “true”, \(g_i\), when the witness does not know the answer. Each witness is assumed to judge a statement as “true” if the witness believes that a detail has occurred, or as “false” if the witness believes that a detail has not occurred. Thus, “true” responses can be classified either as hits if the statement is true or as false alarms if the statement is false. In the 2-HTM, hits occur either because the witness remembers the relevant fact with probability \(D_i\) or because the witness does not remember the relevant fact with probability \((1 - D_i)\) but guesses correctly with probability \(g_i\). In the 2-HTM, competence and guessing bias are assumed to be constant across questions. Thus, the probability of a hit, \(H_i\), is \(H_i = D_i + (1 - D_i) g_i\). False alarms are assumed to occur when the witness does not remember the relevant fact with probability \((1 - D_i)\) and then incorrectly guesses “true” with probability \(g_i\). Thus, the probability of a false alarm, \(F_i\), can be computed as \(F_i = (1 - D_i) g_i\). Solving these equations for \(D_i\) and \(g_i\) yields:
\[ D_i = H_i - F_i \tag{1} \]
and
\[ g_i = \frac{F_i}{1 - (H_i - F_i)} \tag{2} \]
Using the observed hit and false alarm rates as estimates of \(H_i\) and \(F_i\), respectively, a witness’s competence and guessing bias can be estimated with Eqs. (1) and (2).
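As a minimal sketch of this estimation step, the following Python function computes a witness’s hit and false-alarm rates from true/false responses and a known answer key and then solves the hit and false-alarm equations for competence and guessing bias (competence = hit rate minus false-alarm rate; guessing bias = false-alarm rate divided by one minus competence). The function name and the example data are hypothetical and serve illustration only.

```python
def two_htm_estimates(responses, answer_key):
    """Estimate competence D and guessing bias g for one witness in the 2-HTM.

    responses  -- list of 0/1 judgments (1 = "true")
    answer_key -- list of 0/1 correct answers (1 = statement is true)
    """
    true_items = [r for r, z in zip(responses, answer_key) if z == 1]
    false_items = [r for r, z in zip(responses, answer_key) if z == 0]
    if not true_items or not false_items:
        raise ValueError("need at least one true and one false statement")

    hit_rate = sum(true_items) / len(true_items)            # estimate of H
    false_alarm_rate = sum(false_items) / len(false_items)  # estimate of F

    competence = hit_rate - false_alarm_rate                # D = H - F
    if competence >= 1.0:
        # A perfect witness never guesses, so g is not identified.
        return competence, float("nan")
    guessing = false_alarm_rate / (1.0 - competence)        # g = F / (1 - D)
    return competence, guessing


# Hypothetical witness: 8 of 10 true statements accepted, 2 of 10 false ones accepted.
key = [1] * 10 + [0] * 10
answers = [1] * 8 + [0] * 2 + [1] * 2 + [0] * 8
D, g = two_htm_estimates(answers, key)
print(round(D, 3), round(g, 3))
```

Note that these estimators require the answer key, which is available in an experiment but not in a real investigation; this is the gap the GCM is meant to close.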
Computing competence and guessing bias in the GCM is more complex because the answer key is unknown. The GCM, therefore, extends the 2-HTM by adding a latent variable, the answer key \(Z = (Z_k)_{1 \times M}\), which is a vector of correct responses for items \(k \in \{1, \ldots, M\}\):
\[ Z_k = \begin{cases} 1 & \text{if statement } k \text{ is true} \\ 0 & \text{if statement } k \text{ is false} \end{cases} \]
Further, the GCM (Karabatsos & Batchelder, 2003; Oravecz et al., 2014) includes another latent variable, the difficulty of item \(k\), \(\delta_k\), with \(0 < \delta_k < 1\). Taking item difficulty into account, Karabatsos and Batchelder (2003) define the probability of witness \(i\) knowing the correct response to item \(k\) as
\[ D_{ik} = \frac{\theta_i (1 - \delta_k)}{\theta_i (1 - \delta_k) + (1 - \theta_i)\, \delta_k} \]
where \(\theta_i\) denotes the competence of witness \(i\), independent of item difficulty, with \(0 < \theta_i < 1\).
On the basis of these equations, the GCM defines the probability that witness \(i\) correctly recognizes statement \(k\) as:
\[ P(X_{ik} = Z_k \mid \theta_i, g_i, \delta_k, Z_k) = D_{ik} + (1 - D_{ik})\, g_i^{Z_k} (1 - g_i)^{1 - Z_k} \]
The parameter estimates for the latent parameters competence \(\theta_i\), guessing bias \(g_i\), item difficulty \(\delta_k\), and the answer key \(Z_k\) are determined simultaneously from the response matrix \(X = (X_{ik})_{N \times M}\),
\[ X_{ik} = \begin{cases} 1 & \text{if witness } i \text{ judges statement } k \text{ as “true”} \\ 0 & \text{if witness } i \text{ judges statement } k \text{ as “false”} \end{cases} \]
We used the Markov-chain-Monte-Carlo procedure described by Karabatsos and Batchelder (2003) to find parameter estimates that maximize the likelihood function
\[ L(X \mid \varOmega) = \prod_{i=1}^{N} \prod_{k=1}^{M} P(X_{ik} \mid \theta_i, g_i, \delta_k, Z_k), \qquad P(X_{ik} = 1 \mid \theta_i, g_i, \delta_k, Z_k) = Z_k D_{ik} + (1 - D_{ik})\, g_i \]
where \(\varOmega = \left\{\theta_{\langle i = 1, \ldots, N \rangle}, g_{\langle i = 1, \ldots, N \rangle}, \delta_{\langle k = 1, \ldots, M \rangle}, Z_{\langle k = 1, \ldots, M \rangle}\right\}\) are the parameters of the GCM. More detailed descriptions of the 2-HTM and the GCM can be found elsewhere (cf. Aßfalg & Erdfelder, 2012; Batchelder & Romney, 1986, 1988, 1989; Karabatsos & Batchelder, 2003; Oravecz et al., 2014; Romney, Batchelder, & Weller, 1987; Romney et al., 1986).
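To make the structure of the model concrete, the following Python sketch evaluates the GCM response probabilities and log-likelihood for given parameter values and contrasts a simple majority rule with a competence-weighted answer key. It is not the MCMC procedure of Karabatsos and Batchelder (2003): it assumes that θ, g, and δ are already known, it assumes the Rasch-type form D_ik = θ_i(1 − δ_k) / [θ_i(1 − δ_k) + (1 − θ_i)δ_k] for the probability of knowing an answer, and all function names and example values are hypothetical.

```python
import math


def knowledge_prob(theta_i, delta_k):
    """D_ik: probability that witness i knows the answer to item k (assumed Rasch-type form)."""
    num = theta_i * (1 - delta_k)
    return num / (num + (1 - theta_i) * delta_k)


def prob_true(theta_i, g_i, delta_k, z_k):
    """P(X_ik = 1): a "true" response arises from knowledge (if Z_k = 1) or from guessing."""
    d = knowledge_prob(theta_i, delta_k)
    return z_k * d + (1 - d) * g_i


def log_likelihood(X, theta, g, delta, Z):
    """Log-likelihood of response matrix X (N x M lists of 0/1), assuming independent responses."""
    ll = 0.0
    for i, row in enumerate(X):
        for k, x in enumerate(row):
            p = prob_true(theta[i], g[i], delta[k], Z[k])
            ll += math.log(p if x == 1 else 1 - p)
    return ll


def majority_key(X):
    """Majority rule: an item counts as true if more than half of the witnesses say so."""
    n = len(X)
    return [1 if 2 * sum(row[k] for row in X) > n else 0 for k in range(len(X[0]))]


def weighted_key(X, theta, g, delta):
    """For each item, pick the answer with the higher likelihood given known parameters."""
    key = []
    for k in range(len(X[0])):
        column = [[row[k]] for row in X]
        ll_true = log_likelihood(column, theta, g, [delta[k]], [1])
        ll_false = log_likelihood(column, theta, g, [delta[k]], [0])
        key.append(1 if ll_true > ll_false else 0)
    return key


# Hypothetical group: one highly competent witness says "true", two weak witnesses say "false".
X = [[1], [0], [0]]
theta = [0.95, 0.05, 0.05]
g = [0.5, 0.5, 0.5]
delta = [0.5]
print(majority_key(X), weighted_key(X, theta, g, delta))
```

In this constructed example the majority rule follows the two weak witnesses, whereas the likelihood-based key follows the competent witness. This illustrates why accounting for competence can matter most when competences are heterogeneous.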
Cite this article
Waubert de Puiseau, B., Greving, S., Aßfalg, A. et al. On the importance of considering heterogeneity in witnesses’ competence levels when reconstructing crimes from multiple witness testimonies. Psychological Research 81, 947–960 (2017). https://doi.org/10.1007/s00426-016-0802-1
DOI: https://doi.org/10.1007/s00426-016-0802-1