On the importance of considering heterogeneity in witnesses’ competence levels when reconstructing crimes from multiple witness testimonies

  • Original Article
  • Psychological Research

Abstract

Aggregating information across multiple testimonies may improve crime reconstructions. However, different aggregation methods are available, and research on which method is best suited for aggregating multiple observations is lacking. Furthermore, little is known about how variance in the accuracy of individual testimonies impacts the performance of competing aggregation procedures. We investigated the superiority of aggregation-based crime reconstructions involving multiple individual testimonies and whether this superiority varied as a function of the number of witnesses and the degree of heterogeneity in witnesses’ ability to accurately report their observations. Moreover, we examined whether heterogeneity in competence levels differentially affected the relative accuracy of two aggregation procedures: a simple majority rule, which ignores individual differences, and the more complex general Condorcet model (Romney et al., Am Anthropol 88(2):313–338, 1986; Batchelder and Romney, Psychometrika 53(1):71–92, 1988), which takes into account differences in competence between individuals. 121 participants viewed a simulated crime and subsequently answered 128 true/false questions about the crime. We experimentally generated groups of witnesses with homogeneous or heterogeneous competences. Both the majority rule and the general Condorcet model provided more accurate reconstructions of the observed crime than individual testimonies. The superiority of aggregated crime reconstructions involving multiple individual testimonies increased with an increasing number of witnesses. Crime reconstructions were most accurate when competences were heterogeneous and aggregation was based on the general Condorcet model. We argue that a formal aggregation should be considered more often when eyewitness testimonies have to be assessed and that the general Condorcet model provides a good framework for such aggregations.
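The simple majority rule mentioned above can be sketched in a few lines. This is a minimal illustration, not the authors' analysis code: the function name `majority_rule`, the toy response matrix, and the tie-breaking choice are assumptions made for this example.

```python
def majority_rule(responses):
    """Aggregate a witnesses-by-items matrix of 0/1 answers by simple majority.

    responses[i][k] is 1 if witness i answered "true" to statement k, else 0.
    A statement is reconstructed as "true" (1) only if more than half of the
    witnesses said "true". Ties, which can occur with an even number of
    witnesses, are resolved as "false" here (an arbitrary assumption).
    """
    n_witnesses = len(responses)
    n_items = len(responses[0])
    verdicts = []
    for k in range(n_items):
        true_votes = sum(row[k] for row in responses)
        verdicts.append(1 if 2 * true_votes > n_witnesses else 0)
    return verdicts


# Hypothetical toy data: 3 witnesses, 4 true/false statements.
toy_responses = [
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 0],
]
reconstruction = majority_rule(toy_responses)
```

Every witness carries the same weight in this rule, which is the sense in which it ignores individual differences in competence.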


Notes

  1. In addition to the true/false responses, participants rated their confidence with respect to each response. This was done for an unrelated study that is not part of the present article.

  2. Parameter estimates were based on 11,000 iterations, of which the first 1,000 were discarded as burn-in.

  3. In estimating the statistical power, we assumed an odds ratio of 3 and a proportion of discordant pairs of .55. The odds ratio is determined by the ratio of the two cells in the 2 × 2 table in which the aggregation methods did not perform equally well.

  4. To determine whether all model parameters were needed to explain the observed data, we computed the badness-of-fit Deviance Information Criterion (DIC; cf. Karabatsos & Batchelder, 2003) for the GCM. In both conditions, the most complex variant of the GCM showed the best trade-off between model fit and the number of parameters and was, therefore, used in all analyses.

  5. The GCM further considers differences in guessing bias and item difficulty. However, because these parameters were not important for present purposes, we do not discuss them any further.

  6. Because different combinations of \(\theta_{i}\) and \(\delta_{k}\) yield the same \(D_{ik}\), an additional constraint on Eq. (4) is necessary (Crowther, Batchelder, & Hu, 1995). Following the procedure employed by Crowther et al. (1995) and Waubert de Puiseau et al. (2012), we therefore set \(\delta_{1} = .5\) in all analyses.
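
As a brief illustration of why the constraint in Note 6 is needed, Eq. (4) of the Appendix can be rewritten in terms of odds. This is a sketch derived from Eq. (4) alone; the scaling constant \(c\) is introduced here for illustration only.

$$\frac{D_{ik}}{1 - D_{ik}} = \frac{\theta_{i}}{1 - \theta_{i}} \cdot \frac{1 - \delta_{k}}{\delta_{k}}.$$

Multiplying every competence odds \(\theta_{i}/(1 - \theta_{i})\) and every difficulty odds \(\delta_{k}/(1 - \delta_{k})\) by the same constant \(c > 0\) leaves the right-hand side, and hence every \(D_{ik}\), unchanged. Fixing \(\delta_{1} = .5\) sets the difficulty odds of the first item to 1, which removes this freedom.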

References

  • Allwood, C. M., Ask, K., & Granhag, P. A. (2005). The cognitive interview: Effects on the realism in witnesses’ confidence in their free recall. Psychology, Crime and Law, 11(2), 183–198.

  • Anders, R., Oravecz, Z., & Batchelder, W. H. (2014). Cultural consensus theory for continuous responses: A latent appraisal model for information pooling. Journal of Mathematical Psychology, 61, 1–13.

  • Armstrong, J. S. (2004). Combining forecasts. In J. S. Armstrong (Ed.), Principles of forecasting. A handbook for researchers and practitioners (pp. 417–439). Boston: Kluwer.

  • Aßfalg, A., & Erdfelder, E. (2012). CAML—maximum likelihood consensus analysis. Behavior Research Methods, 44(1), 189–201.

  • Batchelder, W. H., Kumbasar, E., & Boyd, J. P. (1997). Consensus analysis of three-way social network data. Journal of Mathematical Sociology, 22(1), 29–58.

  • Batchelder, W. H., & Romney, A. K. (1986). The statistical analysis of a general Condorcet model for dichotomous choice situations. In B. Grofman & G. Owen (Eds.), Information pooling and group decision making (pp. 103–112). Greenwich: JAI.

  • Batchelder, W. H., & Romney, A. K. (1988). Test theory without an answer key. Psychometrika, 53(1), 71–92.

  • Batchelder, W. H., & Romney, A. K. (1989). New results in test theory without an answer key. In E. E. Roskam (Ed.), Mathematical psychology in progress (pp. 229–248). Berlin: Springer.

  • Bernstein, D. M., & Loftus, E. F. (2009). How to tell if a particular memory is true or false. Perspectives on Psychological Science, 4(4), 370–374.

  • Boland, P. J. (1989). Majority systems and the Condorcet Jury Theorem. The Statistician, 38(3), 181–189.

  • Bredenkamp, J., & Erdfelder, E. (1996). Methoden der Gedächtnispsychologie [Methods of the psychology of memory]. In D. Albert & K.-H. Stapf (Eds.), Gedächtnis (Enzyklopädie der Psychologie, Themenbereich C, Serie II, Band 4, S. 1–94) [Memory (Encyclopedia of Psychology, Topics C, Series II, Issue 4, pp. 1–94)]. Göttingen: Hogrefe.

  • Brigham, J. C., & Bothwell, R. K. (1983). The ability of prospective jurors to estimate the accuracy of eyewitness identifications. Law and Human Behavior, 7(1), 19–30.

  • Clark, S. E., & Wells, G. L. (2008). On the diagnosticity of multiple-witness identifications. Law and Human Behavior, 32(5), 406–422.

  • Clemen, R. T. (1989). Combining forecasts: A review and annotated bibliography. International Journal of Forecasting, 5(4), 559–583.

  • Crowther, C. S., Batchelder, W. H., & Hu, X. (1995). A measurement-theoretic analysis of the fuzzy logic model of perception. Psychological Review, 102(2), 396–408.

  • Davis-Stober, C., Budescu, D., Dana, J., & Broomell, S. (2014). When is a crowd wise? Decision, 1(2), 1–4.

  • Estlund, D. M. (1994). Opinion leaders, independence, and Condorcet’s Jury Theorem. Theory and Decision, 36(2), 131–162.

  • Faul, F., Erdfelder, E., Lang, A.-G., & Buchner, A. (2007). G*Power 3: A flexible statistical power analysis program for the social, behavioral, and biomedical sciences. Behavior Research Methods, 39(2), 175–191.

  • Fisher, R. P., Geiselman, R. E., & Raymond, D. S. (1987). Critical analysis of police interview techniques. Journal of Police Science and Administration, 15(3), 177–185.

  • Fisher, R. P., Vrij, A., & Leins, D. A. (2013). Does testimonial inconsistency indicate memory inaccuracy and deception? Beliefs, empirical research, and theory. In B. S. Cooper, D. Griesel, & M. Ternes (Eds.), Applied issues in investigative interviewing, eyewitness memory, and credibility assessment (pp. 173–189). New York: Springer.

  • Frenda, S. J., Nichols, R. M., & Loftus, E. F. (2011). Current issues and advances in misinformation research. Current Directions in Psychological Science, 20(1), 20–23.

  • Gabbert, F., Memon, A., & Allan, K. (2003). Memory conformity: Can eyewitnesses influence each other’s memories for an event? Applied Cognitive Psychology, 17(5), 533–543.

  • Gabbert, F., Memon, A., & Wright, D. B. (2006). Memory conformity: Disentangling the steps toward influence during a discussion. Psychonomic Bulletin and Review, 13(3), 480–485.

  • Galton, F. (1907). Vox populi. Nature, 75(1949), 450–451.

  • Greenberg, M. S., Westcott, D. R., & Bailey, S. E. (1998). When believing is seeing: The effect of scripts on eyewitness memory. Law and Human Behavior, 22(6), 685–694.

  • Grofman, B., Owen, G., & Feld, S. (1983). Thirteen theorems in search of the truth. Theory and Decision, 15(3), 261–278.

  • Gruneberg, M. M., & Sykes, R. B. (1993). The generalisability of confidence-accuracy studies in eyewitnessing. Memory, 1(3), 185–189.

  • Holst, V. F., & Pezdek, K. (1992). Scripts for typical crimes and their effects on memory for eyewitness testimony. Applied Cognitive Psychology, 6(7), 573–587.

  • Kanazawa, S. (1998). A brief note on a further refinement of the Condorcet Jury Theorem for heterogeneous groups. Mathematical Social Sciences, 35(1), 69–73.

  • Karabatsos, G., & Batchelder, W. (2003). Markov chain estimation for test theory without an answer key. Psychometrika, 68(3), 373–389.

  • Kazmann, R. G. (1973). Democratic organization: A preliminary mathematical model. Public Choice, 16(1), 17–26.

  • Koriat, A. (2012). When are two heads better than one and why? Science, 336(6079), 360–362.

  • Krause, J., Ruxton, G. D., & Krause, S. (2010). Swarm intelligence in animals and humans. Trends in Ecology and Evolution, 25(1), 28–34.

  • Ladha, K. K. (1992). The Condorcet Jury Theorem, free speech, and correlated votes. American Journal of Political Science, 36(3), 617–634.

  • Lindsay, D. S., Nilsen, E., & Read, J. D. (2000). Witnessing-condition heterogeneity and witnesses’ versus investigators’ confidence in the accuracy of witnesses’ identification decisions. Law and Human Behavior, 24(6), 685–697.

  • Loftus, E. F. (1975). Leading questions and the eyewitness report. Cognitive Psychology, 7(4), 560–572.

  • Loftus, E. F. (1996). Eyewitness testimony. Cambridge: Harvard University Press.

  • Meade, M. L., & Roediger, H. L. (2002). Explorations in the social contagion of memory. Memory and Cognition, 30(7), 995–1009.

  • Oravecz, Z., Vandekerckhove, J., & Batchelder, W. H. (2014). Bayesian cultural consensus theory. Field Methods, 26(3), 207–222.

  • Paterson, H. M., & Kemp, R. I. (2006). Co-witness talk: A survey of eyewitness discussion. Psychology, Crime and Law, 12(2), 181–191.

  • Peterson, C., & Grant, M. (2001). Forced-choice: Are forensic interviewers asking the right questions? Canadian Journal of Behavioural Science (Revue Canadienne Des Sciences Du Comportement), 33(2), 118–127.

  • President’s Commission on the Assassination of President Kennedy. (1964). Report of the President’s Commission on the Assassination of President Kennedy. Washington, DC: U.S. Government Printing Office. Retrieved from http://www.archives.gov/research/jfk/warren-commission-report/. Accessed 25 Oct 2016.

  • R Development Core Team. (2016). The R-project for statistical computing. Retrieved from http://www.r-project.org/. Accessed 25 Oct 2016.

  • Read, J. D., Lindsay, D. S., & Nicholls, T. (1998). The relation between confidence and accuracy in eyewitness identification studies: Is the conclusion changing? In C. P. Thompson, D. J. Herrmann, J. D. Read, & D. Bruce (Eds.), Eyewitness memory: Theoretical and applied perspectives (pp. 107–130). Mahwah: Lawrence Erlbaum Associates Publishers.

  • Roberts, W. T., & Higham, P. A. (2002). Selecting accurate statements from the cognitive interview using confidence ratings. Journal of Experimental Psychology: Applied, 8(1), 33–43.

  • Romney, A. K. (1999). Consensus as a statistical model. Current Anthropology, 40(S1), 103–115.

  • Romney, A. K., & Batchelder, W. H. (1999). Cultural consensus theory. In R. A. Wilson & F. C. Keil (Eds.), The MIT encyclopedia of the cognitive sciences (pp. 208–209). Cambridge: MIT Press.

  • Romney, A. K., Batchelder, W. H., & Weller, S. C. (1987). Recent applications of cultural consensus theory. American Behavioral Scientist, 31(2), 163–177.

  • Romney, A. K., Weller, S. C., & Batchelder, W. H. (1986). Culture as consensus: A theory of culture and informant accuracy. American Anthropologist, 88(2), 313–338.

  • Sanders, G. S., & Warnick, D. H. (1982). Evaluating identification evidence from multiple eyewitnesses. Journal of Applied Social Psychology, 12(3), 182–192.

  • Scheck, B., Neufeld, P., & Dwyer, J. (2000). Actual innocence: Five days to execution and other dispatches from the wrongly convicted. New York: Doubleday.

  • Schmechel, R. S., O’Toole, T. P., Easterly, C., & Loftus, E. F. (2006). Beyond the ken? Testing jurors’ understanding of eyewitness reliability evidence. Jurimetrics, 46(2), 177–214.

  • Sharman, S. J., & Powell, M. B. (2012). A comparison of adult witnesses’ suggestibility across various types of leading questions. Applied Cognitive Psychology, 26(1), 48–53.

  • Shaw, J. S., Garven, S., & Wood, J. M. (1997). Co-witness information can have immediate effects on eyewitness memory reports. Law and Human Behavior, 21(5), 503–523.

  • Simons, D. J., & Chabris, C. F. (2011). What people believe about how memory works: A representative survey of the US population. PLoS One, doi:10.1371/journal.pone.0022757.

  • Simons, D. J., & Chabris, C. F. (2012). Common (mis)beliefs about memory: A replication and comparison of telephone and Mechanical Turk survey methods. PLoS One, doi:10.1371/journal.pone.0051876.

  • Skagerberg, E. M., & Wright, D. B. (2008). The prevalence of co-witnesses and co-witness discussions in real eyewitnesses. Psychology, Crime and Law, 14(6), 513–521.

  • Snodgrass, J. G., & Corwin, J. (1988). Pragmatics of measuring recognition memory: Applications to dementia and amnesia. Journal of Experimental Psychology: General, 117(1), 34–50.

  • Surowiecki, J. (2004). The wisdom of crowds. New York: Doubleday.

  • Troyer, A. K., & Craik, F. I. (2000). The effect of divided attention on memory for items and their context. Canadian Journal of Experimental Psychology (Revue Canadienne de Psychologie Expérimentale), 54(3), 161–171.

  • Vredeveldt, A., Hildebrandt, A., & van Koppen, P. J. (2015). Acknowledge, repeat, rephrase, elaborate: Witnesses can help each other remember more. Memory, doi:10.1080/09658211.2015.1042884.

  • Vredeveldt, A., Hitch, G. J., & Baddeley, A. D. (2011). Eye closure helps memory by reducing cognitive load and enhancing visualisation. Memory and Cognition, 39(7), 1253–1263.

  • Vredeveldt, A., & Sauer, J. D. (2015). Effects of eye-closure on confidence-accuracy relations in eyewitness testimony. Journal of Applied Research in Memory and Cognition, 4(1), 51–58.

  • Waubert de Puiseau, B., Aßfalg, A., Erdfelder, E., & Bernstein, D. M. (2012). Extracting the truth from conflicting eyewitness reports: A formal modeling approach. Journal of Experimental Psychology: Applied, 18(4), 390–403.

  • Weller, S. C. (1987). Shared knowledge, intracultural variation, and knowledge aggregation. American Behavioral Scientist, 31(2), 178–193.

  • Weller, S. C. (2007). Cultural Consensus Theory: Applications and frequently asked questions. Field Methods, 19(4), 339–368.

  • Wells, G. L., Memon, A., & Penrod, S. D. (2006). Eyewitness evidence. Improving its probative value. Psychological Science in the Public Interest, 7(2), 45–75.

Author information

Corresponding author

Correspondence to Berenike Waubert de Puiseau.

Ethics declarations

All procedures performed in the study reported in this manuscript were in accordance with the 1964 Helsinki declaration and its later amendments. Informed consent was obtained from all individual participants included in the study. The authors declare that they have no conflict of interest.

Additional information

B. Waubert de Puiseau and S. Greving contributed equally to this work.

Appendix: Formalization of the general Condorcet model

In a witness recognition experiment, N witnesses first observe a crime and then make recognition judgments about M statements regarding their observations. In the two-high-threshold model (2-HTM), responses are modelled as a function of a witness’s competence, \(D_{i}\), \(i \in \{1, \ldots, N\}\), and the witness’s tendency to guess that a statement is “true”, \(g_{i}\), when the witness does not know the answer. Each witness is assumed to judge a statement as “true” if the witness believes that a detail has occurred, or as “false” if the witness believes that a detail has not occurred. Thus, “true” responses can be classified either as hits if the statement is true or as false alarms if the statement is false. In the 2-HTM, hits occur either because the witness remembers the relevant fact with probability \(D_{i}\) or because the witness does not remember the relevant fact with probability \(1 - D_{i}\) but guesses correctly with probability \(g_{i}\). Competence and guessing bias are assumed to be constant across questions. Thus, the probability of a hit, \(H_{i}\), is \(H_{i} = D_{i} + (1 - D_{i})g_{i}\). False alarms are assumed to occur when the witness does not remember the relevant fact with probability \(1 - D_{i}\) and then incorrectly guesses “true” with probability \(g_{i}\). Thus, the probability of a false alarm, \(F_{i}\), can be computed as \(F_{i} = (1 - D_{i})g_{i}\). Solving these equations for \(D_{i}\) and \(g_{i}\) yields:

$$D_{i} = H_{i} - F_{i},$$
(1)

and

$$g_{i} = \frac{{F_{i}}}{{\left({1 - H_{i} + F_{i}} \right)}}.$$
(2)

Using the observed hit and false alarm rates as estimates of \(H_{i}\) and \(F_{i}\), respectively, a witness’s competence and guessing bias can be estimated with Eqs. (1) and (2).
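
The estimation step described by Eqs. (1) and (2) can be illustrated with a short sketch. The function name `two_htm_estimates` and the example rates are assumptions made for this illustration; they do not come from the study's data or code.

```python
def two_htm_estimates(hit_rate, false_alarm_rate):
    """Estimate competence D and guessing bias g from Eqs. (1) and (2).

    hit_rate and false_alarm_rate are a witness's observed proportions of
    "true" responses to true and to false statements, respectively.
    """
    competence = hit_rate - false_alarm_rate            # Eq. (1)
    denominator = 1.0 - hit_rate + false_alarm_rate     # equals 1 - D
    if denominator == 0:
        # With D = 1 the witness never guesses, so g cannot be estimated.
        raise ValueError("guessing bias is undefined when competence is 1")
    guessing_bias = false_alarm_rate / denominator      # Eq. (2)
    return competence, guessing_bias


# Hypothetical witness: 80% hits and 20% false alarms.
D_hat, g_hat = two_htm_estimates(0.8, 0.2)

# Plugging the estimates back into the model recovers the observed rates.
hit_check = D_hat + (1 - D_hat) * g_hat
false_alarm_check = (1 - D_hat) * g_hat
```

For these example rates the estimates are approximately D = .6 and g = .5, that is, a witness who knows 60% of the answers and guesses without bias otherwise.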

Computing competence and guessing bias in the GCM is more complex because the answer key is unknown. The GCM, therefore, extends the 2-HTM by adding a latent variable, the answer key \(Z = (Z_{k})_{M}\), which is a vector of correct responses for items \(k \in \{1, \ldots, M\}\):

$$Z_{k} = \begin{cases} 1, & \text{if the correct judgment of item } k \text{ is ``true''} \\ 0, & \text{if the correct judgment of item } k \text{ is ``false''} \end{cases}$$
(3)

Further, the GCM (Karabatsos & Batchelder, 2003; Oravecz et al., 2014) includes another latent variable, the difficulty of item k, \(\delta_{k}\), with \(0 < \delta_{k} < 1\). Taking item difficulty into account, Karabatsos and Batchelder (2003) define the probability of witness i knowing the correct response to item k as

$$D_{ik} = \frac{{\theta_{i} \left({1 - \delta_{k}} \right)}}{{\theta_{i} \left({1 - \delta_{k}} \right) + \left({1 - \theta_{i}} \right)\delta_{k}}},$$
(4)

where \(\theta_{i}\) denotes the competence of witness i, independent of item difficulty, with \(0 < \theta_{i} < 1\) (see Note 6).
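
To make Eq. (4) concrete, the following sketch evaluates it for one witness and three items. The function name `knowledge_probability` and the parameter values are assumptions chosen for illustration only.

```python
def knowledge_probability(theta, delta):
    """Eq. (4): probability that a witness with competence theta knows the
    correct response to an item with difficulty delta."""
    if not (0 < theta < 1 and 0 < delta < 1):
        raise ValueError("theta and delta must lie strictly between 0 and 1")
    numerator = theta * (1 - delta)
    return numerator / (numerator + (1 - theta) * delta)


# Hypothetical witness with competence .7 facing items of varying difficulty.
d_neutral = knowledge_probability(0.7, 0.5)  # delta = .5: item terms cancel
d_hard = knowledge_probability(0.7, 0.8)     # higher delta lowers D_ik
d_easy = knowledge_probability(0.7, 0.2)     # lower delta raises D_ik
```

For an item with difficulty .5 the item terms cancel and the result equals the witness's competence, which is why an item fixed at that value serves as a natural reference point.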

On the basis of these equations, the GCM defines the probability that witness i correctly recognizes statement k as:

$$p_{ik} = D_{ik}^{{Z_{k}}} + g_{i} \left({1 - D_{ik}} \right)\left({2Z_{k} - 1} \right).$$
(5)
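
Eq. (5) folds two cases into one expression. The sketch below evaluates both cases separately; the function name `correct_response_probability` and the parameter values are illustrative assumptions.

```python
def correct_response_probability(d_ik, g_i, z_k):
    """Eq. (5): probability that witness i responds correctly to item k.

    d_ik is the probability of knowing the answer (Eq. 4), g_i the bias
    towards guessing "true", and z_k the answer key entry (1 or 0).
    """
    if z_k not in (0, 1):
        raise ValueError("z_k must be 0 or 1")
    return d_ik ** z_k + g_i * (1 - d_ik) * (2 * z_k - 1)


# Hypothetical values: D_ik = .6 and a guessing bias of .25.
p_true_item = correct_response_probability(0.6, 0.25, 1)   # D + g(1 - D)
p_false_item = correct_response_probability(0.6, 0.25, 0)  # 1 - g(1 - D)
```

For a true item the expression reduces to the hit probability of the 2-HTM; for a false item it reduces to one minus the false alarm probability, that is, the probability of a correct rejection.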

Estimates of the latent parameters competence \(\theta_{i}\), guessing bias \(g_{i}\), item difficulty \(\delta_{k}\), and the answer key \(Z_{k}\) are determined simultaneously from the response matrix \(\mathbf{X} = (X_{ik})_{N \times M}\), where

$$X_{ik} = \begin{cases} 1, & \text{if witness } i \text{ answers ``true'' to item } k \\ 0, & \text{if witness } i \text{ answers ``false'' to item } k \end{cases}$$
(6)

We used the Markov-chain-Monte-Carlo procedure described by Karabatsos and Batchelder (2003) to find parameter estimates that maximize the likelihood function

$$L(\mathbf{X} \mid \varOmega) = \prod_{i = 1}^{N} \prod_{k = 1}^{M} p_{ik}^{Z_{k} X_{ik} + (1 - Z_{k})(1 - X_{ik})} \times (1 - p_{ik})^{Z_{k}(1 - X_{ik}) + (1 - Z_{k}) X_{ik}},$$
(7)

where \(\varOmega = \left\{\theta_{\langle i = 1, \ldots, N \rangle}, g_{\langle i = 1, \ldots, N \rangle}, \delta_{\langle k = 1 \rangle}, Z_{\langle k = 1, \ldots, M \rangle}\right\}\) are the parameters of the GCM. More detailed descriptions of the 2-HTM and the GCM can be found elsewhere (cf. Aßfalg & Erdfelder, 2012; Batchelder & Romney, 1986, 1988, 1989; Karabatsos & Batchelder, 2003; Oravecz et al., 2014; Romney, Batchelder, & Weller, 1987; Romney et al., 1986).
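
The likelihood in Eq. (7) can be evaluated for a given set of parameter values as sketched below. This only computes the (log-)likelihood; it does not implement the Markov-chain-Monte-Carlo procedure of Karabatsos and Batchelder (2003). The function name `gcm_log_likelihood` and all toy values are assumptions made for illustration.

```python
import math


def gcm_log_likelihood(X, Z, theta, g, delta):
    """Logarithm of Eq. (7) for response matrix X and given parameters.

    X[i][k] is witness i's 0/1 response to item k, Z[k] the answer key,
    theta[i] and g[i] the witness parameters, delta[k] the item difficulty.
    All theta, g, and delta values are assumed to lie strictly in (0, 1).
    """
    total = 0.0
    for i, row in enumerate(X):
        for k, x_ik in enumerate(row):
            numerator = theta[i] * (1 - delta[k])
            d_ik = numerator / (numerator + (1 - theta[i]) * delta[k])  # Eq. (4)
            p_ik = d_ik ** Z[k] + g[i] * (1 - d_ik) * (2 * Z[k] - 1)    # Eq. (5)
            # The exponent of p_ik in Eq. (7) is 1 when the response
            # matches the answer key and 0 otherwise.
            matches_key = Z[k] * x_ik + (1 - Z[k]) * (1 - x_ik)
            total += math.log(p_ik) if matches_key else math.log(1 - p_ik)
    return total


# Hypothetical toy example: 2 witnesses, 2 items, difficulties fixed at .5.
X_toy = [[1, 0],
         [1, 1]]
log_lik = gcm_log_likelihood(
    X_toy, Z=[1, 0], theta=[0.8, 0.6], g=[0.5, 0.5], delta=[0.5, 0.5]
)
```

Working on the log scale avoids numerical underflow when the product in Eq. (7) runs over many witnesses and items.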

Cite this article

Waubert de Puiseau, B., Greving, S., Aßfalg, A. et al. On the importance of considering heterogeneity in witnesses’ competence levels when reconstructing crimes from multiple witness testimonies. Psychological Research 81, 947–960 (2017). https://doi.org/10.1007/s00426-016-0802-1

  • DOI: https://doi.org/10.1007/s00426-016-0802-1
