Volume 33, Issue 2, pp 323–337

Assessing the Integrity of Clinical Data: When is Statistical Evidence Too Good to be True?

  • Margaret MacDougall


Evidence, as viewed through the lens of statistical significance, is not always as it appears! In the investigation of clinical research findings arising from statistical analyses, a fundamental initial step for the emerging fraud detective is to retrieve the source data for cross-examination against the study data. Recognizing that source data are not always forthcoming and that, realistically speaking, the investigator may be uninitiated in fraud detection and investigation, this paper highlights some key methodological procedures for providing a sounder evidence base for withdrawing from a study on grounds of integrity. The promotion of patient safety is paramount. However, there is a broader rationale for disseminating these ideas. This includes empowering researchers to optimize their personal integrity, make informed choices regarding membership of future research collaborations and successfully voice their concerns to journal editors, especially where a conflict of interests can render such dialogues difficult. Recommendations are supported by topical case studies and practical steps involving data exploration, testing of baseline data and application of Benford’s Law. While this paper has a clinical focus, the advice provided is transferable to a wide range of multidisciplinary research settings outside of Medicine.
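As a rough illustration of the last of these steps, the sketch below compares the leading digits of a data set with the proportions expected under Benford’s Law, P(d) = log10(1 + 1/d) for d = 1, ..., 9, using a chi-square goodness-of-fit statistic. It is a minimal sketch rather than the procedure used in the paper: the function names (`benford_expected`, `leading_digit`, `benford_chi_square`) are hypothetical, and the choice of a chi-square statistic is an assumption made for illustration.

```python
import math
from collections import Counter


def benford_expected():
    """Expected leading-digit proportions under Benford's Law: log10(1 + 1/d)."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def leading_digit(x):
    """Return the first significant digit of a non-zero number."""
    # Scientific notation places the first significant digit first,
    # e.g. 0.0042 -> '4.200000000000000e-03'.
    return int(f"{abs(x):.15e}"[0])


def benford_chi_square(values):
    """Chi-square statistic (8 degrees of freedom) for leading digits vs Benford.

    Zeros are skipped because they have no leading significant digit.
    Returns the statistic and the number of values used.
    """
    digits = [leading_digit(v) for v in values if v != 0]
    n = len(digits)
    observed = Counter(digits)
    stat = sum(
        (observed.get(d, 0) - n * p) ** 2 / (n * p)
        for d, p in benford_expected().items()
    )
    return stat, n


# Illustrative inputs (not study data): powers of 2 are known to follow
# Benford's Law closely, whereas the integers 100..999 have uniform leading digits.
benford_like, n_benford = benford_chi_square([2 ** k for k in range(1, 101)])
uniform_like, n_uniform = benford_chi_square(range(100, 1000))
```

The statistic can be compared with the 5% critical value of the chi-square distribution on 8 degrees of freedom, 15.507. A large value indicates only that the leading digits depart from Benford’s Law; many legitimate data sets, such as measurements confined to a narrow range, are not expected to follow it.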


Baseline data; Benford’s law; Cluster analysis; Mahalanobis distance; Scientific fraud; Statistical evidence



Thanks are due to the following persons for highlighting useful references: Mr Marc Schwartz, Biostatistics MedNet Solutions (who also provided valuable advice on the detection of scientific fraud); Dr Pedro Emmanuel A A do Brasil, Oswaldo Cruz Foundation; Dr Gordon B Drummond, University of Edinburgh; Professor Jose Maisog, GLOTECH, Inc; and Dr Lisa M McShane, National Cancer Institute. I am also most grateful to the anonymous reviewers, whose insightful and constructive comments enhanced the quality of this paper. Some of the content of this paper was presented by the author at the September 2012 conference Evidence and Causality in the Sciences, which was hosted by the University of Canterbury, UK and organized by Drs Phyllis Illari and Federica Russo.



Copyright information

© Springer Science+Business Media Dordrecht 2014

Authors and Affiliations

  1. Medical Statistician and Researcher in Education, Centre for Population Health Sciences, College of Medicine and Veterinary Medicine, University of Edinburgh, Edinburgh, UK
