Assessing the Integrity of Clinical Data: When is Statistical Evidence Too Good to be True?

Abstract

Evidence, as viewed through the lens of statistical significance, is not always as it appears! In the investigation of clinical research findings arising from statistical analyses, a fundamental initial step for the emerging fraud detective is to retrieve the source data for cross-examination with the study data. Recognizing that source data are not always forthcoming and that, realistically speaking, the investigator may be uninitiated in fraud detection and investigation, this paper will highlight some key methodological procedures for providing a sounder evidence base for withdrawing from a study on grounds of integrity. The promotion of patient safety is paramount. However, there is a broader rationale for disseminating these ideas. This includes empowering researchers to optimize their personal integrity, make informed choices regarding membership of future research collaborations and successfully voice their concerns to journal editors, especially where a conflict of interest can make such dialogues difficult. Recommendations will be supported by topical case studies and practical steps involving data exploration, testing of baseline data and application of Benford’s Law. While this paper has a clinical focus, the advice provided is transferable to a wide range of multidisciplinary research settings outside of Medicine.
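
As a minimal illustration of the kind of Benford’s Law check mentioned above (a sketch, not the paper's own procedure), the Python code below compares the observed leading digits of a data set with the Benford proportions P(d) = log10(1 + 1/d), d = 1, ..., 9, using a Pearson chi-square statistic. The function names and the choice of a chi-square comparison are assumptions made for this example; 15.507 is the 5% critical value of the chi-square distribution on 8 degrees of freedom.

```python
import math
from collections import Counter

# 5% critical value of the chi-square distribution with 8 degrees of freedom
CHI2_CRIT_8DF_5PCT = 15.507


def benford_expected():
    """Expected leading-digit proportions under Benford's Law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def first_digit(x):
    """Leading non-zero digit of a non-zero finite number."""
    # Scientific notation places the leading digit first, e.g. '3.2...e-02'.
    return int(f"{abs(x):.16e}"[0])


def benford_chi_square(values):
    """Pearson chi-square statistic of observed vs. Benford leading digits."""
    digits = [first_digit(v) for v in values if v != 0]  # zeros have no leading digit
    n = len(digits)
    if n == 0:
        raise ValueError("no non-zero values supplied")
    observed = Counter(digits)
    return sum(
        (observed.get(d, 0) - n * p) ** 2 / (n * p)
        for d, p in benford_expected().items()
    )


# Illustrative usage with made-up values (hypothetical, not study data).
values = [2 ** k for k in range(1000)]
flagged = benford_chi_square(values) > CHI2_CRIT_8DF_5PCT
```

A large statistic flags a departure worth investigating rather than proof of fabrication: Benford’s Law is only a sensible reference for quantities spanning several orders of magnitude, which many clinical measurements do not.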

Notes

  1. One virtue of more recent versions of the statistical package SPSS (Statistical Package for the Social Sciences) is that they allow the user to conveniently navigate, via a single menu command, from an outlier in a boxplot to the corresponding row(s) in the underlying data file so that the overall profile(s) of patient(s) with extremes can be surveyed with a view to finding plausible reasons for anomalies in their clinical measurements.

  2. The genuineness of any research or writing partnership implied by the majority of Fujii’s co-authored papers has been openly discredited (http://retractionwatch.wordpress.com/).

  3. The available lists of guidelines have since been extended to include, among others, SPIRIT: Standard Protocol Items—Recommendations for Interventional Trials (http://www.spirit-statement.org/) and adapted to accommodate different types of randomized controlled trials (Campbell et al. 2012; Moher et al. 2010; Piaggio et al. 2006). More recently, with input from the REMARK committee, among others, the American Cancer Society have published the Biospecimen Reporting for Improved Study Quality (BRISQ) guidelines. These guidelines are for the reporting of all biomedical research specifically based on human biospecimens. The emphasis is on accuracy and comprehensiveness, while providing scope for refinement and re-weighting according to the study context and progress in medical science (Moore et al. 2011). The EQUATOR (Enhancing the QUAlity and Transparency Of health Research) network have also published advice on guidelines development (http://www.equator-network.org/).
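
Outside SPSS, the step described in Note 1, moving from a boxplot outlier to the corresponding data rows, can be approximated in a few lines. The Python sketch below is illustrative only: it assumes the data are held as a list of dictionaries, the names `boxplot_outlier_rows`, `patient` and `sbp` are hypothetical, the sample values are invented, and quartiles come from `statistics.quantiles` with the inclusive method, which may differ slightly from the quartile definition SPSS uses for its boxplots.

```python
from statistics import quantiles


def boxplot_outlier_rows(records, field, k=1.5):
    """Return (row index, record) pairs lying outside Tukey's boxplot fences."""
    values = [r[field] for r in records]
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr  # conventional 1.5 x IQR fences
    return [(i, r) for i, r in enumerate(records) if not low <= r[field] <= high]


# Invented illustrative values (hypothetical, not study data).
records = [
    {"patient": "A", "sbp": 5.1},
    {"patient": "B", "sbp": 4.8},
    {"patient": "C", "sbp": 5.0},
    {"patient": "D", "sbp": 5.3},
    {"patient": "E", "sbp": 4.9},
    {"patient": "F", "sbp": 5.2},
    {"patient": "G", "sbp": 12.0},
]

# Each flagged row carries the full record, so the patient's whole profile
# can be inspected for a plausible explanation of the extreme value.
outliers = boxplot_outlier_rows(records, "sbp")
```

Returning the whole record rather than just the extreme value mirrors the purpose of the SPSS feature: the anomaly is judged against the rest of that patient's profile.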

References

  1. Abelson RP (1995) Statistics as principled argument. Lawrence Erlbaum Associates, Hillsdale

  2. Acuña E, Rodriguez C (2004) A Meta analysis study of outlier detection methods in classification. Paper presented at the IPSI 2004, Venice, Italy. http://academic.uprm.edu/eacuna/paperout.pdf

  3. Akhtar-Danesh N, Dehghan-Kooshkghazi M (2003) How does correlation structure differ between real and fabricated data-sets? BMC Med Res Methodol 3:18

  4. Allmark P (2001) Is it in a neonate’s best interest to enter a randomised controlled trial? J Med Ethics 27(2):110–113

  5. Al-Marzouki S, Evans S, Marshall T, Roberts I (2005) Are these data real? Statistical methods for the detection of data fabrication in clinical trials. Br Med J 331:268–270

  6. Altman D (1985) Comparability of randomised groups. Statistician 34:125–136

  7. Altman D, Schulz KF, Moher D, Egger M, Davidoff F, Elbourne D, Gøtzsche PC, Lang T (2001) The revised CONSORT statement for reporting randomized trials: explanation and elaboration. Ann Intern Med 134(8):663–694

  8. Baggerly KA, Coombes KR (2010) Deriving chemosensitivity from cell lines. The Annals of Applied Statistics 3(4):1309–1334

  9. Bossuyt P, Reitsma B, Bruns D, Gatsonis C, Glasziou P, Irwig L, Lijmer JG (2003) The STARD statement for reporting studies of diagnostic accuracy: explanation and elaboration. Clinical Chemistry 49(1):7–18

  10. Buyse M, George SL, Evans S, Geller NL, Ranstam J, Scherrer B, Verma BL (1999) The role of biostatistics in the prevention, detection and treatment of fraud in clinical trials. Stat Med 18:3435–3451

  11. Campbell M, Piaggio G, Elbourne D, Altman D (2012) CONSORT 2010 statement: extension to cluster randomised trials. Br Med J 345:e5661

  12. Carlisle JB (2012) The analysis of 168 randomised controlled trials to test data integrity. Anaesthesia 67:521–537. doi: 10.1111/j.1365-2044.2012.07128.x

  13. Cartwright N (2011) A philosopher’s view of the long road from RCTs to effectiveness. The Lancet 377(9775):1400–1401

  14. Kolata G (2011) How bright promise in cancer testing fell apart. The New York Times. Retrieved from http://www.nytimes.com/2011/07/08/health/research/08genes.html?_r=0

  15. Colbert AP (2004) How useful are randomized placebo-controlled clinical trials to acupuncturists? Medical Acupuncture 16(1):12–13

  16. de Vocht F, Kromhout H (2012) The use of Benford’s law for evaluation of quality of occupational hygiene data. The Annals of Occupational Hygiene, Advance access: Sep 2012. doi:10.1093/annhyg/mes067

  17. Deception at Duke: Fraud in cancer care? (2012). Retrieved from http://www.cbsnews.com/8301-18560_162-57376073/deception-at-duke/

  18. Evans S (2001) Statistical aspects of the detection of fraud. In: Lock S, Wells F, Farthing M (eds) Fraud and Misconduct in Medical Research, 3rd edn. BMJ Publishing Group, London, pp 186–204

  19. Everitt BS, Landau S, Leese M, Stahl D (2012) Cluster Analysis, 5th edn. Wiley & Sons, West Sussex, UK

  20. Grant J (2007) Corrupted Science: Fraud, Ideology and Politics in Science. Artists’ and Photographers’ Press Ltd, Surrey

  21. Ince D (2011) The Duke University scandal—what can be done? Significance 8(3):113–115

  22. Kang M, Ragan BG, Park J-H (2008) Issues in outcomes research: an overview of randomization techniques for clinical trials. Journal of Athletic Training 43(2):215–221

  23. Kranke P (2012) Putting the record straight: granisetron’s efficacy as an antiemetic ‘post-Fujii’. Anaesthesia 67:1–5

  24. Kranke P, Apfel CC, Roewer N (2000) Reported data on granisetron and postoperative nausea and vomiting by Fujii et al. are incredibly nice! Anesth Analg 90(4):1004

  25. MacDougall M (2010) Threshold concepts in statistics and online discussion as a basis for curriculum innovation in undergraduate medicine. MSOR Connections 10(3):21–41. doi:10.11120/msor.2010.10030021

  26. McShane L, Altman D, Sauerbrei W, Taube S, Gion M, Clark G (2005) Reporting recommendations for tumour marker prognostic studies (REMARK). J Natl Cancer Inst 97:1180–1184

  27. Micheel CM, Nass SJ, Omenn GS (eds) (2012) Evolution of translational omics: lessons learned and the path forward. Committee on the Review of Omics-Based Tests for Predicting Patient Outcomes in Clinical Trials, Board on Health Care Services, Board on Health Sciences Policy, Institute of Medicine. Retrieved from http://www.nap.edu/

  28. Misconduct in science: an array of errors (2011) The Economist. Retrieved from http://www.economist.com/node/21528593

  29. Moher D, Hopewell S, Schulz K, Montori V, Gøtzsche P, Devereaux P, Altman D (2010) CONSORT 2010 explanation and elaboration: updated guidelines for reporting parallel group randomised trials. J Clin Epidemiol 63(8):e1–37

  30. Moore HM, Kelly AB, Jewell SD, McShane LM, Clark DP, Greenspan R, Vaught J (2011) Biospecimen Reporting for Improved Study Quality (BRISQ). Cancer Cytopathology 119:92–101

  31. NCI (Producer) (2013) NCI Board of Scientific Advisors Meeting—March 2013. The 53rd meeting of the NCI Board of Scientific Advisors. Retrieved from http://videocast.nih.gov/summary.asp?file=17833&bhcp=1

  32. Nigrini M (1999) I’ve got your number. Journal of Accountancy 187(5):79–83

  33. Nigrini M, Mittermaier L (1997) The use of Benford’s law as an aid in analytical procedures. Auditing: A Journal of Practice and Theory 16:52–67

  34. Overall JE, Gorham DR (1962) The brief psychiatric rating scale. Psychol Rep 10:799–812

  35. Papineau D (1994) The virtues of randomization. British Journal for the Philosophy of Science 45:437–450

  36. Piaggio G, Elbourne DR, Altman DG, Pocock SJ, Evans SJ (2006) Reporting of noninferiority and equivalence randomized trials: an extension of the CONSORT statement. J Am Med Assoc 295(10):1152–1160

  37. Singleton TW (2011) Understanding and applying Benford's law. ISACA J 3:6–9. http://www.isacajournal-digital.org/isacajournal/2011vol3#pg8

  38. Smith CA (2001) Detecting anomalies in your data using Benford’s Law. Paper presented at the Midwest SAS User Group 2001 Proceedings, Kansas City, Missouri

  39. Stroup DF, Berlin JA, Morton SC, Olkin I, Williamson GD, Rennie D, Moher D, Thacker SB (2000) Meta-analysis of observational studies in epidemiology: a proposal for reporting. J Am Med Assoc 283(15):2008–2012

  40. Taylor RN, McEntegart DJ, Stillman EC (2002) Statistical techniques to detect fraud and other data irregularities in clinical questionnaire data. Drug Information Journal 36:115–125

  41. Tukey J (1977) Exploratory data analysis. Addison-Wesley Publishing Company, Reading

  42. Westfall RS (1973) Newton and the fudge factor. Science 179(4075):751–758

  43. Wheeler G (2011) The trouble with Bayes’ theorem—the simple and the serious. Significance, 01 Sep (Olympics Special Issue)

  44. Wu X, Carlsson M (2011) Detecting data fabrication in clinical trials from cluster analysis perspective. Pharmaceutical Statistics 10:257–264

Acknowledgments

Thanks are due to the following persons for highlighting useful references: Mr Marc Schwartz, Biostatistics MedNet Solutions (who also provided valuable advice on the detection of scientific fraud); Dr Pedro Emmanuel A A do Brasil, Oswaldo Cruz Foundation; Dr Gordon B Drummond, University of Edinburgh; Professor Jose Maisog, GLOTECH, Inc; and Dr Lisa M McShane, National Cancer Institute. I am also most grateful to the anonymous reviewers who provided insightful and constructive comments which enhanced the quality of this paper. Some of the content of this paper was presented by the author at the September 2012 conference Evidence and Causality in the Sciences, which was hosted by the University of Kent, Canterbury, UK and organized by Drs Phyllis Illari and Federica Russo.

Author information

Correspondence to Margaret MacDougall.

About this article

Cite this article

MacDougall, M. Assessing the Integrity of Clinical Data: When is Statistical Evidence Too Good to be True? Topoi 33, 323–337 (2014). https://doi.org/10.1007/s11245-013-9216-5

Keywords

  • Baseline data
  • Benford’s law
  • Cluster analysis
  • Mahalanobis distance
  • Scientific fraud
  • Statistical evidence