Hercule Poirot: “I gave to you the clues and every chance to discover the truth…”

Agatha Christie, Curtain: Poirot’s Last Case (1975)

Well-conducted randomized clinical trials have now become mandatory to confirm the effects of new therapies and to obtain marketing approval from the healthcare authorities. Even after registration and filing, post-marketing, investigator-driven clinical trials are often conducted to provide repeated confirmation of a new drug’s efficacy and safety. Although stringent regulations have long been imposed on registration trials, rules for the conduct of post-marketing trials have not been standardized, and the lack of oversight of these studies has occasionally led to investigator misconduct. In Japan, for instance, a major case of data fabrication was recently disclosed by a whistleblower in an investigator-driven clinical trial of an antihypertensive blockbuster drug. This case is by no means isolated [1]. More rigorous, transparent, and well-conducted post-marketing clinical trials are needed, as mandated by the new ethical guideline enacted by the Japanese Ministry of Health, Labour and Welfare in April 2015.

Concerns over the possibility of fraud or serious errors are exacerbated when new therapies show remarkable effects, with clinical trial results “too good to be true.” In gastric cancer, the ACTS-GC clinical trial showed outstanding efficacy of the oral fluorinated pyrimidine S-1 as adjuvant chemotherapy for Japanese patients after curative tumor resection [2]. Although the result of this study was overwhelmingly positive, S-1 has not been widely used as adjuvant chemotherapy of gastric cancer in the Western world. The SAMIT study set out to test two independent hypotheses, using a factorial design: (1) that sequential administration of paclitaxel and oral fluorinated pyrimidine (either S-1 or UFT) results in longer disease-free survival than oral fluorinated pyrimidine alone and (2) that UFT is not inferior to S-1 in terms of disease-free survival. As it turns out, the study failed to show superiority of sequential therapy, but it did show superiority of S-1 as compared with UFT [3]. Since two independent large-scale trials (ACTS-GC, n = 1056 and SAMIT, n = 1495) have now demonstrated significant benefits for S-1 in the adjuvant setting, this drug can be considered standard adjuvant chemotherapy for gastric cancer in Japan.

The SAMIT trial was conducted using standard monitoring and auditing procedures; hence, there was no particular reason to doubt the quality of the data. The Steering Committee of SAMIT, however, recognized that traditional approaches to quality control using source data verification and data management checks may not be fully effective, so it decided to seek advice from an independent third party to investigate the quality of the data before submitting a paper with the trial results for publication. All individual patient data were sent to CluePoints, a company that is pioneering the use of central statistical monitoring (CSM) to provide an objective and exhaustive assessment of data quality and consistency in multicenter clinical trials. In this issue of Gastric Cancer, a group of investigators from CluePoints and several universities in Belgium report on the practical use of the SMART™ CSM software in this large randomized phase III trial for gastric cancer [4].

The methods of CSM are technically complex, but the concept behind the approach is stunningly simple: it consists of identifying centers whose data differ significantly from the data provided by all other centers. For instance, in the SAMIT trial, one center was identified as atypical because the treatment starting date was a Saturday for all six patients treated at this center. An investigation revealed that this center was fully open on Saturdays, and therefore this inconsistency in the data could be explained. Data inconsistencies found at other centers included small variability in the drug volume, similar trends for blood-test-related measurements and symptom grades, failure to enter the last date the patient was reported to be relapse-free, and other missing values [4]. None of these data patterns were suspected to have had any impact on the outcome of the trial, nor was there any evidence of misconduct that could have affected the safety or the management of the patients at these centers. The analyses performed by CluePoints provided reassurance that the clinical trial protocol had been consistently followed across all participating centers [3].
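The idea of comparing one center with all the others can be illustrated with the Saturday example. The following is a minimal sketch, not the method implemented in the CSM software: it assumes a simple one-sided binomial test, and the counts for the other centers (15 Saturday starts among 1489 patients) are hypothetical numbers chosen for illustration only. The function names are likewise assumptions.

```python
from math import comb


def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def saturday_start_p_value(
    center_saturdays: int, center_n: int, other_saturdays: int, other_n: int
) -> float:
    """One-sided P value for a center having this many Saturday starts.

    The reference proportion is estimated from all other centers, so the
    question asked is: if this center behaved like the rest, how likely
    would it be to see at least this many Saturday starts?
    """
    p_ref = other_saturdays / other_n
    return binomial_upper_tail(center_saturdays, center_n, p_ref)


# Six of six patients started on a Saturday at the center of interest (from
# the text). The counts for the other centers are HYPOTHETICAL.
p = saturday_start_p_value(6, 6, other_saturdays=15, other_n=1489)
print(f"P value for the atypical center: {p:.3g}")
```

A very small P value marks the center as atypical, but, as the Saturday example shows, atypical does not mean fraudulent: the signal only tells the trial team where to look.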

How does CSM actually work? In the approach implemented by CluePoints, all variables collected in the case report form are analyzed using a variety of statistical tests to detect inconsistencies with respect to (1) reporting, (2) data tendency, (3) visit-to-visit evolution, and (4) dates. In the SAMIT analysis, these tests generated nearly 50,000 quality control-checked P values from over 60,000 statistical tests. Centers were considered to be outlying if they had either a large overall “data inconsistency score” (i.e., many P values pointing to atypical data) or at least one statistic with an extreme P value (i.e., P < 10⁻⁵). All outlying centers were further scrutinized to identify the cause of the atypical data observed in these centers. Such an approach is aligned with recent guidance documents on risk-based monitoring from the US Food and Drug Administration and the European Medicines Agency [5, 6], both of which recommend focusing efforts on “the most critical data elements and processes necessary to achieve the study objectives”. In ongoing studies, CSM can complement and/or replace conventional monitoring methods, but even in completed studies CSM can be used as part of quality assurance to ensure trial validity [7].
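The two-part flagging rule described above can be sketched in a few lines. This is an illustration under stated assumptions, not the scoring used by the CSM software: the text does not give the formula for the data inconsistency score, so the sketch substitutes the mean of −log10(P) across a center's tests, and the score threshold of 1.3 is hypothetical. Only the extreme P value cutoff of 10⁻⁵ comes from the text. The center names and P values are made up.

```python
from math import log10

EXTREME_P = 1e-5        # cutoff for a single extreme P value (from the text)
SCORE_THRESHOLD = 1.3   # HYPOTHETICAL cutoff for the overall score


def inconsistency_score(p_values: list[float]) -> float:
    """Assumed score: mean of -log10(P) over all tests for one center.

    Many small P values push the score up; P values near 1 contribute ~0.
    """
    floor = 1e-300  # guard against log10(0)
    return sum(-log10(max(p, floor)) for p in p_values) / len(p_values)


def is_outlying(
    p_values: list[float],
    score_threshold: float = SCORE_THRESHOLD,
    extreme_p: float = EXTREME_P,
) -> bool:
    """Flag a center if its overall score is large OR any single P is extreme."""
    return (
        inconsistency_score(p_values) > score_threshold
        or min(p_values) < extreme_p
    )


# Made-up P values for three hypothetical centers.
centers = {
    "A": [0.40, 0.75, 0.12, 0.91],                           # unremarkable
    "B": [0.03, 0.01, 0.02, 0.04],                           # many smallish P values
    "C": [0.50, 0.80, 1e-7, 0.60, 0.90, 0.70, 0.85, 0.95],   # one extreme P value
}

flagged = sorted(name for name, ps in centers.items() if is_outlying(ps))
print("Centers to scrutinize:", flagged)
```

Center B is flagged through the overall score and center C through a single extreme P value, which mirrors the two routes to "outlying" status described in the text. A flag is a prompt for further scrutiny, not a finding of misconduct.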

Why is CSM important in post-marketing clinical trials such as SAMIT? These trials often involve large numbers of sites and patients, and as a result they require extravagant budgets if conventional approaches are adopted to conduct them. In such cases, budgets need to be revised with cost-efficiency in mind, and processes that do not provide value for money must be replaced by novel approaches that are less costly and more effective. As in many other industries, novel approaches to clinical research will increasingly replace manpower with computer power. Highly labor-intensive activities such as source data verification during on-site visits will be replaced by highly computer-intensive methods such as central statistical monitoring. While one may lament the jobs lost in the process, one must at the same time rejoice over the accompanying improvements in the quality of the trial data and the reliability of the trial results. Post-marketing trials cannot simply be conducted at lower cost. The temptation for investigators to manipulate the data for either promotion or profit is too acute to be ignored [1]. Also, the increasing complexity of clinical trials could make fraudulent behavior increasingly difficult to uncover. We cannot loosen the standards of clinical research merely for financial reasons and run the risk of fabricated evidence influencing clinical guidelines and leading to changes in standards of care. What we need instead is a detective in the room, a Hercule Poirot of clinical research. However, there are two major differences from Agatha Christie’s novels. First, computer software will now provide the detective’s gray cells with unprecedentedly powerful clues. Second, in most cases no crime will have been committed. We would then expect Poirot to find no clues and to state, stroking his upward-curled moustache: “Now, that also is something to celebrate, n’est-ce pas?” (The Million Dollar Bond Robbery, TV Episode, 1991).