The Web-Recognising Adverse Drug Reactions (WEB-RADR) project developed an app for spontaneous reporting of ADRs, based on a simplified reporting form. This WEB-RADR app was used to collect the app reports analysed in this study. The form contained questions about the patient, the suspected drug(s), the ADR(s) and medical history, and is fully compatible with the ICH-E2B(R2) format, a current standard for the electronic transfer of ADR reports. In designing the app, we aimed to create a reporting form that would be easy to use and quick to complete on a mobile device. Reducing the number of structured fields and allowing more free text drastically reduced the number of questions to answer. Letting the answer to a previous question trigger the next one reduced the burden on the reporter further. For example, the conventional web form asks several separate questions about the ADR, whereas in the app these can be answered more freely in a single text field. This type of simplification is used throughout the form. Besides the reporting functionality, the app also provided drug safety information. Users were able to create a list of medications on which to receive news from the national competent authority (NCA), and to view the numbers of reports received by the NCA.
The app was launched in three countries: the UK (July 2015), the Netherlands (January 2016) and Croatia (May 2016). It was promoted through social media in the UK, and through a press release and information on the Lareb website in the Netherlands. In Croatia, the app was introduced at a press conference and received national media attention.
Data Extraction and Processing
The data collection period ran from the launch date of the app in each country to 1 September 2016. All reports within the time period for each country were extracted from VigiBase, the World Health Organization global database of individual case safety reports. The data were filtered to meet the following criteria: (1) report type “spontaneous”, to exclude study reports; (2) a single reporter who is either a patient or a healthcare professional; and (3) the earliest available version of each report, to avoid introducing a bias owing to differing follow-up procedures. Reporter type “patient” was defined as the category “consumer/non-health professional”, and reporter type “healthcare professional” as “physician”, “pharmacist” or “other health professional” in the ICH-E2B reporting standard. Reports from marketing authorisation holders and automatically generated reports from external sources such as poison control centres were excluded because these types of reports could not be collected through the WEB-RADR app. A reference sample of reports from conventional sources was collected for each of the three countries during the same period in which app reports were collected for that country. Approximately 95% of these reports were electronic for the Netherlands and the UK, and about 30% for Croatia. The other reports were received on paper. The questions on the paper forms and electronic forms are nearly identical so that both meet the ICH-E2B reporting standard. The reference sample was extracted from VigiBase together with the app reports and adhered to the same criteria.
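The three filtering criteria can be sketched in code. This is only an illustration under stated assumptions: the `ReportVersion` record, its field names and the order in which the criteria are applied (earliest version first, then report type and reporter) are hypothetical and do not reflect the actual VigiBase schema or extraction procedure.

```python
from dataclasses import dataclass

# Reporter categories as named in the text (ICH-E2B reporting standard).
PATIENT_QUALIFICATION = "consumer/non-health professional"
HCP_QUALIFICATIONS = {"physician", "pharmacist", "other health professional"}


@dataclass
class ReportVersion:
    """Hypothetical record layout, for illustration only."""
    report_id: str
    version: int                    # lower number = earlier version
    report_type: str                # e.g. "spontaneous" or "study"
    reporter_qualifications: tuple  # one entry per reporter


def reporter_type(qualifications):
    """Return 'patient' or 'healthcare professional' for a single
    eligible reporter, otherwise None."""
    if len(qualifications) != 1:
        return None  # criterion 2: exactly one reporter
    qualification = qualifications[0]
    if qualification == PATIENT_QUALIFICATION:
        return "patient"
    if qualification in HCP_QUALIFICATIONS:
        return "healthcare professional"
    return None


def select_reports(versions):
    """Keep the earliest version of each report (criterion 3), then keep
    spontaneous reports (criterion 1) with one eligible reporter."""
    earliest = {}
    for v in versions:
        current = earliest.get(v.report_id)
        if current is None or v.version < current.version:
            earliest[v.report_id] = v
    return [
        v for v in earliest.values()
        if v.report_type == "spontaneous"
        and reporter_type(v.reporter_qualifications) is not None
    ]
```

Applying the earliest-version rule before the other two criteria is a design choice of this sketch; the text does not state the order.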
Characteristics of Reports
vigiPoint Features in the Full Dataset
vigiPoint was used on the full dataset to compare the reports received through the app with the reference sample. vigiPoint automatically highlights key features in a user-defined case series compared with a reference sample. It uses log-odds ratios (ORs) to identify statistically significant differences between the case series and the reference sample in various pre-defined stratifications such as patient age, sex, co-reported drugs, co-reported ADRs, drug indication, type of reporter and geographical location. A key feature is defined as having log OR₀₀₅ > 0.5 or log OR₉₉₅ < −0.5, where log OR₀₀₅ and log OR₉₉₅ are the lower and upper limits of the 99% credibility interval of the log OR, respectively. Requiring that the credibility interval lies at least 0.5 above or below zero ensures that the key feature, apart from being statistically significant, also deviates enough to be considered a characteristic feature of the case series. The vigiPoint analysis was performed using all extracted reports and was done for each country separately.
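The key-feature rule stated above reduces to a simple check on the two interval limits. The following is a minimal sketch of that decision rule only; the function name and return labels are hypothetical, and the estimation of the credibility interval itself is not shown because the text does not describe it.

```python
def classify_key_feature(log_or_005, log_or_995, margin=0.5):
    """Apply the key-feature rule to the limits of the 99% credibility
    interval of the log OR.

    Returns "over-represented" if the whole interval lies above +margin,
    "under-represented" if it lies below -margin, and None otherwise.
    """
    if log_or_005 > log_or_995:
        raise ValueError("lower limit must not exceed upper limit")
    if log_or_005 > margin:
        return "over-represented"
    if log_or_995 < -margin:
        return "under-represented"
    return None
```

For example, an interval of (0.2, 1.0) excludes zero but is not flagged, because its lower limit does not exceed 0.5.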
Reports Submitted by Patients
The proportion of patient reports was compared between the app and the reference sample for each country separately, using the χ² test. For the subset of reports from patients, patient age and sex were compared between the app and the reference sample for each country separately, using the Mann–Whitney U test and the χ² test, respectively. For all of these tests, p values < 0.01 were considered statistically significant. Because the vigiPoint analysis includes both patient and healthcare professional reports, and because vigiPoint and the two traditional tests measure statistical differences in different ways, their results are not identical.
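The comparison of proportions can be illustrated with a Pearson χ² test on a 2×2 table (app versus reference, patient versus healthcare professional). This is a sketch under assumptions: the function name is hypothetical, and no continuity correction is applied because the text does not say whether one was used. With one degree of freedom, the p value equals erfc(√(χ²/2)).

```python
import math


def chi2_2x2(a, b, c, d):
    """Pearson chi-square test for the 2x2 table [[a, b], [c, d]],
    without continuity correction (an assumption of this sketch).

    Returns (statistic, p_value) for one degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be positive")
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value
```

With invented counts such as `chi2_2x2(30, 10, 10, 30)`, the resulting p value would be compared against the 0.01 threshold used in this study.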
Quality of Reports
Clinical Documentation Tool
To measure and compare clinical quality, a subsample of a maximum of 100 app reports and a corresponding number of reference reports was selected randomly for each country by the Uppsala Monitoring Centre using VigiBase. The randomisation used a fixed ratio between reports sent by healthcare professionals and reports sent by patients, based on the country-specific proportion in the app reports. App reports and reference reports were manually scored by blinded assessors, separately for each country, with the clinical documentation assessment tool ClinDoc. The assessors used the report version stored in their national database. Clinical quality was divided into three categories: well (≥ 75%), moderately (46–74%) or poorly (≤ 45%) documented reports. The proportion of reports of at least moderate clinical quality was compared between app reports and reference reports, using a χ² test. A p value < 0.01 was considered statistically significant.
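The three quality categories and the "at least moderate" proportion can be expressed as follows. This is a sketch that assumes the ClinDoc result is available as a whole-number percentage; the function names are hypothetical, and the scoring of individual ClinDoc items is not reproduced here.

```python
def clindoc_category(score_pct):
    """Map a ClinDoc percentage score to the category used in the text:
    well (>= 75), moderately (46-74) or poorly (<= 45) documented.

    Assumes whole-number percentages; fractional scores between the
    published bands fall into the lower category in this sketch.
    """
    if not 0 <= score_pct <= 100:
        raise ValueError("score must be a percentage between 0 and 100")
    if score_pct >= 75:
        return "well"
    if score_pct >= 46:
        return "moderately"
    return "poorly"


def proportion_at_least_moderate(scores):
    """Share of reports documented at least moderately."""
    if not scores:
        raise ValueError("at least one score is required")
    adequate = sum(1 for s in scores if clindoc_category(s) != "poorly")
    return adequate / len(scores)
```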
vigiGrade Completeness Score
The manually assessed reports were also subjected to an automated evaluation of quality using vigiGrade. The vigiGrade completeness score measures how complete the information on the report is, based on a selection of ICH-E2B fields sent from the national databases to VigiBase. Each report field carries a weight depending on the field’s importance in regard to causality assessment. The score ranges between zero and one, where zero represents a report with all considered fields empty, and one represents a fully complete report.
With the introduction of an additional reporting route, report duplication is a potential concern. To identify possible duplicate reports both within each sample and between the app and reference samples, an automatic algorithm called vigiMatch was used. vigiMatch uses a probabilistic hit-miss model, which scores a report pair according to the amount of matching data. If the calculated total score exceeds a pre-set threshold, the report pair is flagged as a suspected duplicate [18, 19].
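The general idea of scoring a report pair against a threshold can be sketched as below. This is not the vigiMatch implementation: the field names, the fixed per-field weights and the threshold are all hypothetical values chosen for illustration, and a real hit-miss model derives its weights from the data rather than from constants.

```python
# Hypothetical per-field contributions: (score if values match,
# score if they differ). Missing values contribute nothing.
HYPOTHETICAL_WEIGHTS = {
    "sex": (0.7, -3.0),
    "age": (2.5, -2.0),
    "drug": (4.0, -1.5),
    "adr": (3.5, -1.0),
    "onset_date": (5.0, -2.5),
}
HYPOTHETICAL_THRESHOLD = 8.0


def match_score(report_a, report_b, weights=HYPOTHETICAL_WEIGHTS):
    """Sum the field-wise contributions for a pair of reports (dicts)."""
    total = 0.0
    for field, (hit, miss) in weights.items():
        value_a, value_b = report_a.get(field), report_b.get(field)
        if value_a is None or value_b is None:
            continue  # no evidence either way when a value is missing
        total += hit if value_a == value_b else miss
    return total


def is_suspected_duplicate(report_a, report_b,
                           threshold=HYPOTHETICAL_THRESHOLD):
    """Flag the pair when its total score exceeds the threshold."""
    return match_score(report_a, report_b) > threshold
```

In this sketch, two reports that agree on many well-populated fields accumulate a high score, whereas sparse or conflicting reports stay below the threshold.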
Each report within the dataset that was flagged as a suspected duplicate was counted, regardless of whether its suspected duplicate was within the dataset used in this study. For example, if one report was flagged as a suspected duplicate, but its pair entered VigiBase before or after the dates defined for the datasets in this study (until January 2017), it was counted as one suspected duplicate. However, if its duplicate pair was within the study sample, both reports were counted, as both were flagged as suspected duplicates. Note that the countries perform some duplicate detection in their own databases before submitting reports to VigiBase.
Prospective Contribution of App Reports to Signal Detection
The contribution of app reports to signal detection can highlight the potential value of app reports. This contribution was quantified by counting how many drug–event combinations, analysed potential signals and issued signals included any app reports. The following signal detection processes were investigated: (1) the NCAs in the three countries and (2) the Uppsala Monitoring Centre.
Identifying Signals in National Competent Authorities
In the Netherlands and Croatia, case-by-case review is performed to identify signals. The Netherlands also performs disproportionality analysis as a supportive measure on the reports that are not assessed manually, mostly those submitted by pharmaceutical companies. In the UK, drug–event combinations of potential interest are flagged for assessment based on pre-defined criteria. These criteria include both statistical methods to identify drug–event combinations that are disproportionately present in the database and rule-based approaches to highlight populations or events of particular interest and particularly serious cases. Signal assessors then review the drug–event combinations. When a potential signal is identified, it is discussed at a signal detection meeting, where it is decided whether further action is required.
Identifying Signals at the Uppsala Monitoring Centre
At the Uppsala Monitoring Centre, the first step for signal detection is performed in a signal detection sprint, in which initial assessments of drug–event combinations take place with a pre-defined set of filters (usually at the report and drug–event combination level with a specific theme). After filtering, drug–event combinations are ordered using vigiRank. Drug–event combinations are assessed manually in order of decreasing vigiRank score, and a subset of drug–event combinations are taken further for an in-depth manual assessment.
Contribution of App Reported Drug–Event Combinations at the European Medicines Agency
Reports from the app are not distinguished in EudraVigilance (operated by the European Medicines Agency on behalf of the European Union medicines regulatory network) and therefore no prospective analysis of the contribution to signal detection could be performed. Instead, drug–event combinations were identified for the reports received via the app, and the Proportional Reporting Ratio was calculated based on all EudraVigilance reports to identify signals of disproportionate reporting. A distinction was made between products authorised through the single ‘centralised’ procedure for the European Union (using a European Medicines Agency internal signal detection tracking table) and products authorised through other procedures (using the European Issues Tracking Tool, EPITT). We then investigated how many of these drug–event combinations would have qualified as a validated signal, including those that were identified historically.
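The text does not spell out how the Proportional Reporting Ratio is computed. The sketch below uses the commonly cited definition based on a 2×2 table of report counts; the function and argument names are hypothetical, and the thresholds used to declare a signal of disproportionate reporting are not stated in the text and are therefore not included.

```python
def proportional_reporting_ratio(a, b, c, d):
    """PRR for one drug-event combination.

    a: reports with the drug and the event
    b: reports with the drug, without the event
    c: reports with other drugs and the event
    d: reports with other drugs, without the event

    PRR = [a / (a + b)] / [c / (c + d)]
    """
    if a + b == 0 or c + d == 0:
        raise ValueError("both drug groups need at least one report")
    if c == 0:
        raise ValueError("PRR is undefined when no comparator report has the event")
    return (a / (a + b)) / (c / (c + d))
```

For instance, if the event appears on 10% of the reports for the drug of interest and on 1% of the reports for all other drugs, the PRR is about 10.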