Key Points for Decision Makers

Among the 497 EPARs related to medicines authorised between 2017 and 2022, almost half reported any use of PROs/PROMs, and this figure was not significantly different for the 19 medicines refused over the same period.

The use of PROMs was more common for orphan drugs and in some therapeutic areas (e.g., diseases of the digestive system, diseases of the skin and subcutaneous tissue).

Patient relevance in evidence generation is key in the strategic vision launched by EMA and other regulators; harmonisation of PROMs and optimisation of PRO data collection would facilitate this goal.

1 Introduction

Health authorities and payers worldwide are increasingly recognising the importance of the patient's perspective at all stages of medicines development and regulatory decision making [1, 2]. Patient input involves patient-reported outcomes (PROs) as a quantitative source of evidence [3]. According to the Food and Drug Administration (FDA), a clinical outcome assessment (COA) is a measure that describes or reflects how a patient feels, functions, or survives. Types of COAs include PROs, observer-reported outcomes (ObsROs), clinician-reported outcomes (ClinROs) and performance outcome (PerfO) measures [4]. Among them, a PRO is a health or treatment outcome reported directly by the patient without interpretation by a clinician or another person, according to a definition initially issued by the FDA in 2009 [5]. In practice, this is an “umbrella” term that covers various concepts such as subjectively perceived health state, unobservable symptoms (e.g., pain intensity), physical, mental and social functioning, health-related quality of life (HRQoL) and other aspects of health from the patient’s perspective. A PRO can be measured by self-report or by interview, provided that the interviewer records only the patient’s response. The instruments adopted to collect PROs in a standardised manner are called patient-reported outcome measures (PROMs), which are often self-completed questionnaires, rating scales or single questions [6]. They can be either generic or disease specific. Generic instruments consider broad aspects (e.g., general quality of life [QoL], emotional distress) that fit a variety of conditions and allow comparison among them, while disease-specific instruments address symptoms characterising a specific condition and their impact on the functioning and disease-specific QoL of a given patient population [7, 8].

The advantage of using PROs in clinical practice and research is that they provide a more holistic interpretation and a comprehensive assessment of the outcomes of the intervention delivered [9, 10, 11]. In clinical practice, PROs make it possible to assess the severity of intangible symptoms (e.g., fatigue, depression, insomnia), to inform treatment strategies and track the effects of treatments over time, to guide follow-up visits and examination scheduling [6], and to improve treatment adherence and patient empowerment [12]. In clinical research, the inclusion of PROs in data collection can provide valuable evidence of the risk-benefit profile of treatment from a patient perspective, inform drug approvals and clinical guidelines, and facilitate the interpretation of trial results by target patient populations [13]. PRO-related endpoints are becoming common in clinical trials, either as primary outcomes or to complement primary outcomes (e.g., survival) [7], although the type and frequency of the instruments used vary greatly depending on the disease under consideration [14, 15]. For instance, 27% of clinical trials registered on ClinicalTrials.gov between 2007 and 2013 were identified as using a PROM [16]. A more recent analysis of 480 published and 537 registered randomised cancer trials identified PRO measures in over 50% of published trials and up to 66% of registered protocols [17]. Of 55 registration trials used to support new oncology drugs approved by the FDA during 2014–2018, 41 (75%) included PRO assessments [18]. Indeed, adoption of PRO-related endpoints in clinical trials can in turn influence regulatory decisions. A review of new drug approvals between 2006 and 2015 by the FDA showed that approximately 20% of new drugs had labelling based on PROs [19, 20]. This proportion increased to 50% (47 of 94) from 2016 to 2020 for new molecular entities targeting diseases that traditionally rely on PRO assessments as primary or secondary endpoints for the evaluation of treatment benefit [21].

The European Medicines Agency (EMA) is the agency in charge of the scientific evaluation, supervision and monitoring of the safety of medicines for human and veterinary use in Europe. Through the centralised procedure, pharmaceutical companies can apply for a marketing authorisation which, if positively recommended by EMA and granted by the European Commission, allows the product to be marketed throughout the European Union and the European Economic Area. The EMA publishes detailed information on medicines evaluated by the Committee for Medicinal Products for Human Use (CHMP) and the Committee for Medicinal Products for Veterinary Use (CVMP) as part of the centralised procedure through the European Public Assessment Reports (EPARs), in compliance with Article 13 (3) of the Regulation (EC) no. 726/2004. An EPAR is available for each medicine that has been granted or refused marketing authorisation. It is not a single document but a set of documents that describe in detail the assessments regarding individual medicines and the technical information on the product. The EPARs are freely accessible on the EMA website and consist of four sections (observations, authorisation details, product information and historical reports) [22].

The FDA and the EMA have historically used different evidentiary standards to assess PRO data [23]. Among 75 products approved by both agencies between 2006 and 2010, 35 (47%) had at least one EMA-granted PRO label claim compared with 14 (19%) by the FDA. Most FDA-granted claims focused on symptoms, whereas EMA-granted claims were more likely to include higher order concepts, such as HRQoL, functioning or fatigue. The two agencies appear to agree on the exact type of labelling less than 12% of the time across approved products, indicating that differing levels of evidence are needed to facilitate positive reviews. Importantly, in all instances in which higher order claims were granted by the FDA, they were also granted by the EMA, but the opposite is not true. Of 64 indications for cancer medicines approved between 2012 and 2016 by the EMA and the FDA, 45 (70.3%) included PRO data in documents submitted to regulatory authorities. Nevertheless, none received PRO labelling from the FDA, whilst 21 indications (46.7%) had PRO-related language in EMA labelling [24]. An updated analysis of all 70 indications of new oncology medicines authorised by EMA during 2017–2021 shows that 52 (74.3%) included PRO data for EMA review and 14 (20.0%) contained PRO-related language in the summaries of product characteristics [25].

In recent years, the FDA has developed a series of four methodological patient-focused drug development (PFDD) guidance documents intended to facilitate the collection of robust and meaningful patient and caregiver data that can better inform medical product development and regulatory decision making [26]. Guidance 4 addresses methodologies, standards, and technologies that may be used for the collection, capture, storage, and analysis of COA data and, particularly, patient perspective data (PROs), to better incorporate them into endpoints that are considered sufficiently robust for regulatory decision making and to determine clinically meaningful changes in these endpoints [27].

Over the last two decades, the EMA has acknowledged that the accurate measurement of the patient experience can complement existing measurements of safety and efficacy through progress in regulatory science related to PRO measurements. In 2005, the EMA published the "Reflection paper on the regulatory guidance for the use of health-related quality of life measures in the evaluation of medicinal products" to discuss the role that HRQoL, a specific type of PRO, can have in the medicine evaluation process. Health-related quality of life is a multidimensional concept that can be defined as the patient's subjective perception of the impact of illness and related treatments on daily life and physical, psychological and social well-being. Whilst the HRQoL assessment is optional, the agency confirmed it could be useful to inform the interpretation of the observed effect on the primary endpoint (e.g., survival) in terms of consequences for daily life and social functioning [28]. In 2014, new guidance was issued on the measurement of PROs in clinical trials as an appendix to the guidelines for the evaluation of cancer medicines. The Agency pointed out the lack of informativeness of PROs included in confirmatory clinical trials, often due to limitations in PRO measurement such as poorly defined objectives, poor validity, reliability or responsiveness of the tool, and missing data. Among the instruments validated in the oncology area, the Functional Assessment of Cancer Therapy (FACT) and the European Organisation for Research and Treatment of Cancer (EORTC) questionnaires are generally relevant and suitable to measure the consequences of the tumour or adverse drug reactions for patient well-being, especially in the context of palliative care [29].

In 2020, the EMA launched a new strategy called "Regulatory Science Strategy to 2025", with the aim of promoting patient-centred drug development, including through systematic means to incorporate PROs and patient preferences into the risk-benefit evaluation of medicines [30]. The EMA expects the use of PROs as endpoints in clinical trials to increase over time, thanks to a growing uptake of digital health solutions and interest in precision medicine. The latest strategic approach in support of patient experience data in EU medicines development and regulatory decision making endorsed by the agency may also encourage consideration of PROs in regulatory decisions; however, reviews of PRO labelling of medicines authorised by the EMA after 2016 are scarce [24]. Although PROs are one of several types of COA, they have been overlooked for decades in drug regulatory decision making compared with ClinROs and PerfOs. Therefore, the purpose of this study is to assess the consideration of PROs and PROMs for medicines with any indication, either authorised or refused by EMA between 2017 and 2022.

2 Methods

The list of EPARs was downloaded from the EMA website on 5 January 2023 [31] and all medicines for human use authorised or refused between January 2017 and December 2022 were identified. We excluded medicines for veterinary use, withdrawn medicines and medicines authorised or refused before 2017. For each compound, we analysed the corresponding EPARs, focusing particularly on Public Assessment Reports within the Assessment History section. The presence of PROs and PROMs was identified by searching the documents using a list of keywords including: "functioning/functions", "health related quality of life", "HRQoL/HRQL", "index", "instrument", "patient-reported", "patient reported", "patient-reported outcomes", "patient-reported outcome measures", "quality of life", "QoL", "questionnaire(s)", "scale", "symptom(s)", "subjective", "VAS", "wellbeing". We qualified each measure retrieved as a PRO/PROM if the type of COA was "PRO" or "composite" including PRO, as reported in the ePROVIDE database (https://eprovide.mapi-trust.org/).
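The keyword screening step can be illustrated with a short sketch. This is not the authors' procedure: the regular expressions, the function name `find_keywords` and the sample sentence are assumptions introduced here for illustration, loosely adapted from the keyword list above. A keyword hit would only flag a document for review; whether a measure qualifies as a PRO/PROM would still have to be checked against ePROVIDE.

```python
import re

# Keyword labels follow the list in the Methods; the regular expressions
# are illustrative assumptions (e.g. optional plurals and hyphens).
KEYWORD_PATTERNS = {
    "functioning/functions": r"\bfunction(?:ing|s)\b",
    "health related quality of life": r"\bhealth[- ]related quality of life\b",
    "HRQoL/HRQL": r"\bHRQo?L\b",
    "index": r"\bindex\b",
    "instrument": r"\binstruments?\b",
    "patient-reported": r"\bpatient[- ]reported\b",
    "quality of life": r"\bquality of life\b",
    "QoL": r"\bQoL\b",
    "questionnaire(s)": r"\bquestionnaires?\b",
    "scale": r"\bscales?\b",
    "symptom(s)": r"\bsymptoms?\b",
    "subjective": r"\bsubjective\b",
    "VAS": r"\bVAS\b",
    "wellbeing": r"\bwell-?being\b",
}


def find_keywords(text):
    """Return the keyword labels whose pattern occurs in the text."""
    return [
        label
        for label, pattern in KEYWORD_PATTERNS.items()
        if re.search(pattern, text, flags=re.IGNORECASE)
    ]


# Hypothetical sentence, not taken from any EPAR.
sample = "HRQoL was assessed with the EQ-5D questionnaire as a secondary endpoint."
print(find_keywords(sample))
```

Case-insensitive matching keeps the sketch simple but would also match unrelated uses of short terms such as "VAS" or "index", so manual review of each hit remains necessary.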

An ad hoc template was created in Microsoft® Excel to systematically record the use of PROs and/or PROMs for each EPAR and other relevant information, and initially tested on a sample of 20 EPARs by two reviewers independently (MM, FM). This pilot test resulted in minor changes to the extraction form, mainly for recording multiple PROMs per EPAR. More specifically, we collected data on medicine characteristics, such as trade name, active substance, authorisation or refusal year, therapeutic macro-area and specific area (classified following the International Statistical Classification of Diseases and Related Health Problems 10th Revision (ICD-10) [32]), and whether the medicine was generic, biosimilar or orphan. We considered PRO-PROM dyads, meaning that for each specific PROM, we looked for the associated underlying PRO concept. We categorised the type of PRO endpoint (i.e., primary, secondary, exploratory or other) included in registration trials and the therapeutic area of corresponding PROM as reported in ePROVIDE database (https://eprovide.mapi-trust.org/). In some cases, either the PRO or the PROM was missing; however, we concluded that the EPAR did not report any evidence on PROs/PROMs only if both were missing.

The abstracted data were analysed to describe the spread and characteristics of the PROs/PROMs used in the EPARs over time and across disease areas, and to investigate which characteristics of the medicine were associated with their consideration. The occurrence and type of PROs/PROMs were analysed through descriptive statistics (absolute frequency and percentage). The chi-squared test was applied to detect statistically significant differences between medicine groups in the use of PROs/PROMs and other variables of interest. In addition, a multivariate logistic regression model was fitted to identify medicine characteristics associated with the use of PROs/PROMs. The analysis initially used the whole sample of EPARs to estimate whether the use of PROs/PROMs was related to a medicine’s authorisation decision. Thereafter, we conducted the subsequent analyses on the sample of authorised medicines only. The threshold for statistical significance was set at p < 0.05; all statistical analyses were performed with Stata 16 (StataCorp).
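As an illustration of the chi-squared comparison, the sketch below computes the Pearson statistic for a 2×2 table using only the Python standard library. The counts are those reported in this paper for authorised medicines (240 of 497 with PROs/PROMs) and, by subtraction from the overall total of 250, for refused medicines (10 of 19). The absence of a continuity correction is an assumption, as the analysis options used in Stata are not stated.

```python
import math


def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-squared test (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, the upper-tail probability of the
    # chi-squared distribution equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Rows: authorised, refused. Columns: PROs/PROMs reported, not reported.
chi2, p = pearson_chi2_2x2(240, 497 - 240, 10, 19 - 10)
print(f"chi2 = {chi2:.3f}, p = {p:.3f}")
```

Under these assumptions the p-value is close to the 0.710 reported for the comparison between authorised and refused medicines. With only 19 refused medicines, an exact test could also be considered.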

3 Results

3.1 Use of PROs/PROMs in EPARs

Of the 1976 medicines in the EPAR list downloaded from the EMA website, we excluded veterinary medicines (n = 282), withdrawn medicines (n = 306), and medicines authorised (n = 838) or refused (n = 34) before 2017. The remaining 516 medicines, of which 497 were authorised and 19 refused between 2017 and 2022, were included in the final analyses (Fig. 1).
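The screening arithmetic can be checked in a few lines. The dictionary keys below are labels chosen here for readability; the counts are those given in the text.

```python
# Counts as reported in the screening paragraph.
total_listed = 1976
exclusions = {
    "veterinary": 282,
    "withdrawn": 306,
    "authorised before 2017": 838,
    "refused before 2017": 34,
}
authorised, refused = 497, 19

included = total_listed - sum(exclusions.values())

# The included medicines should equal authorised plus refused.
assert included == authorised + refused
print(included)
```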

Fig. 1 Flow chart of the screening process of European Public Assessment Reports (EPARs)

Overall, the use of PROs/PROMs was registered in 250 of the 516 EPARs selected (48.5%), with no statistically significant differences between authorised and refused medicines (48.3% vs 52.6%, p = 0.710).

By analysing the 497 EPARs related to the authorised medicines only, we identified statistically significant differences (p < 0.001) in the use of PROs/PROMs between generic (1.1%) and non-generic (59.2%) medicines (Table 1). These findings were confirmed in a multivariate logistic regression analysis (Table 2), where in addition we observed that biosimilar medicines were less likely (odds ratio [OR] < 1) and orphan medicines were more likely (OR > 1) to show patient-reported evidence, although the corresponding coefficients were not highly statistically significant.

Table 1 Inclusion of PROs/PROMs according to medicines’ characteristics
Table 2 Multivariate logistic regression of drug’s characteristics on the use of PROs/PROMs

The use of PROs/PROMs went from 50.6% in 2017 to 47.9% in 2022 but did not show a clear pattern over the 6-year period considered (p = 0.758). The exclusion of generics and biosimilars resulted in a higher frequency of PROs/PROMs in all years (Fig. 2) but still without significant inter-year differences (p = 0.678). Table 3 illustrates that the use of PROs/PROMs varied across therapeutic areas, being very common in skin diseases (91.7%), respiratory system diseases (71.4%) and musculoskeletal system and connective tissue diseases (69.7%), whilst relatively uncommon in infectious diseases (15.2%) and cardiovascular diseases (21.4%).

Fig. 2 Trend in the use of PROs/PROMs over time. PROs patient-reported outcomes, PROMs patient-reported outcome measures

Table 3 Use of PROs/PROMs in EPARs by therapeutic area

3.2 Characteristics of PROs and PROMs

A total of 816 dyads of PROs/PROMs were identified from the analysis of 497 EPARs of authorised medicines; for each, we reported the concept measured by the collected PRO and the type of PROM used (i.e., generic or specific). As shown in Fig. 3, the concept most frequently covered by the identified PROs was QoL/HRQoL in general (n = 265, 32.5%), followed by disease-specific QoL/HRQoL (n = 47, 5.8%) and pain (n = 41, 5.0%). In 229 dyads (28.1%) the PRO concept was not specified, and the report only cited the instrument considered (i.e., the PROM).

Fig. 3 Underlying PRO concepts across the 816 dyads of PROs/PROMs identified. PROs patient-reported outcomes, PROMs patient-reported outcome measures

Among the 240 EPARs reporting the use of PROs/PROMs, over half (128, 53.3%) indicated PROs as secondary endpoints in pivotal trials, and 45 (18.8%) as exploratory endpoints. In 25 EPARs (10.4%), PROs were a mix of primary and secondary endpoints (Fig. 4).

Fig. 4 Type of PRO endpoints in EPARs reporting any use of PROs/PROMs (n = 240). EPARs European Public Assessment Reports, PROs patient-reported outcomes, PROMs patient-reported outcome measures

Table 4 shows the most frequently reported PROMs in EPARs. Of the 816 total PROMs identified (including repeated measures), 227 (27.8%) targeted the general population, 55 (6.7%) were generic for neoplasm, 516 (63.2%) were disease-specific and the remaining 18 (2.2%) were not specified. The most frequently used instruments were EQ-5D (n = 90, 11.0%), followed by SF-36/SF-12 (n = 48, 5.9%) and EORTC QLQ-C30 (n = 46, 5.6%). Of these 816, 74 (9.0%) were composite COAs (e.g., PRO, ClinRO and biomarker). After excluding repeated and unspecified measures, we identified 304 unique PROMs, of which 25 (8.2%) were generic, 8 (2.6%) generic for neoplasm and 271 (89.1%) disease-specific. Of these 304, 17 (5.6%) were composite COAs. The mean number of PROMs per EPAR was 1.6 (range 0–14), and 3.4 (range 1–14) when considering only those with PROs/PROMs.
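The descriptive figures in this paragraph follow from simple frequency calculations, sketched below. The dictionary keys are labels chosen here; the counts and denominators (497 authorised EPARs, 240 of them with PROs/PROMs) are taken from the text.

```python
# PROM counts by target, including repeated measures.
prom_counts = {
    "general population": 227,
    "generic for neoplasm": 55,
    "disease-specific": 516,
    "not specified": 18,
}
total_proms = sum(prom_counts.values())

# Percentage share of each category, rounded to one decimal place.
shares = {k: round(100 * v / total_proms, 1) for k, v in prom_counts.items()}

# Mean PROMs per EPAR: all authorised EPARs, then only those with PROs/PROMs.
mean_all = round(total_proms / 497, 1)
mean_with_pro = round(total_proms / 240, 1)

print(total_proms, shares, mean_all, mean_with_pro)
```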

Table 4 Frequency and therapeutic area of different PROMs considered in EPARs

4 Discussion

This work aimed to investigate the use of PROs/PROMs in the context of medicine authorisation at the European level, through a review of the EPARs published by EMA. No significant differences were detected in the use of patient-reported evidence between authorised (n = 497) and non-authorised medicines (n = 19) in the period 2017–2022. Of the 497 EPARs concerning authorised medicines, almost half (240, 48.3%) had at least one PRO and/or PROM reported. This proportion increased to 61.8% when generics and biosimilars were excluded. Because generics and biosimilars undergo different, abbreviated procedures for marketing authorisation, only one generic medicine (of 93) reported the use of PROs/PROMs.

In total, 816 PRO/PROM dyads were identified, although in 28.1% of cases the PRO was not specified. The presence of PROs/PROMs was more common in some therapeutic areas (e.g., skin) than in others (e.g., infectious diseases). In this regard, a recent review of PRO labelling of new molecular entities approved by the FDA in 2016–2020 introduced the distinction between PRO-dependent and non-PRO-dependent diseases, the latter including diseases that traditionally rely on survival, biomarkers, or clinical outcome assessments other than PROs for the evaluation of treatment benefit [21]. Our results support this distinction, with the notable exception of neoplasms (i.e., non-PRO-dependent diseases according to Gnanasakthy et al. [21]), where more than half (51%) of the EPARs (2017–2021) reported any use of PROMs, and up to 79% when excluding generics and biosimilars [33]. Indeed, other authors considered PROs as an essential component of oncology drug development and regulatory review [34].

Our analysis revealed that orphan status was significantly associated with consideration of PROs/PROMs in EPARs (67.1% vs 43.8%). In a previous study focussing on labels for orphan drugs approved by the FDA between 2002 and 2017 [35], only 8.3% had PRO-based labelling, a frequency that is even lower than that reported for approvals in disease areas not traditionally relying on PROs for evaluation of treatment benefit. However, the two regulatory agencies are known to rely on different evidentiary standards to assess PRO data [24, 36].

In over two-thirds of EPARs, the consideration of PROs referred to underlying secondary or exploratory endpoints in clinical trials, in line with EMA recommendations about classification of PRO data in the clinical trial outcome hierarchy and other literature in the field [28, 29]. Whilst recommending the inclusion of symptoms and the evaluation of HRQoL in clinical trials and recognising the added value of PROs in complementing the efficacy and safety data normally collected, EMA considered the measurement of PROs as optional and advisable provided that these results are significant and that their measurement does not cause discomfort for the patient. However, PRO- and PROM-related considerations sometimes lack details or generally refer to other/health outcomes endpoints in clinical trials. The concept most frequently captured by PROs was QoL/HRQoL in general; in a few cases the EPAR reported more specific measures such as tumour symptoms, fatigue and disease activity. It has been previously reported that EMA is more likely than other agencies to accept the broad multidimensional concept of HRQoL when assessing PRO data in oncology submissions [24].

Among the PROMs considered, about one-fourth were generic and the remainder disease-specific. The three most commonly adopted questionnaires were EQ-5D, SF-36 and EORTC QLQ-C30. The first two are widely used generic HRQoL instruments, which are applicable to different populations (including healthy individuals), while the third is a cancer-specific questionnaire. Among the generic PROMs, the third most used instrument, after the EQ-5D and the SF-36/SF-12, was the PedsQL, which is a questionnaire for measuring QoL in children and adolescents. In total, 304 unique PROMs were identified in the review of the EPARs, indicating a high heterogeneity of the measures used for regulatory purposes, which limits the comparability of PRO results even within the same condition. Take atopic dermatitis as an example: across the two EPARs in this indication, both of which considered patient-reported evidence, 22 PROMs were assessed in total, or 18 excluding repeated instruments, meaning that only three PROMs were assessed in both EPARs.

The study results showed that, despite EMA having discussed and recommended the use of PROs/PROMs for medicine evaluation since 2005, the consideration of patient-reported evidence is still limited for the purposes of marketing authorisation for medicines at the European level, with less than half of the EPARs reporting any PROs/PROMs data in the last six years. The same conclusions were drawn in recent analyses focusing on cancer medicines authorised by EMA, where authors pointed out that PRO implementation remains challenging, despite growing recognition of their added value [37, 38]. Moreover, the EQ-5D, which is the instrument most frequently retrieved in our analysis, might have been included in some clinical trials only for assessing drug’s value-for-money and guiding subsequent reimbursement decisions. Therefore, the actual consideration of PROs/PROMs to support medicine authorisation is likely to be lower than that reported in this study.

Our study has limitations. First, we relied on published documentation included in the EPAR package for each medicine, without tracing back the inclusion of PROMs in the clinical trials supporting each marketing authorisation submission, or assessing the quality of these data. Second, we did not consider effect sizes or their interpretation according to minimal clinically important differences or published value frameworks, such as the American Society of Clinical Oncology Value Framework or the European Society of Medical Oncology Magnitude of Clinical Benefit Scale [39]. Third, we focussed only on PROs, disregarding the other types of COAs, which could be the subject of future research.

In March 2020, EMA launched its strategy for 2025 with the purpose, among others, of enhancing the use of patient-reported data to support decision making through a variety of actions, from guideline development on patient data collection to patient-engagement methodology and coordination on PRO and HRQoL use [30]. Similarly, PFDD guidance documents recently issued by the FDA are a key reference to guide the stakeholder community on how to collect, analyse and submit patient experience data, and other relevant information from patients and caregivers, in clinical research and the drug approval process, both in the USA and at the global level [26]. Future research should continue to monitor the use of patient-experience data in EU (and extra-EU) medicine development and regulatory decision making, with the ultimate objective of overcoming the challenges that have for many years hampered a thorough inclusion of PROs in the benefit-risk assessments of new molecular entities. For example, a simplification of the available questionnaires, with the inclusion of a few targeted items on the patient-reported concepts to be measured, could facilitate the use of PROMs in clinical trials by reducing the time and burden of completion for patients. Moreover, the increasing availability of sophisticated health care technologies and doctor-patient communication tools (e.g., apps, sensors, wearable devices) can foster the collection of PRO data, which in this form are referred to as electronic PROs (ePROs).

5 Conclusion

Despite a growing interest in patient-reported data in the generation of scientific evidence to support medicine development, the actual role of PROs and related measures is still uncertain and heterogeneous. The strategic vision to strengthen patient relevance in evidence generation, launched by EMA in 2020, requires the promotion of the measurement of PROs in the evaluation of medicines for the purposes of marketing authorisation. This work revealed the need for improving the selection of PROs and their measures for consideration in regulatory decision making, and for developing a consensus on the most suitable instruments for measuring outcomes in each therapeutic area.