Background and introduction

Patient self-assessments have been used in various situations as a tool to understand patients’ health conditions (e.g., pain [1], fatigue [2], anxiety [3]). Numerous measures (questionnaires) [4, 5] and guidelines or guidance [6,7,8] have been developed and published. The term patient-reported outcome (PRO) was initially defined as the outcome of clinical trials that tested the efficacy and safety of pharmaceuticals [8, 9] but is now widely used in clinical practice [7, 10, 11].

The US Food and Drug Administration (FDA) published guidance for the use of PROs in clinical trials in 2009 [12] and 2014 [13], followed by the Patient-Focused Drug Development Guidance Series [14] around 2020. The European Medicines Agency (EMA) published the PRO guideline for the evaluation of anticancer drugs [15] in 2016 and the International Council for Harmonisation of Technical Requirements for Pharmaceuticals for Human Use (ICH) finalized the Guidance E8 (R1) [16] in 2021.

The US guidance documents adopted the phrase “clinical outcome assessment (COA)”, which is defined as a superordinate concept of PROs and non-PROs, such as clinician-reported outcomes (ClinRO) [13, 17]. However, the published EMA guideline [15] and the ICH guidance E8 (R1) [16] do not include COA or ClinRO. PROs measured in clinical trials have been consolidated in systematic reviews and clinical practice guidelines to facilitate clinical decision-making. However, in the guideline for systematic reviews of PRO reports [10, 18], the term clinical outcome set (COS) is used whereas the term COA is not. These differences in terminology across documents make it difficult for novices to understand their content. (Henceforth, documents titled “guideline”, “guidance”, or similar regarding PROs are referred to as “guidance” regardless of the original title.)

PROs measured in clinical trials are also applied in health technology assessment (HTA) and reimbursement decisions [7, 10, 11]. However, the difference between preference-based measures (PBM) [19], the source of quality-adjusted life years in HTA, and PRO in a narrow sense is either not clearly stated in the guidance [15] or is expressed with differing terms (patient preference ratings, utility measures, or PBM) [12, 15, 20], which can lead to confusion.

In clinical practice, PRO assessment has been recognized as a tool for understanding patients’ health conditions and is expected to promote patient-centered care [21]. The International Society for Quality of Life Research (ISOQOL) has compiled clinical practice reports into best practices for PRO assessment and published them as guidance. These include PRO assessment in clinical practice, which improves patient-clinician communication and is used for clinical decision-making [20, 22].

Electronic PRO evaluations, collectively called electronic PRO (ePRO), are now widely used in clinical trials [12, 15, 16] and in clinical practice [20], making PRO more accessible.

The expanding use of PROs may cause challenges due to variations in terminology among PRO guidance, differences in PRO scope, and varying expectations (e.g., mere outcomes or more). These discrepancies can pose difficulties for novices seeking PRO guidance in academia, industry, clinical practice, and regulatory and reimbursement decision-making, particularly in selecting appropriate guidance and understanding its content.

This study comprehensively collected and organized the guidance for PRO evaluation from clinical trials to clinical practice to assist PRO novices in selecting and understanding the guidance.

Method

A scoping literature review was conducted using a search strategy and a set of eligibility criteria to examine the type, target, and purpose of PRO guidance. Following the literature search, experts were consulted directly about the collected guidance to ensure that it was comprehensive. The process followed the Preferred Reporting Items for Systematic reviews and Meta-Analyses extension for Scoping Reviews (PRISMA-ScR) [23].

Eligibility criteria

First, the documents should be guidance, guidelines, guidebooks, task force reports, recommendations, declarations, or similar documents related to patient-reported outcomes (PROs), quality of life (QOL), health-related quality of life (HRQL or HRQoL), or health state utilities. Second, the guidance should be intended for clinical practice, clinical studies, clinical trials, psychometrics, validation, translation, item response theory, differential item functioning, clinical interpretation, minimum important difference (MID), minimal clinically important difference (MCID), meaningful change, analysis, missing data, ePRO, monitoring, ethics, labeling claims, or health technology assessment (HTA) (for a taxonomy of the above terms, please see Additional file 1). A literature search was anticipated to yield disease-specific, region/country-specific, or race-specific guidance. However, this study did not include these to ensure the generalizability of the search results. As an exception, oncology- or rheumatology-related PRO guidance was included in this study because these fields have a long history of PRO evaluation and the content is applicable to other diseases.

Data sources and search strategies

We developed a comprehensive search strategy for academic articles and books in collaboration with an information specialist (KS). Search terms were determined by TK from items addressed in the guidance for clinical trials or studies and clinical practice [12, 14, 20] and books on PRO and QOL [19, 24,25,26] and were discussed with MN and KS. Because we anticipated that documents in various formats would be reviewed in electronic or printed form, such as unique monographs or reports, articles in academic journals, and a chapter or series of chapters in a book, we performed a comprehensive search that included databases that did not focus exclusively on academic publications.

We searched MEDLINE and Embase for academic articles published after 2009, when the FDA PRO guidance was published. We searched Google Books, WorldCat, and the National Library of Medicine (NLM) Bookshelf for books published since the year after the EMA guidance was published in 2016 to reflect updated information in this area. Searches were conducted for MEDLINE and Embase on October 28, 2020, and September 14, 2023; for WorldCat and the NLM Bookshelf on October 22, 2020; and for Google Books on October 25, 2020. WorldCat, the NLM Bookshelf, and Google Books were also searched on September 25, 2023 (Additional file 1).

After the systematic search, we emailed members of the ISOQOL Japan Special Interest Group (SY, TY, KT, and MT) and asked them to examine the reference lists of the collected studies and determine whether other important PRO guidance had been missed. The resulting candidate guidance was added to the selection process as subsequent documents from other sources.

Guidance selection

Academic articles were reviewed by three research team members (SK, NM, and KT), and books were reviewed by three (NM, HE, and KT). During the review process, we removed duplicate articles or book information, and the first reviewer screened all citations (title and abstract for articles, and title and table of contents for books) to confirm eligibility for this review. Guidance on technical details (overly narrow in scope) and health system assessment guidance using PRO as one of the datasets (vast in scope) were excluded from this study. A second reviewer screened the citations independently and both reviewers discussed the screening results. If the two reviewers disagreed on the selected article or book, a third reviewer (NM) was involved in the discussion to reach a consensus. All the reviewed articles and books were scrutinized using the same criteria.

Summary of review results

The collected PRO guidance was categorized by four co-authors (SK, NM, EH, and KT) as follows: adoption of PRO measures, design and reporting of trials or studies using PROs, implementation of PRO evaluation, analysis and interpretation of PROs, and application of PROs. Rather than examining detailed differences in the collected guidance, we focused solely on integrating the information and promoting novices’ understanding.

Results

Study selection

A total of 1,502 articles were identified in the PRO guidance search and 20 additional pieces of information were obtained from experts. After removing the duplicates, 1,522 titles and abstracts were reviewed and refined to 88. After a full-text review, 51 articles met the inclusion criteria. The PRISMA flowchart in Fig. 1a illustrates the process of selecting article information. A total of 581 books were identified and 387 titles and abstracts were selected after duplicates were removed. The full texts of 37 books were reviewed, and six met the inclusion criteria. The PRISMA flowchart in Fig. 1b illustrates the book selection process. The reviewers also re-evaluated whether the articles and books had been selected from the same perspective. Ultimately, information from 33 articles and one book was incorporated into this study.

Fig. 1

a Review of article information, b Review of book information

Overview of guidance

Since the publication of the FDA PRO guidance in 2009 [12], the number of guidance documents issued has gradually increased (see Fig. 2, Years of publication). A total of 10 PRO guidance documents were published from 2009 to 2016, whereas 23 were published in 2017 and beyond, that is, from the year after the EMA PRO guidance [15] was issued. Table 1 provides an overview of the articles and books included in this study. The final selected guidance designations were guideline (n = 9) [15, 18, 27,28,29,30,31,32,33], recommendation (n = 8) [34,35,36,37,38,39,40,41], review (n = 4) [42,43,44,45], guide [46,47,48], handbook [5, 49, 50], guidance [14, 15, 51] (all n = 3), task force report [52, 53] (n = 2), and checklist [54] and reflection paper [55] (each n = 1). Regarding guidance specific to PRO evaluation, three were for drug efficacy or safety [14, 15, 51], 11 documents were related to the adoption of PRO measures [5, 14, 15, 30, 32, 34, 35, 38, 45, 49, 55], four were related to the design and reporting of trials/studies [14, 15, 29, 31], seven were related to implementation during PRO evaluation including ePRO and electronic health records [36, 37, 41, 44, 46, 52, 56], and six were related to the analysis and interpretation of PROs [27, 28, 39, 40, 42, 43]. The guidance for the application of PROs covered systematic reviews [18, 50], HTA [33, 53], and clinical practice applications [46,47,48, 51, 54].

Fig. 2

Years of publication

Table 1 Overview of the articles and books

Summary of review results

The collected PRO guidance was categorized into five groups. Figure 3 shows the major categories of guidance. These categories and an outline of guidance are described in detail below.

Fig. 3

Mapping of guidance for patient-reported outcome from a usage perspective

Adoption of patient-reported outcome measures

Qualitative research and patient-reported outcome measure development

Identifying outcomes that are important to patients is essential for PRO evaluation [5, 12, 14, 15, 49]. Qualitative research on patient experience has been used for conceptual framework, item development, and content validation in the development of PRO measures [5, 12, 14, 49] (for qualitative research in translation [57], ePRO [36], and MCID [14], see the literature in the respective sections). Interpreting the results of qualitative research requires the support of experts [5], whose cooperation in implementation is essential.

Copyright issues and translation

Most PRO questionnaires have been developed and owned by third parties. Therefore, it is essential to ask the questionnaire owner whether translation is possible and obtain licensing and author consent [55]. General guidance for translating PRO questionnaires [57] is also referenced in the guidance of the FDA [14] and EMA [15]. In a multinational clinical trial, there are considerations for its use even when the same questionnaire is used [34]. These translational considerations have also been applied to non-PROs [38].

Selection of patient-reported outcome measure

The measurement properties of the PRO measure were established by the COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) initiative [32]. These are reflected in the following minimum requirements for the selection of PRO measures in clinical trials or studies [35]: 1) conceptual and measurement models, 2) evidence of reliability, 3) content validity, 4) construct validity, 5) responsiveness, 6) score interpretability (see Clinically meaningful differences section), 7) quality of translation, and 8) acceptable burden on patients and investigators. Crossnohere et al. [45] chose these requirements [35] in their review of PRO selection guidance [12, 14, 15, 30].

In clinical practice, the intentions of stakeholders (e.g., clinicians and patients) in identifying outcomes, which are the premise for selecting PRO measures, often diverge [20]. Therefore, the selection of PRO measures necessitates 1) use of existing guidelines and conceptual models, 2) consideration of measurement properties, 3) measurement ease of use, and 4) engagement of clinicians, patients, and other stakeholders to reach a consensus [47, 48, 58].

Design and reporting of evaluations using patient-reported outcomes

The endpoints to be assessed by the PROs for clinical trials or studies (e.g., efficacy or safety) should be defined in advance [12, 14, 15], and responder definitions are recommended based on the interpretability of scores (see Clinically meaningful differences section for details) [12, 14, 15]. Reporting [29] and trial protocol [31] standards for clinical trials using PROs (extensions of Consolidated Standards of Reporting Trials (CONSORT) and Standard Protocol Items: Recommendations for Interventional Trials (SPIRIT), see Additional file 2) are also recommended in the regulatory guidance [14, 15].

The purpose of PRO assessment in clinical practice can vary considerably even when this review excludes health system evaluations. Hence, the ISOQOL series of guides [20, 22, 47, 58] emphasizes the need to set goals for PRO assessment, recognize the available resources for conducting the assessment, and strategize how to discuss PRO assessment, specifying when, where, how, and with whom the results will be reported and discussed with patients.

Implementation of patient-reported outcomes evaluation

ePRO

Byrom and Muehlhausen [56] summarized essential elements of ePRO, including ePRO design, validity considerations in transitioning from paper [36], language processing, ePRO system validation when conducting evaluations [52], user training [37], and “Bring Your Own Device”. The latest ePRO-related information, including the Clinical Data Interchange Standards Consortium (CDISC) standard compliance [41], can be found on the website of the Critical Path Institute’s Electronic Clinical Outcome Assessment (eCOA) Consortium [59].

Patient-reported outcomes assessment in routine clinical practice

The essence of general PRO assessment in clinical practice is summarized in the ISOQOL series of guides [20, 22, 47, 58] and has been adopted in other practice guides [48]. The ISOQOL companion guide [47, 58] addresses issues identified by Ivatury et al. in oncology [44] regarding scale selection, delivery methods, frequency of assessment, and costs and resources in systematic assessment, including ways to address the challenges identified in PRO assessment. In their guidance, Snyder et al. [46] summarize the strategy, training, evaluation, and administrative, ethical, and legal considerations for integrating PROs into electronic health records.

Analysis and interpretation of patient-reported outcome evaluation

Statistical methods

The Setting International Standards in Analyzing Patient-Reported Outcomes and Quality of Life Endpoints Data (SISAQOL) Consortium recommendations [39] use cancer clinical trials as examples to categorize the remaining challenges of planning and reporting trials or studies using PROs. These challenges include fit-for-purpose statistical methods, definitions, and management of missing data.

Clinically meaningful differences

According to regulatory PRO guidance [11, 14, 15], a statistical significance test alone is not sufficient for a reasonable definition of “response” and “worsening” for an individual patient (responder definition in the Design and reporting of evaluations using patient-reported outcomes section). The amount of change or difference obtained must be judged against the MID [28], the MCID [28], or a meaningful score difference [14]. The MCID can be used for between-group, within-group, or within-patient changes, and the intended use must be clarified [28]. Designing clinical trials or studies with a measure whose MCID is known facilitates the interpretation of results [15]. Two methods are used to estimate the MCID, one based on anchors and the other based on distributions [11, 14, 28]. Cocks et al. [27] provide guidance on sample size calculation and score interpretation in cases where the PRO measure is used for patients with cancer.
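The two estimation approaches named above can be illustrated with a minimal Python sketch. It is not taken from any of the cited guidance: the function names, the example scores, the anchor wording (“a little better”), and the use of 0.5 SD and the standard error of measurement (SEM) as distribution-based thresholds are assumptions chosen for illustration, although both are commonly described conventions.

```python
from math import sqrt
from statistics import mean, stdev


def distribution_based_thresholds(baseline_scores, reliability, sd_fraction=0.5):
    """Distribution-based candidates: a fraction of the baseline SD and the SEM.

    `reliability` is an assumed reliability coefficient between 0 and 1.
    """
    if not 0 <= reliability <= 1:
        raise ValueError("reliability must lie between 0 and 1")
    sd = stdev(baseline_scores)  # sample standard deviation
    return {
        "fraction_of_sd": sd_fraction * sd,
        "sem": sd * sqrt(1 - reliability),
    }


def anchor_based_threshold(changes, anchor_ratings, target="a little better"):
    """Anchor-based candidate: mean score change among patients whose
    global rating (the anchor) equals the target category."""
    selected = [c for c, a in zip(changes, anchor_ratings) if a == target]
    if not selected:
        raise ValueError("no patient falls in the target anchor category")
    return mean(selected)


def classify_responders(changes, threshold):
    """Apply a responder definition: within-patient change >= threshold."""
    return [c >= threshold for c in changes]


# Hypothetical data for five patients (higher score = better health).
baseline = [40, 50, 60, 70, 80]
changes = [8, 12, 0, 10, -2]
anchors = ["a little better", "a little better", "no change",
           "a little better", "no change"]

print(distribution_based_thresholds(baseline, reliability=0.84))
threshold = anchor_based_threshold(changes, anchors)
print(threshold)                                # 10
print(classify_responders(changes, threshold))  # [False, True, False, True, False]
```

In practice, anchor-based and distribution-based estimates are usually compared rather than used in isolation, and the choice of anchor and threshold should follow the guidance cited above [11, 14, 28].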

Response shift

Response shifts are unintended deviations in PRO measurement results. Sajobi et al. [43] reported that statistical methods for detecting response shifts are moving from then-test methods to structural equation modeling, whereas Verdam et al. [40] conducted modeling to identify response shifts and summarized their interpretation (detection of response shifts and assessment of true changes).

Application of patient-reported outcome

Systematic review and patient-reported outcomes

The COSMIN initiative promotes high-quality PRO measurement and assessment with guidance for systematic reviews [18] and bias assessment [60]. The Cochrane Handbook for Systematic Reviews of Interventions [50] considers evidence synthesis.

PROs in health technology assessment

The EUnetHTA, a network of HTA organizations in Europe, has published guidance for outcomes that include PROs and non-PROs in the context of HTA [33]. However, many clinical trials or studies using PROs do not include PBM to calculate the utility required for HTA and lack relevant preference-based scoring systems. Mapping can be used to fill these gaps in evidence. Reporting [61] and methodological [53] guidance is provided for this procedure.
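As a rough illustration of what mapping involves, the following Python sketch fits a simple linear model that predicts a utility value from a non-preference-based PRO score, using a hypothetical estimation sample in which both were observed. The data, function names, and the choice of ordinary least squares with one predictor are assumptions made for illustration only; actual mapping studies typically use richer models and should follow the reporting [61] and methodological [53] guidance.

```python
from statistics import mean


def fit_mapping(pro_scores, utilities):
    """Ordinary least squares with one predictor: utility = intercept + slope * score."""
    x_bar, y_bar = mean(pro_scores), mean(utilities)
    sxx = sum((x - x_bar) ** 2 for x in pro_scores)
    if sxx == 0:
        raise ValueError("PRO scores must not all be identical")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(pro_scores, utilities))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict_utility(score, slope, intercept, upper=1.0):
    """Predict a utility for a new PRO score, capped at full health (1.0)."""
    return min(intercept + slope * score, upper)


# Hypothetical estimation sample with both a PRO score and a utility per patient.
pro_scores = [20, 40, 60, 80]
utilities = [0.3, 0.5, 0.7, 0.9]

slope, intercept = fit_mapping(pro_scores, utilities)

# Apply the fitted mapping to PRO scores from a trial that collected no PBM.
for score in (50, 100):
    print(score, round(predict_utility(score, slope, intercept), 3))
```

The cap at 1.0 reflects the usual convention that utility cannot exceed full health; a lower bound is left out because the minimum depends on the value set of the target PBM.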

Patient-reported outcomes in clinical practice

Patient-reported outcomes for screening and monitoring

The ISOQOL series of guides [20, 22, 47, 58] lists the best practices that can be used for any purpose, including screening, monitoring, and assessing effectiveness and safety of intervention. This has been incorporated into the Patient-Reported Outcomes Tools, Engaging Users and Stakeholders (PROTEUS) guidance for clinical use [48]. Banerjee et al. [51] proposed a framework for drug safety data collection in pharmaceuticals.

Patient-reported outcomes in communication

The significance of PRO assessment (how and why the data are used for treatment) needs to be clearly communicated to improve patient-clinician communication in clinical practice [48]. As described in Patient-reported outcomes for screening and monitoring section, the series of ISOQOL guidance [20, 22, 47, 58] provides best practices for this purpose.

Patient-reported outcomes for clinical decision-making

PROs measured in clinical trials can be used for third-party clinical decision making when published as reports. Wu et al. [54] discussed using PRO assessment reports in clinical practice. PRO assessment in clinical practice has created a basis for decision-making by providing patient feedback on the PRO assessment results [20, 22, 47, 58].

Discussion

Previous exhaustive PRO guidance has been organized regarding PROs for approval, reimbursement, and policy [62]; PROs in clinical trials/studies and clinical practice [48]; and PRO measure utilization [63]. This scoping review collected all guidance except for health system evaluations and organized them into the five sections presented in the results. During this organization, we recognized the need to note the “place” and “purpose” for which guidance is used when choosing and understanding guidance for novice users. The specific sections of this review that should be referred to choose and understand the guidance are identified below.

In clinical trials or studies, PROs are expected to serve as outcomes of the trial or study. However, PROs in clinical practice may be expected to serve as communication tools, as indicated in the Patient-reported outcomes in communication section, rather than simply as outcomes.

PROs are used as a measure of health in drug approval and PBM, an indicator of health value, is used in HTA, as described in PROs in health technology assessment section. However, it should be noted that in some countries (e.g., the United Kingdom), PBM may also be referred to as PROs (i.e., the scope of PROs varies).

The terminology associated with PROs varies according to regional and national clinical trial guidance, as noted in the background, and by disease area and application (e.g., systematic reviews). Therefore, when reading the selected guidance, it is advisable first to review the definitions of PROs and their related terms (Additional file 3 provides examples of synonyms that may be difficult to understand using only a single guidance).

This study has some limitations. The first limitation is the keyword selection for the titles of the guidance, which was based on existing guidance and books. However, the titles of the collected guidance were sometimes described as checklists or handbooks. Adding these terms to the keywords might have made it more efficient to obtain the desired guidance. The second limitation is that the databases used to retrieve article information specialized in the medical sciences, and PsycINFO, a psychology database, was not searched. Therefore, guidance for qualitative research (e.g., COREQ: Consolidated criteria for reporting qualitative research [64] and CIRF: Cognitive Interviewing Reporting Framework [65]) was not included in this review. Although a previous study [35] used psychological databases, consultations with experts yielded more relevant information than database searches. We believe that the comprehensiveness of the present review was ensured by consulting ISOQOL Japan Special Interest Group members. The third limitation is that disease-specific guidance was excluded from the collection. However, a 2013 review by the SPIRIT-PRO group of guidance documents from 1989 to 2013 focused chiefly on HRQL or PRO assessments in cancer clinical trials, and 21,175 reports were screened after removing duplicates [6]. The inclusion of disease-specific guidance may unnecessarily expand the scope of this review. This study prioritized the feasibility of a comprehensive strategy spanning both scholarly articles and book information. The fourth limitation is the lack of comparison between the series of FDA guidance and other guidance regarding the definition of COA. For example, the FDA’s COA includes patient preference information for medical devices [66]. Although Hollin et al. [67] cited PRO guidance and recommended the validity of preference evidence from qualitative studies, PROs differ from patient preferences, which may confuse novices. Patient preference information was outside the scope of this study, and that article [67] was ultimately excluded. In the future, collecting and organizing guidance for patient preference information may be necessary.

Conclusions and implications

In this scoping review, existing PRO guidance was categorized into adopting PRO measures, designing and reporting trials or studies using PROs, implementing PRO evaluation, analyzing and interpreting PROs, and applying PRO evaluation. Based on this categorization, we suggest the following for novices: when selecting guidance, novices should clarify the “place” and “purpose” for which the guidance will be used. Additionally, they should know that the terminology related to PROs and the scope and expectations of PROs vary by “place” and “purpose”.