Introduction

To determine whether a patient has achieved a meaningful outcome, it is insufficient to evaluate treatment results solely on medical history, physical findings, laboratory tests, or imaging findings [1]. Patient-reported outcome (PRO) measures are useful tools for quantifying and communicating a patient’s health status to healthcare providers in a way that directly incorporates the patient’s voice. Change in PROs can be one of the measures of “success” from a patient’s perspective after an orthopedic procedure [2]. PROs are increasingly being used as part of the clinical encounter to guide treatment decisions and determine the effectiveness of interventions [3], but their implementation and the selection of appropriate measures have presented challenges.

In orthopedic practice and research, a great number and variety of PRO measures are available. As a result, there is confusion among orthopedic providers about which PRO measure is most appropriate for a given patient population and how to interpret a patient’s score to enhance treatment recommendations. In response, orthopedics has seen a recent increase in the adoption of a universally accepted set of PRO measures: the Patient-Reported Outcomes Measurement Information System® (PROMIS®). PROMIS has been compared against conventional general health and disease-specific PRO measures and has regularly been found to improve coverage of the relevant health domain, increase reliability, and reduce respondent burden [4].

PROMIS measures were developed with support from the National Institutes of Health (NIH) in an effort to address the need for more valid, reliable, and generalizable measures of clinical outcomes that are important to patients [5]. PROMIS is a set of psychometrically sound measures that assess a patient’s physical, mental, and social health across multiple conditions or diseases, including orthopedic conditions. PROMIS measures overcome the limitations of traditional PRO measures used in orthopedic research and practice by scoring all PROMIS domains on a common metric, a T-score that is normalized to the U.S. general population. PROMIS provides access to both fixed-length measures (e.g., a 6-item measure of fatigue) and computerized adaptive testing (CAT), which tailors the measure to each individual to allow for efficient assessment when response burden is of concern [6].
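The common T-score metric can be illustrated with a short sketch. It assumes the conventional T-score scaling, in which the reference population has a mean of 50 and a standard deviation of 10; the text above states only that scores are normalized to the U.S. general population. The function names and the standardized input score are hypothetical, and actual PROMIS scoring estimates that standardized score from item responses, which is not shown here.

```python
def t_score(z: float, mean: float = 50.0, sd: float = 10.0) -> float:
    """Convert a standardized score (z) to the T-score metric.

    Assumes the conventional T-score scaling: reference-population
    mean of 50 and standard deviation of 10.
    """
    return mean + sd * z


def sd_units_from_reference(t: float, mean: float = 50.0, sd: float = 10.0) -> float:
    """Express a T-score as standard deviations from the reference mean."""
    return (t - mean) / sd


if __name__ == "__main__":
    # A hypothetical patient one SD above the reference mean on a domain.
    print(t_score(1.0))
    # A hypothetical T-score of 35 expressed in SD units.
    print(sd_units_from_reference(35.0))
```

Because every domain shares this scale, a score on one domain can be read the same way as a score on another: as a distance from the reference population mean.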

In recent years, a growing number of studies have reported the association of PROMIS measures with traditional measures and have demonstrated the reliability and performance of PROMIS measures in orthopedic populations. While there have been a few systematic reviews about the use of PROMIS measures in certain disciplines within orthopedics [7,8,9,10], these reviews describe neither how the measures have been reported in the literature nor the general uptake of PROMIS measures within orthopedic research and practice. Thus, we sought to evaluate the adoption of PROMIS measures in orthopedics by describing how the measures are used and reported, including the PROMIS domains evaluated, the type of PROMIS instrument used, and other traditional measures that were reported along with PROMIS measures.

Methods

Review design

The protocol for this systematic review was designed in accordance with the PRISMA guidelines [11] and is registered with the PROSPERO database (CRD42018088260) [12]. We collaborated with a research librarian (LL) to develop an appropriate search strategy and management of the literature review.

Data sources and search strategy

We performed a literature search of PubMed, Embase, and Scopus from inception to November 4, 2018, using a combination of keywords and database-specific subject headings to capture studies that were conducted in an orthopedic setting and/or involved orthopedic procedures and that reported a PROMIS measure as an outcome (Additional file 1). We added search filters to exclude case studies or reports, editorials, letters to the editor, and studies not written in English.

Inclusion and exclusion criteria

We included studies that used PROMIS measures in orthopedic settings for clinical care purposes and studies that used PROMIS measures to assess an outcome of an orthopedic intervention. Our exclusion criteria were a study population < 18 years of age; non-orthopedic interventions, settings, or providers performing the intervention; and qualitative studies, commentaries, or systematic reviews. All included studies were peer-reviewed, reported at least one PROMIS measure, and used an experimental, quasi-experimental, or observational design. Two authors (MH and SZG) screened articles, and a third author (ER) resolved any conflicts.

Study selection and data extraction

After databases were searched, titles and abstracts of studies were uploaded into Covidence, a systematic review management software [13]. The article selection process was done in two phases. In the first phase, two authors (MH and SZG) performed independent reviews of titles and abstracts in Covidence using the predefined inclusion and exclusion criteria. Articles were moved to full-text review if one or both authors found the article potentially relevant. In the second phase, the same two authors independently reviewed full-text articles for eligibility. Any conflicts were resolved by the third author.
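The two-phase screening logic described above can be sketched as follows. This is a minimal illustration, not the Covidence workflow itself: the article identifiers and votes are hypothetical, and simple percent agreement is assumed as the concordance measure because the review does not state how agreement was calculated.

```python
from typing import Dict, List

# Maps an article id to True if the reviewer judged it potentially relevant.
Votes = Dict[str, bool]


def advance_to_full_text(reviewer_a: Votes, reviewer_b: Votes) -> List[str]:
    """Phase 1 rule: an article advances if one or both reviewers flag it."""
    return sorted(a for a in reviewer_a if reviewer_a[a] or reviewer_b[a])


def conflicts(reviewer_a: Votes, reviewer_b: Votes) -> List[str]:
    """Articles on which the reviewers disagree (sent to the third author)."""
    return sorted(a for a in reviewer_a if reviewer_a[a] != reviewer_b[a])


def percent_agreement(reviewer_a: Votes, reviewer_b: Votes) -> float:
    """Share of articles on which both reviewers made the same decision."""
    agreed = sum(reviewer_a[a] == reviewer_b[a] for a in reviewer_a)
    return 100.0 * agreed / len(reviewer_a)


# Hypothetical votes for four articles, for illustration only.
votes_a = {"s1": True, "s2": False, "s3": True, "s4": False}
votes_b = {"s1": True, "s2": True, "s3": False, "s4": False}

if __name__ == "__main__":
    print(advance_to_full_text(votes_a, votes_b))
    print(conflicts(votes_a, votes_b))
    print(percent_agreement(votes_a, votes_b))
```

The inclusive rule in the first phase favors sensitivity: an article is dropped at the title and abstract stage only when both reviewers judge it irrelevant.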

Data analysis

Included studies were evaluated from November 2018 to June 2020. The primary purpose of this review was to describe the uptake of PROMIS measures in orthopedic research and practice through qualitative synthesis, and then to rate the quality of included studies. Therefore, we did not perform a meta-analysis of data. For the qualitative synthesis, we described the studies by publication year, clinical population, study type, and sample size. We evaluated the reporting of PROMIS measures by recording the PRO domains reported in each study and the type of PROMIS measures used (i.e., domain-specific fixed short forms, multiple domain profile short forms, or CAT). Last, we described the frequency with which PROMIS measures were reported alongside traditional measures by clinical population. Traditional measures are established non-PROMIS measures used in orthopedics.

Quality assessment

We used the Newcastle-Ottawa Scale (NOS) to assess the quality of included studies (Additional file 2). Because this review included a heterogeneous group of studies with a wide variety of methodologies, there is likely no single risk of bias tool to perfectly evaluate study quality across such a diverse group. The NOS was developed to assess the quality of nonrandomized studies, and evaluates studies within three domains: the selection of study groups, the comparability between these groups, and the determination of the outcome of interest. We used a version of the NOS specifically adapted for cross-sectional studies [14] and for case control and cohort studies [15]. The NOS scoring of seven or more stars is generally considered high quality, though no ranges have been officially reported in the literature [16].

Results

Our preliminary search yielded 1046 citations, and after duplicates were removed, 513 citations were reviewed by their titles and abstracts. Of those, 376 were moved forward to the full-text review stage, and 88 articles remained for inclusion in the systematic review [3, 17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103] (Fig. 1). After conflicts were resolved by the third author, we calculated an 81.6% agreement between the authors performing full-text review.

Fig. 1 PRISMA literature flow diagram [11]
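The counts reported for the literature flow imply the number of records removed at each stage. The sketch below derives those numbers from the four totals given in the text; the derived figures are not stated in the review itself, and the function and key names are hypothetical.

```python
from typing import Dict


def prisma_exclusions(identified: int, screened: int,
                      full_text: int, included: int) -> Dict[str, int]:
    """Derive the records removed at each stage from the stage totals."""
    return {
        "duplicates_removed": identified - screened,
        "excluded_at_title_abstract": screened - full_text,
        "excluded_at_full_text": full_text - included,
    }


# Stage totals as reported in the Results.
flow = prisma_exclusions(identified=1046, screened=513,
                         full_text=376, included=88)

if __name__ == "__main__":
    for stage, n in flow.items():
        print(f"{stage}: {n}")
```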

Study characteristics

Table 1 shows the characteristics of included studies by year, clinical population, study type, and sample size.

Table 1 Study characteristics

Year

Studies included in this review were published from 2013 through 2018. The number of publications reporting PROMIS measures notably increased across time: 2013 (1%, 1 study), 2014 (7%, 6 studies), 2015 (6%, 5 studies), 2016 (13%, 11 studies), 2017 (17%, 15 studies). The majority of studies were published in 2018 (57%, 50 studies).
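The percentages above follow from the yearly counts and the total of 88 included studies. A minimal sketch of that calculation is given below; it assumes half-up rounding to whole percentages, which reproduces the reported figures (Python's built-in `round` uses banker's rounding and would turn 12.5 into 12). The names are hypothetical.

```python
import math
from typing import Dict

# Number of included studies by publication year, as reported above.
STUDIES_BY_YEAR: Dict[int, int] = {
    2013: 1, 2014: 6, 2015: 5, 2016: 11, 2017: 15, 2018: 50,
}


def round_half_up(x: float) -> int:
    """Round a non-negative value to the nearest integer, halves upward."""
    return math.floor(x + 0.5)


def percent_by_year(counts: Dict[int, int]) -> Dict[int, int]:
    """Whole-number percentage of all studies published in each year."""
    total = sum(counts.values())
    return {year: round_half_up(100 * n / total) for year, n in counts.items()}


if __name__ == "__main__":
    for year, pct in percent_by_year(STUDIES_BY_YEAR).items():
        print(year, f"{pct}%")
```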

Clinical population

PROMIS measures were reported in orthopedic studies across multiple clinical populations. For reporting, we grouped the studies by body region rather than specific diagnosis. The largest share of studies (36%, 32 studies) reported PROMIS measures in lower extremity disorders (hip, knee, ankle, foot), followed by upper extremity disorders (shoulder, elbow, hand) (28%, 25 studies), spine disorders (19%, 17 studies), and orthopedic trauma (10%, 9 studies). Few studies (6%, 5 studies) reported PROMIS measures in general orthopedic patients.

Study type and sample size

The studies in this review varied in the study design used to assess outcomes. The largest percentage of studies were cohort studies (59%, 52 studies). Most of these were prospective observational designs (38%, 33 studies), and 22% (19 studies) were retrospective observational designs. Many studies (41%, 36 studies) used a cross-sectional study design to analyze the psychometric properties of PROMIS measures or to validate them in a patient population. No randomized controlled trials used PROMIS measures as an outcome measure. Sample sizes ranged from 11 to 14,679 patients, with a median of 133 patients. Five studies included patients from registries, including the American Orthopedic Foot and Ankle Society’s National Orthopedic Foot and Ankle Research Outcomes Network and the Maryland Orthopedic Registry.

Reporting of PROMIS measures

The most frequently reported PROMIS domains in the studies included in this review were physical function (81%, 71 studies), pain interference (61%, 54 studies), depression (32%, 28 studies), physical function-upper extremity (18%, 16 studies), anxiety (15%, 13 studies), and physical function-lower extremity (3%, 3 studies) (Table 2). Most studies (75%, 66 studies) reported more than one PROMIS domain. Approximately a third of studies (32%, 28 studies) reported two PROMIS domains, 25% (22 studies) reported three PROMIS domains, 9% (8 studies) reported four PROMIS domains, and the remainder (9%, 8 studies) reported between five and nine PROMIS domains. Only a quarter (25%, 22 studies) reported one PROMIS domain. Regarding the type of PROMIS instrument used (i.e., CAT, short form, or profile), the vast majority of studies (81%, 71 studies) reported using the PROMIS CAT approach. A small percentage of studies (15%, 13 studies) reported only fixed-length instruments, and 4% (4 studies) reported a combination of CAT and fixed-length questionnaires.
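The counts in this paragraph can be cross-checked against the total of 88 included studies. The sketch below is one way to do so: it derives the number of studies reporting exactly two domains from the other reported counts and confirms that the instrument-type categories account for every study. Variable names are hypothetical, and the derivation assumes the categories are mutually exclusive, as the text implies.

```python
TOTAL_STUDIES = 88

# Counts reported in the text, by number of PROMIS domains per study.
one_domain = 22
more_than_one_domain = 66
three_domains = 22
four_domains = 8
five_to_nine_domains = 8

# Studies reporting exactly two domains, derived from the other counts.
two_domains = more_than_one_domain - (
    three_domains + four_domains + five_to_nine_domains
)

# Counts reported in the text, by type of PROMIS instrument.
instrument_types = {
    "CAT only": 71,
    "fixed-length only": 13,
    "CAT and fixed-length": 4,
}

if __name__ == "__main__":
    print("two domains:", two_domains)
    print("domain counts cover all studies:",
          one_domain + more_than_one_domain == TOTAL_STUDIES)
    print("instrument types cover all studies:",
          sum(instrument_types.values()) == TOTAL_STUDIES)
```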

Table 2 Reporting of PROMIS measures

PROMIS and traditional PROs

Fourteen studies in this review reported PROMIS as the sole outcome measure. Of those 14 studies, 9 were published in 2018 alone. In all other studies, widely used traditional measures were reported alongside PROMIS measures. The traditional measures assessed the constructs of pain, disability, psychosocial comorbidity, and quality of life. Table 3 describes the reporting of traditional measures alongside PROMIS measures by body region.

Table 3 PROMIS domains and traditional PROs by body region

Quality of studies and risk of bias

A majority of studies assessed had a low risk of bias. All cohort and cross-sectional studies scored seven or above in their respective versions of the NOS quality assessment tool, and, with one exception, all case-control studies scored eight or above. Table 4 describes the risk of bias summary for individual studies included in this review, and Additional file 2 contains detailed results of the quality assessment.

Table 4 Risk of bias summary table

Discussion

In this review, we evaluated the uptake of PROMIS measures in orthopedic research and practice by describing how PROMIS measures were reported in published studies. The number of studies reporting the use of PROMIS measures increased from 1 in 2013 to 15 in 2017, with a spike in 2018 alone (57% of total studies). This large increase potentially indicates that PROMIS measures are being more widely adopted as an outcome measure within orthopedic research and practice. The increase may be due to the evolution of PROMIS measures from the short form, fixed instrument to the CAT instrument. Additionally, progress has been made with the availability and integration of PROMIS measures into Electronic Health Record (EHR) systems, allowing easier use of PROMIS CAT in the clinical setting [104, 105]. However, despite the increase in reporting of PROMIS measures in the literature, the vast majority of studies in our review reported the use of traditional measures alongside PROMIS measures [106]. This finding suggests that, while PROMIS measures are gaining traction within orthopedics, researchers and clinicians may not be ready to abandon traditional measures in favor of PROMIS measures, despite evidence that the PROMIS domains of physical function and pain interference outperform traditional measures [107]. The reasons for this hesitancy may be related to familiarity with traditional measures, participation in registries that do not have PROMIS measures as part of the core set of measures, or a perceived lack of applicability in their patient populations. It should also be noted that any new PRO measure should be considered experimental; thus, established measures are included both for validation purposes and to gain more understanding of how the new and established measures relate to each other.

Our review also found that the use of PROMIS measures across clinical populations varied, with 36% of studies examining lower extremity conditions, followed by upper extremity (28%) and spine conditions (19%). This finding is consistent with the supporting literature, in which the use of PROMIS measures as a primary measure is increasing in lower extremity, upper extremity, and spine populations [1, 4, 108]. Last, most studies in our review reported the use of CAT-based assessments as the PROMIS assessment type. This finding is not surprising, as the primary benefits of the PROMIS CAT measures are the decrease in patient burden and the precision of the estimate. The majority of studies reported between one and three PROMIS domains. Unsurprisingly, the most commonly reported PROMIS domains were physical function and pain interference, which have been validated against and compared with many traditional measures. Of the psychological domains, depression was reported more frequently than anxiety. While the field of orthopedics is focused on improved functioning and reduced pain, we would encourage a more holistic view of the patient by incorporating more psychological constructs that may affect patient prognosis. This review provides evidence that the prevalence of, and support for, the use of PROMIS measures are growing in orthopedics and that PROMIS is being recognized as a PRO measure of choice for clinical trials [109].

Limitations

Our systematic review has some limitations. First, we aimed to describe the prevalence and use of PROMIS measures within orthopedic practice and research rather than to compare outcomes or exposures in the studies. Our review had broad inclusion criteria, and thus there was high variability, with study designs often considered less rigorous. The majority of studies were retrospective and prospective cohort studies. No studies in our review were randomized clinical trials; however, this is likely because of the relative unavailability of PROMIS measures until recently. It will take some time before clinical trials that use PROMIS measures as endpoints are published.

Second, we reported on the PROMIS domains but did not perform meta-analyses to examine the effects of treatment or compare the performance of PROMIS measures with other reported measures. Last, many studies included in the review examined the reliability and validity of PROMIS measures in orthopedic populations, so the studies that reported PROMIS measures as the primary outcomes were less frequent, potentially leading to the impression that there is a higher prevalence of reporting PROMIS measures in the literature.

Conclusions

PROMIS measures have been increasingly reported in orthopedic research and practice and represent a new era of PRO measurement for clinical practice and scientific dissemination. Our findings are relevant for orthopedic researchers and clinicians who are using, or considering using, PROMIS measures. Our findings can provide guidance for stakeholders about the selection and administration of PRO measures, supporting value-based decisions both in clinics and in prostheses procurement [110]. The domains of physical function and pain interference are the most commonly reported PROMIS domains, and these measure similar constructs to the traditional, body region-specific measures. Decisions about which PROMIS measures to administer in clinical populations should be made by determining which constructs are most important and whether PROMIS measures are sufficient alone or whether traditional measures are needed to supplement them. Given the evidence for the validity and reliability of PROMIS in orthopedics, we expect a decrease in the use of other established PRO measures in order to reduce respondent burden.

For future research and practice in orthopedics, our findings suggest that PROMIS measures are versatile, reliable, and valid. Further, PROMIS measures provide distinct advantages over traditional measures, particularly when the study population is heterogeneous. Multiple recent studies indicate that widespread variability exists in the particular PROs used in studies of the same diagnosis, thereby significantly limiting the translatability of many of these high-impact studies [6, 8, 111, 112]. Future research on the use of PROMIS measures in orthopedics should focus on the use of PROMIS measures as the primary outcome measure, particularly in studies that examine heterogeneous patient populations. Last, PROMIS measures hold immense potential for improving patient and provider communication, particularly across specialties.