Background

The primary goal of HIV therapy is to increase disease-free survival and improve health-related quality of life (HRQL) by containing viral replication, avoiding drug resistance, and boosting immunologic function by restoring CD4 count [1, 2]. The United States Department of Health and Human Services (DHHS) has recommended several preferred and alternative initial highly active antiretroviral therapy (HAART) regimens which have comparable efficacy, but different pharmacokinetic or pharmacodynamic properties. DHHS further recommends tailoring the HAART regimen to the patient--based on expected side effects, convenience, comorbidities, potential drug interactions, and results of any pre-treatment genotypic drug-resistance testing--to optimize medication adherence and improve long-term treatment success [3]. Since some of these constructs must be measured from the patient perspective, it is important to consider patient-reported outcomes (PROs) when selecting initial antiretroviral therapy.

A PRO is defined as any report of the status of a patient’s health condition that comes directly from the patient without interpretation of the patient’s response by a clinician or anyone else [4]. In clinical trials, PRO instruments can be used to measure the effect of a medical intervention on one or more concepts, such as symptoms, functioning, severity of disease, or HRQL. Given the armamentarium of potent HAART regimens available today, HIV infection has been transformed from a terminal illness into a chronic condition. As such, there is a strong case for evaluating the impact of antiretroviral therapies on broader aspects of patients’ lives, including psychological health and emotional adjustment. Most published comparative treatment studies that include PROs are limited to comparing protease inhibitor (PI)-based and non-nucleoside reverse transcriptase inhibitor (NNRTI)-based regimens. This may be due in part to the fact that, for several years, treatment guidelines have recommended initiating HAART with two nucleoside reverse transcriptase inhibitors (NRTIs) plus either an NNRTI or a boosted PI [5, 6]. However, this broad comparison may miss important distinctions among regimens that are related to within-class PRO differences.

Although five NNRTIs have received Food and Drug Administration (FDA) approval to date, European AIDS Clinical Society (EACS) and DHHS treatment guidelines recommend efavirenz (EFV) as the NNRTI of choice for most treatment-naïve HIV-infected adults initiating NNRTI-based therapy [3, 6]. Other recommended NNRTIs include nevirapine (NVP) and rilpivirine (RPV). In the absence of head-to-head comparative clinical trials demonstrating clinical superiority of one NNRTI over another, PROs become an important tool for identifying treatment differences and informing treatment choices. A necessary first step to understanding differences among specific NNRTIs is to examine the PRO instruments being used in clinical trials and the aspects of health they measure. Therefore, the purpose of this study was to identify and classify PRO instruments used to measure treatment effects in clinical trials evaluating NNRTIs.

Methods

Literature search

An electronic search of PubMed was conducted for studies published from March 2003 to February 2013. Our search strategy combined Medical Subject Headings (MeSH) terms for HIV [HIV OR HIV infections], MeSH terms associated with PROs/instruments [questionnaires OR interviews as topic OR quality of life OR patient satisfaction OR self-evaluation programs], Substance Names of NNRTIs [efavirenz OR nevirapine OR delavirdine OR etravirine OR rilpivirine OR efavirenz, emtricitabine, tenofovir disoproxil fumarate drug combination], and clinical trial Publication Types [clinical trial OR clinical trial, phase IV OR clinical trial, phase III OR clinical trial, phase II OR controlled clinical trial OR randomized controlled trial]. A complete list of all search terms used, including terms used in title/abstract searches, is shown in Additional file 1. We limited our search to articles written in English with abstracts available. In addition to the PubMed search, we manually searched the bibliographies of the electronically identified primary studies and review articles.
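The Boolean structure of the search described above can be sketched in code. The following is a minimal illustration, not the authors' actual query (the full strategy is in Additional file 1): the helper name `or_group`, the choice of PubMed field tags (`[mh]`, `[nm]`, `[pt]`, `[dp]`, `[la]`), and the exact day boundaries of the date range are assumptions made for this example.

```python
def or_group(terms, tag):
    """Join terms with OR, attaching a PubMed field tag to each (hypothetical helper)."""
    return "(" + " OR ".join(f'"{t}"[{tag}]' for t in terms) + ")"

# Term lists copied from the search strategy described in the text.
hiv = or_group(["HIV", "HIV infections"], "mh")
pro = or_group(
    ["questionnaires", "interviews as topic", "quality of life",
     "patient satisfaction", "self-evaluation programs"],
    "mh",
)
nnrti = or_group(
    ["efavirenz", "nevirapine", "delavirdine", "etravirine", "rilpivirine",
     "efavirenz, emtricitabine, tenofovir disoproxil fumarate drug combination"],
    "nm",
)
trial = or_group(
    ["clinical trial", "clinical trial, phase IV", "clinical trial, phase III",
     "clinical trial, phase II", "controlled clinical trial",
     "randomized controlled trial"],
    "pt",
)

# Assumed day boundaries; the text gives only month and year.
date_range = '("2003/03/01"[dp] : "2013/02/28"[dp])'

# The four concept groups are intersected, then limited by date, language, and abstract.
query = " AND ".join([hiv, pro, nnrti, trial, date_range, "english[la]", "hasabstract"])
print(query)
```

Each concept is a union of synonyms, and the final query is the intersection of the concepts, so a record must match at least one term from every group to be retrieved.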

Study selection

The inclusion and exclusion criteria for studies to be considered in our systematic review were established prior to conducting the literature search. All identified articles were independently screened by two authors. Papers included in our review reported on clinical trials that evaluated NNRTI-based treatment regimens in HIV-infected adults and administered at least one validated PRO instrument. When an abstract indicated that a PRO instrument had been administered, the full-text article was reviewed for the name and citation of the validated instrument. Reviews, editorials, animal studies, and studies reporting results in children were excluded from our analysis.

Data extraction

Data collected from each study included population characteristics, study design, study objective, treatments, and PRO instruments administered. We categorized each validated PRO instrument by type (e.g., HRQL, symptoms) to understand key domains of interest in NNRTI-based therapy. We also assessed the number of items, scoring, and dimensions/concepts measured by each instrument. For the most commonly used instruments, PRO-related data (e.g., baseline and follow-up scores, effect sizes, and significance values) were extracted from the studies, as available. The most commonly used PROs and study results were described.

Results

A total of 189 articles were identified by the literature search and bibliography review. Most articles were excluded because they did not include a validated PRO instrument (n = 111). Articles were also excluded for one or more of the following reasons: review articles (n = 33), duplicate studies (n = 18), evaluated HIV therapies in children (n = 5), or did not evaluate NNRTI-based regimens (n = 13). Twenty-six unique clinical trials met all selection criteria and were included in the review.

Table 1 presents the characteristics of the 26 clinical trials. Most were randomized controlled trials (n = 20). The number of PRO instruments per study ranged from 1 to 7, with most studies including only one (54%) or two (23%) validated PRO instruments. In addition to validated PRO instruments, eight of the 26 trials (31%) used non-validated, study-specific instruments to measure aspects such as treatment preference, treatment satisfaction, perceived ease of regimen, and neuropsychiatric symptoms.

Table 1 Characteristics of included studies

The PRO instruments used corresponded to each study’s primary objective (e.g., HRQL studies used general or HIV-targeted HRQL instruments, a study to compare depressive symptoms in patients taking EFV- versus PI-based regimens used the CES-D, a depression-specific PRO instrument, etc.). Most studies utilizing a generic HRQL instrument (e.g., SF-36, SF-12) also included either an HIV-targeted HRQL or symptom instrument [11, 14, 19–21, 25].

Overall, 27 validated PRO instruments were identified. Six of the instruments were developed specifically to be administered in the HIV population: the Medical Outcomes Study HIV Health Survey (MOS-HIV), Functional Assessment of HIV Infection (FAHI), World Health Organization Quality of Life HIV BREF (WHOQOL-HIV BREF), HIV Symptom Index (HIV-SI)/AIDS Clinical Trials Group Symptom Distress Module (SDM), Beliefs about Medicines Questionnaire-ART version (BMQ-ART), and HAART Intrusiveness Scale. The remaining instruments were either generic HRQL instruments or general symptom-specific instruments.

Characteristics of the PRO instruments, including the number of items, concepts measured, and scoring method, are presented in Table 2. Based on the concepts these instruments measure, the key areas of interest in NNRTI clinical trials include general and HIV-targeted HRQL (typically comprising physical, emotional, social, and functioning domains), HIV-related symptoms (including anxiety, depression, sleep, psychiatric symptoms, and stress), and medication-related beliefs.

Table 2 Characteristics of identified PRO measures

Table 3 provides a summary of the validated PRO instruments, categorized by instrument type, utilized in the 26 studies. The MOS-HIV, administered in 8 clinical trials, was the most commonly used PRO instrument. Table 4 presents PRO results for all PRO instruments used in three or more studies: the MOS-HIV, FAHI, HIV-SI/SDM, and CES-D.

Table 3 PRO instruments identified in trials with NNRTIs
Table 4 PRO results of commonly used instruments

Discussion

Evaluation of PROs during clinical practice, as well as in clinical research, enhances understanding of disease impact and of the effect of treatment on that impact. Thus, PRO assessment should be recognized by patients and their physicians, as well as by payers and health technology assessment authorities, as strengthening the knowledge base for health care decision making and, ultimately, as a means to improve patient health. This study found that the key areas of PRO interest in clinical trials of NNRTI-based therapy are HRQL (general or HIV-targeted, and typically comprising physical, emotional, social, and functioning domains), HIV symptoms, sleep, psychiatric symptoms (including anxiety, depression, and stress), and medication beliefs. A variety of instruments were used to measure these dimensions. The only instruments used in three or more clinical trials within the past ten years were the MOS-HIV, FAHI, HIV-SI/SDM, and CES-D.

Overall, although we were able to identify important concepts measured in NNRTI studies based on the convergence of PRO instrument types (e.g., HRQL, HIV symptoms, anxiety, depression), there was a noticeable lack of consensus among studies on specific instruments utilized to measure each concept. For example, of five generic HRQL instruments identified, none were used in more than two studies.

To our knowledge, this is the first study to systematically identify and categorize PRO instruments used specifically in NNRTI clinical trials. Clinical trials commonly use more than one PRO instrument. Although each PRO instrument may contribute valuable information, it is important to carefully weigh the advantages and disadvantages of each instrument, especially its sensitivity and specificity in capturing the patient factors of greatest importance. This is important both to maximize the chances of detecting important differences between treatments and to limit patient response burden.

A multidimensional generic HRQL instrument, such as the SF-36 or EQ-5D, is useful because it comprehensively measures HRQL and has norm-based scoring which can be used to compare the study population with others. Furthermore, it can be used in population-wide decision making by providing data on quality of life weights or utilities for inclusion in cost-effectiveness and cost-utility analyses. This can be done directly (e.g., using the EQ-5D) or indirectly (e.g., by deriving SF-6D utility weights from the SF-36). However, a disadvantage of generic measures is that they may be less sensitive or responsive to small but important changes that result from changes in disease status, adverse events, or treatment effect, and that may occur over the typical timeframe of a randomized controlled trial.
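To illustrate how utility weights from a generic instrument can feed a cost-utility analysis, the sketch below computes quality-adjusted life years (QALYs) as the area under a utility curve and then an incremental cost-effectiveness ratio. It is a minimal example under stated assumptions: the function names, visit schedule, utility values, and costs are all hypothetical and are not taken from any study in this review.

```python
def qalys(times_years, utilities):
    """QALYs as the area under the utility curve (trapezoidal rule)."""
    if len(times_years) != len(utilities):
        raise ValueError("times and utilities must have the same length")
    total = 0.0
    for i in range(1, len(times_years)):
        interval = times_years[i] - times_years[i - 1]
        mean_utility = (utilities[i] + utilities[i - 1]) / 2
        total += interval * mean_utility
    return total


def icer(cost_a, qaly_a, cost_b, qaly_b):
    """Incremental cost per QALY gained for regimen A versus regimen B."""
    return (cost_a - cost_b) / (qaly_a - qaly_b)


# Hypothetical utility weights (0 = dead, 1 = full health) at baseline,
# 6 months, and 12 months for two illustrative treatment arms.
visits = [0.0, 0.5, 1.0]
arm_a = [0.70, 0.80, 0.85]
arm_b = [0.70, 0.75, 0.80]

qaly_a = qalys(visits, arm_a)
qaly_b = qalys(visits, arm_b)

# Hypothetical one-year costs per patient for each arm.
print(qaly_a, qaly_b, icer(12000.0, qaly_a, 11000.0, qaly_b))
```

The same calculation cannot be run on scores from an HRQL instrument that lacks a utility scoring algorithm, which is why the choice of instrument matters for economic evaluation.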

HIV-targeted HRQL instruments, such as the MOS-HIV, FAHI, and WHOQOL-HIV BREF, were each developed by revising, at least in part, generic HRQL instruments (the SF-20, Functional Assessment of Cancer Therapy-General [FACT-G], and WHOQOL-BREF, respectively) with input from HIV-infected patients and HIV-treatment providers to ensure more complete coverage of concepts specific to HIV infection. Each instrument demonstrates excellent psychometric properties in the HIV population. In contrast to the generic HRQL instruments, a disadvantage of HIV-targeted instruments is that they do not provide a means for estimating utilities, which can be useful in clinical-economic modeling considered by health technology assessment authorities and others focused on population health.

For HIV-related symptoms, the HIV-SI/SDM is considered to be the gold standard in clinical research. However, a generic symptom-specific instrument may be more appropriate when the primary or secondary study objective is to measure a specific symptom; such instruments generally use multiple items to measure the symptom along with its attributes and impacts, thus providing greater insight into the extent and effect of the symptom.

More than half of the articles initially identified were excluded from our review because the abstract did not report use of a PRO instrument. However, this likely underestimates the frequency of administration of PRO instruments in clinical trials for two reasons: 1) we used an extensive list of search terms in order to capture as many validated PRO instruments as possible, and consequently identified non-relevant articles, and 2) PROs are generally secondary endpoints in clinical trials; as such, they may not be mentioned in the study abstract and are commonly reported in separate publications. Since we did not review the full text of excluded articles, we do not know whether the excluded studies were unique clinical trials or secondary publications of identified trials.

There are several limitations to our study that should be noted. First, our review excluded questionnaires measuring adherence because we were only interested in patient-reported measures of treatment effects. However, the HIV-SI/SDM is a component of the ACTG Adherence Questionnaire, a validated instrument developed by the AIDS Clinical Trials Group. Although our review excluded studies which mentioned only adherence and no additional patient-reported measures in the study abstract, based on abstract review we identified two studies which used the ACTG Adherence Questionnaire [34, 35]. It is possible that additional studies which used the ACTG Adherence Questionnaire as the adherence measure, and therefore also measured HIV symptoms with the HIV-SI/SDM, were not included in our literature review. Second, our review only evaluated studies using validated PRO instruments. However, some studies use study-specific instruments which are based on one or more validated instruments. For example, studies by Santos et al. [36] and Martinez-Picado et al. [37] used modified versions of the MOS-HIV and thus were not fully evaluated in our review. Finally, our review focused on PRO instruments included in prospective clinical trials of NNRTIs. There are clinical research networks, such as the Centers for AIDS Research Network of Integrated Clinical Systems (CNICS), which allow for retrospective review of PROs measured during routine medical visits [38]. PRO instruments used at these clinical sites include the Patient Health Questionnaire (PHQ) for depression and anxiety, HIV-SI for symptom burden, and EQ-5D for HRQL, among others. For example, a study by Kozak et al. [39] used reports from the Patient Health Questionnaire depression scale (PHQ-9) and the Alcohol, Smoking and Substance Involvement Screening Test (ASSIST) to demonstrate that current substance abuse (odds ratio [OR], 2.78; 95% confidence interval [CI], 1.33–5.81) and current depression (OR, 1.93; 95% CI, 1.12–3.33) were associated with poor antiretroviral adherence in HIV patients. Additional research, including review of NNRTI studies published in non-English languages and retrospective analyses of PROs collected during usual medical care visits, should be conducted and could build on the findings presented here.
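For readers less familiar with the odds ratios and confidence intervals quoted above, the sketch below shows how an unadjusted odds ratio and its Wald 95% CI can be computed from a 2×2 table. The counts are hypothetical and chosen only for illustration; they are not data from Kozak et al. [39], whose reported estimates may have come from adjusted regression models rather than a simple 2×2 table.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    a: exposed, outcome present      b: exposed, outcome absent
    c: unexposed, outcome present    d: unexposed, outcome absent
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    estimate = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper


# Hypothetical counts: poor adherence among patients with and without
# current depression (illustrative only).
estimate, lower, upper = odds_ratio_ci(a=30, b=70, c=40, d=180)
print(f"OR {estimate:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

An interval that excludes 1, as both intervals reported in the text do, indicates an association that is statistically significant at the corresponding level.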

Conclusions

Review of recently published NNRTI clinical trials suggests a lack of consensus on the optimal PRO instruments for measuring key domains. Overall, a typical battery of instruments comprises a multidimensional measure of HRQL (either HIV-targeted or generic) coupled with one or more symptom measures. Further work is needed to clarify the advantages and disadvantages of using various instruments to measure the relevant constructs and to identify the most useful batteries of instruments. Furthermore, new instruments may need to be developed to meet future research needs.