Development and Initial Validation of the PEG, a Three-item Scale Assessing Pain Intensity and Interference
Krebs, E.E., Lorenz, K.A., Bair, M.J. et al. J GEN INTERN MED (2009) 24: 733. doi:10.1007/s11606-009-0981-1
Inadequate pain assessment is a barrier to appropriate pain management, but single-item “pain screening” provides limited information about chronic pain. Multidimensional pain measures such as the Brief Pain Inventory (BPI) are widely used in pain specialty and research settings, but are impractical for primary care. A brief and straightforward multidimensional pain measure could potentially improve initial assessment and follow-up of chronic pain in primary care.
To develop an ultra-brief pain measure derived from the BPI.
Development of a shortened three-item pain measure and initial assessment of its reliability, validity, and responsiveness.
We used data from 1) a longitudinal study of 500 primary care patients with chronic pain and 2) a cross-sectional study of 646 veterans recruited from ambulatory care.
Selected items assess average pain intensity (P), interference with enjoyment of life (E), and interference with general activity (G). Reliability of the three-item scale (PEG) was α = 0.73 and 0.89 in the two study samples. Overall, construct validity of the PEG was good for various pain-specific measures (r = 0.60–0.89 in Study 1 and r = 0.77–0.95 in Study 2), and comparable to that of the BPI. The PEG was sensitive to change and differentiated between patients with and without pain improvement at 6 months.
We provide strong initial evidence for reliability, construct validity, and responsiveness of the PEG among primary care and other ambulatory clinic patients. The PEG may be a practical and useful tool to improve assessment and monitoring of chronic pain in primary care.
KEY WORDS: pain measurement primary care
Inadequate pain assessment has been identified as a key barrier to appropriate pain management.1, 2, 3 Recently, important initiatives have aimed to increase awareness of pain as a clinical problem by promoting better pain assessment.4, 5, 6 These initiatives have led to widespread adoption of pain screening through measurement of current pain intensity.
In chronic pain, the most common type of pain seen in primary care, assessment of pain intensity alone is inadequate. Guidelines encourage comprehensive assessment that includes measurement of pain-related functioning, which may be even more relevant to patients’ overall quality of life than intensity.7,8 To facilitate chronic pain assessment, numerous multidimensional patient-reported measures have been developed;9, 10, 11, 12 however, none of these have been widely adopted in the general medical settings where most chronic pain treatment is delivered.
In primary care, use of multidimensional pain measures is limited by factors such as instrument length and scoring complexity; however, a brief and straightforward multidimensional measure could potentially improve assessment of chronic pain. We sought to develop a very brief measure that would be feasible, valid, and sensitive to change in primary care. We started with the Brief Pain Inventory (BPI) because it is relatively easy to administer, score, and interpret; includes items assessing pain intensity and functional interference; and has been validated in many pain conditions.10,13, 14, 15, 16, 17 As its name implies, the BPI is shorter than other multidimensional pain measures, but it is still too lengthy for implementation in primary care practice. We hypothesized that a shortened scale based on the BPI could be developed that would be more feasible, but just as useful, for assessing chronic pain in primary care. Our objectives were to develop an ultra-brief scale derived from the BPI and to initially assess its reliability, validity, and responsiveness.
We used data from two sources: 1) Stepped Care for Affective Disorders and Musculoskeletal Pain (SCAMP), a longitudinal study that enrolled a total of 500 patients with chronic musculoskeletal pain, and 2) Helping Veterans Experience Less Pain (HELP-vets), a cross-sectional study of 646 veterans receiving care at VA clinics. We used data from Study 1 to develop and initially validate the ultra-brief measure and data from Study 2 to confirm reliability and validity in an independent patient population.
Study 1 (SCAMP) enrolled 500 primary care patients with persistent back, hip, or knee pain of at least moderate severity, 250 of whom had concurrent depression.18 Participants were recruited from university (n = 300) and VA-affiliated (n = 200) internal medicine clinics in Indianapolis. Patients with concurrent depression were enrolled in a trial of depression and pain treatment vs. usual care (n = 250). Those without depression were followed in a parallel observational study (n = 250). The mean age of SCAMP participants was 59 years; 52% were women, 58% were white, and 38% were black. The mean numeric rating of current pain (on a 0–10 scale) was 6.1 (SD 1.9) at baseline.
Study 2 (HELP-vets) enrolled a random visit-based sample of 646 veterans from ambulatory care clinics at two VA hospitals and six affiliated community sites in three urban California counties. Patients with chronic illness were over-sampled by design. The mean age was 63 years and 95% were male. Self-reported race/ethnicity was 54% white, 30% black, and 10% Latino. Sixty-one percent of participants reported pain at the time of enrollment and 63% had one or more pain diagnoses (33% back pain, 45% other musculoskeletal pain, 12% neuropathic pain, 5% headache). The mean rating of current pain (on a 0–10 scale) was 3.1 (SD 3.2) overall and 5.1 (SD 2.6) among those with pain.
The Brief Pain Inventory (BPI) includes two scales that assess pain intensity and pain-related functional impairment (physical and emotional).13,15 The four items of the BPI severity scale assess the intensity of current pain and pain at its least, worst, and average during the past week on scales from 0 (“no pain”) to 10 (“pain as bad as you can imagine”). The BPI interference scale assesses pain-related functional interference with seven items assessing different domains (general activity, mood, walking ability, normal work, relations with other people, sleep, and enjoyment of life) rated from 0 (“does not interfere”) to 10 (“interferes completely”).
The Chronic Pain Grade questionnaire (CPG) includes two three-item scales (intensity and disability) that are transformed into 0–100 scores.19 An algorithm classifies pain into four graded categories: 1) low disability-low intensity, 2) low disability-high intensity, 3) high disability-moderately limiting, and 4) high disability-severely limiting. The CPG has been validated in primary care, chronic pain, and general populations.20, 21, 22
The Roland Disability questionnaire is a pain-specific measure of physical disability validated in patients with back pain and other chronic pain conditions.23,24 It includes a checklist of 24 statements about pain effects on function; the score is the number of items endorsed.
The Pain Global Rating of Change is a single item assessing patients’ overall impression of change in their pain. Study 1 participants were asked whether their pain was worse, about the same, or better since the start of the study. Those who reported that pain was better were asked to rate the magnitude of improvement (a little, somewhat, moderately, a lot, or completely better). Global ratings of change may be more sensitive to improvement and better correlated with patient satisfaction than serial measures.27
The Functional Morbidity Index was developed to assess general functional status in older adults.28 Patients indicate whether they are able to perform four different activities independently, and if not, whether the impairment is due to a health problem.
Overall Pain Distress is a single item: “How much did overall pain distress or bother you during the past week?” Response options are not at all, a little bit, somewhat, quite a bit, and very much.
We used a consensus-based process, drawing on a literature review, expert opinion, and statistical data, to develop a shortened scale.29 Pre-specified criteria guided initial item selection. First, we decided to include at least one item representing each of three domains included in the BPI: pain intensity, physical functioning, and emotional functioning. We then selected items with the following characteristics: 1) easy to understand and applicable to patients with all types of pain; 2) good statistical characteristics (e.g., high response variability, high item-remainder correlation); 3) similar performance in depressed and non-depressed patients.
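As an illustration of the item-remainder correlation mentioned among the statistical criteria, the sketch below correlates each item with the sum of the other items in a candidate scale. It is a minimal example, not the authors' analysis code: the function names `pearson_r` and `item_remainder_correlations` and the sample ratings are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences.

    Raises ZeroDivisionError if either sequence has zero variance.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def item_remainder_correlations(responses):
    """Correlate each item with the sum of the remaining items.

    `responses` holds one row per respondent; each row is a list of
    item ratings (for BPI-style items, 0 to 10).
    """
    n_items = len(responses[0])
    correlations = []
    for j in range(n_items):
        item = [row[j] for row in responses]
        remainder = [sum(row) - row[j] for row in responses]
        correlations.append(pearson_r(item, remainder))
    return correlations


# Hypothetical ratings from four respondents on three items.
example = [[6, 7, 5], [3, 2, 4], [8, 9, 8], [5, 5, 6]]
print(item_remainder_correlations(example))
```

An item with a low item-remainder correlation shares little variance with the rest of the scale, which is one reason it might be dropped during shortening.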
We chose “pain average” for the intensity item because it had a good distribution of responses, lacking the ceiling and floor effects seen with “pain worst” and “pain least,” respectively. We did not select “pain now” because we wanted to avoid duplicating information provided by the “fifth vital sign” and to capture intermittent pain. Although the ideal reporting period for pain assessment is debated, recalled average pain over one week is a valid measure of pain intensity.30,31
BPI interference items include those assessing physical status (general activity, walking, normal work), emotional status (mood, relations with others, enjoyment of life), and sleep. For physical interference, we chose “interference with general activity” because it applies equally to all patients, as opposed to “interference with work” (which may be affected by occupation, employment status, etc.) and “interference with walking” (which may not apply to non-ambulatory patients or those with upper body pain).
For emotional interference, we considered both “interference with mood” and “interference with enjoyment of life.” In our experience, “interference with relations with other people” is more difficult than other items for patients to answer. We wanted a scale that would discriminate between chronic pain and depression, which commonly co-occur.32 In our sample of patients with and without comorbid depression, we found that “interference with enjoyment of life” was more independent of depression than “interference with mood.” We also considered “interference with sleep” in place of the emotional interference items.
We reached consensus on a preferred three-item scale (“pain average,” “interference with enjoyment of life,” and “interference with general activity”) and alternative three-item and four-item scales, which we then evaluated statistically.
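The preferred scale can be sketched in code as follows. This section does not state how the three items are combined, so the sketch assumes the scale score is the mean of the three 0–10 item ratings; the function name `peg_score` and its parameter names are hypothetical.

```python
def peg_score(pain_average, enjoyment_interference, activity_interference):
    """Combine the three PEG items into one score.

    Assumption (not stated in this section): the score is the mean of
    the three items, each rated from 0 to 10.
    """
    items = (pain_average, enjoyment_interference, activity_interference)
    for value in items:
        if not 0 <= value <= 10:
            raise ValueError("each item must be rated from 0 to 10")
    return sum(items) / len(items)


# Hypothetical patient: average pain 6, enjoyment 7, general activity 5.
print(peg_score(6, 7, 5))  # 6.0
```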
Reliability and Validity
We assessed reliability (internal consistency) by calculating Cronbach’s coefficient alpha. To assess construct validity, we compared the PEG with measures of pain and function using Pearson correlation coefficients. We used multiple measures for construct validity assessment, including the BPI, because no criterion standard exists for pain. We hypothesized that coefficients would be slightly higher for comparisons with pain-specific functional measures than for those with pain severity measures (because two of the three PEG items assess function). We also expected that coefficients would be larger for comparisons with pain-specific functional measures than for comparisons with generic functional measures.
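For readers unfamiliar with Cronbach's coefficient alpha, the following is a minimal sketch of the standard formula, α = k/(k − 1) × (1 − Σ item variances / variance of the total score), where k is the number of items. The function name `cronbach_alpha` and the sample data are hypothetical, and the sketch uses sample variances; it is not the authors' analysis code.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a respondents-by-items table.

    `responses` holds one row per respondent; each row is a list of
    item ratings. Sample variances (n - 1 denominator) are used.
    """
    k = len(responses[0])
    item_variances = [variance([row[j] for row in responses]) for j in range(k)]
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical ratings from four respondents on three items.
example = [[6, 7, 5], [3, 2, 4], [8, 9, 8], [5, 5, 6]]
print(round(cronbach_alpha(example), 2))
```

Alpha rises when items covary strongly relative to their individual variances, so it reflects how consistently the items of a short scale move together.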
Assessment of responsiveness, or sensitivity to change, requires an independent standard to define change.33 We used two different measurements to define the presence or absence of patient improvement: 1) global rating of change and 2) serial CPG grade. We categorized patients according to their pain trajectory as assessed by each of the two measures. Global rating of change categories were defined by the patient’s retrospective assessment at 6 months of the change in their pain since the trial began: 1) improved (“better”), 2) unchanged (“about the same”), and 3) worse (“worse”). CPG categories were defined by the change in CPG grade from baseline to 6 months: 1) improved (pain grade decreased by ≥1 level), 2) unchanged (pain grade at baseline = pain grade at follow-up), and 3) worse (pain grade increased by ≥1 level).
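The two classification rules just described can be written out as a short sketch. The function names are hypothetical, and CPG grades are assumed to be coded as integers in which a higher number means a worse grade.

```python
def classify_by_global_rating(response):
    """Map the patient's 6-month global rating to a trajectory category."""
    categories = {
        "better": "improved",
        "about the same": "unchanged",
        "worse": "worse",
    }
    return categories[response]


def classify_by_cpg_grade(baseline_grade, followup_grade):
    """Classify by change in CPG grade from baseline to 6 months.

    Assumption: grades are integers and a higher grade is worse, so a
    drop of one or more levels counts as improvement.
    """
    if followup_grade < baseline_grade:
        return "improved"
    if followup_grade > baseline_grade:
        return "worse"
    return "unchanged"


print(classify_by_global_rating("better"))  # improved
print(classify_by_cpg_grade(3, 2))          # improved
print(classify_by_cpg_grade(2, 2))          # unchanged
```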
Using data from Study 1, we assessed responsiveness by calculating the following three metrics: 1) change score (difference between mean score at baseline and follow-up), 2) effect size (ES; change score divided by the standard deviation of the baseline score), and 3) standardized response mean (SRM; change score divided by the standard deviation of the change score). These calculations were performed for patients in the improved, unchanged, and worse categories. Confidence intervals for SRM were calculated as ±1.96 divided by the square root of the sample size.34 We assessed responsiveness using all three methods because they can produce differing results and because agreement is lacking on the preferred method.34,35 We compared responsiveness of the PEG, BPI severity, and BPI interference scales by comparing ES and SRM for each measure among patients in the improved category. Finally, we assessed responsiveness to varying degrees of improvement by comparing change scores for PEG and BPI scales to degree of improvement by global rating of change.
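The three responsiveness metrics and the SRM confidence interval can be sketched as below. This is an illustrative example under stated assumptions, not the authors' code: the function names and sample scores are hypothetical, change is taken as baseline minus follow-up so that positive values indicate improvement, and sample standard deviations are used.

```python
from math import sqrt
from statistics import mean, stdev


def responsiveness_metrics(baseline, followup):
    """Return (change score, effect size, SRM) for paired scores.

    Assumption: change is baseline minus follow-up, so a positive
    value means the pain score went down.
    """
    changes = [b - f for b, f in zip(baseline, followup)]
    change_score = mean(baseline) - mean(followup)
    effect_size = change_score / stdev(baseline)  # SD of baseline scores
    srm = change_score / stdev(changes)           # SD of individual changes
    return change_score, effect_size, srm


def srm_confidence_interval(srm, n):
    """Interval described in the text: SRM plus or minus 1.96 / sqrt(n)."""
    half_width = 1.96 / sqrt(n)
    return srm - half_width, srm + half_width


# Hypothetical baseline and 6-month scores for four patients.
change, es, srm = responsiveness_metrics([6, 7, 8, 5], [4, 6, 5, 4])
print(change, round(es, 2), round(srm, 2))

low, high = srm_confidence_interval(1.20, 66)
print(round(low, 2), round(high, 2))  # 0.96 1.44
```

Because the ES divides by the spread of baseline scores and the SRM divides by the spread of individual changes, the two can diverge when patients change by very different amounts, which is one reason to report both.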
Table: Reliability and Item-total Correlations for PEG and Alternate Scales in Study 1 Sample (n = 500). Column: Alpha (deleted variable)*. Rows: Preferred scale (PEG), including Enjoyment of life; Alternate scale 1; Alternate scale 2; Alternate scale 3, including Enjoyment of life. Values not preserved.
Table: PEG and Individual Item Statistics at Baseline in Study 1 and Study 2. Columns: Study 1 (n = 500); Study 2 (n = 638). Rows: Item 1: average pain intensity; Item 2: interference with general activity; Item 3: interference with enjoyment of life. Values not preserved.
Reliability and Validity
Table: Correlation between the PEG, BPI Scales, and other Measures at Baseline. Study 1 (n = 500) rows include SF-36 bodily pain*; Study 2 (n = 638) rows include Overall pain distress. Values not preserved.
Table: Responsiveness of PEG among Patients Classified by Pain Global Rating of Change and Serial CPG Grade at 6 Months (n = 210). Surviving values are SRM (95% CI); the PEG 6 months column and other columns were not preserved.
Global rating of change
Improved (n = 66): 1.20 (0.96, 1.44)
Unchanged (n = 83): 0.29 (0.07, 0.51)
Worse (n = 61): −0.06 (−0.31, 0.19)
Serial CPG grade
Improved (n = 62): 0.99 (0.74, 1.24)
Unchanged (n = 115): 0.29 (0.11, 0.47)
Worse (n = 32): 0.04 (−0.31, 0.39)
We demonstrated that the PEG, an ultra-brief three-item scale derived from the BPI, was a reliable and valid measure of pain among primary care patients with chronic musculoskeletal pain and diverse VA ambulatory patients. The PEG appears comparable to the BPI in terms of responsiveness to change. These findings support our hypothesis that an abbreviated scale derived from the BPI may be both useful and practical for chronic pain assessment in primary care and other ambulatory care settings, such as medical and surgical specialty clinics.
Strengths of this study include the confirmation of reliability and validity in an independent patient population, the diversity of the study populations, and the availability of multiple pain and functional measures with which to assess construct validity. Our choice of the BPI as the basis for our abbreviated scale development is another strength. The BPI is a widely used instrument that has been validated in numerous patient populations, clinical settings, and languages. BPI items are rated from 0–10, a format that has become familiar to patients and clinicians since assessment of pain with numeric scales has been broadly implemented in US health care settings. We took advantage of our collective experience with the BPI in observational and interventional research by employing a consensus-based process for scale shortening, consistent with recommendations to avoid over-reliance on statistical techniques.29
We believe our use of two different ambulatory study populations is a strength; however, each study has its own limitations. Study 1 was a sample of patients with chronic back and lower extremity musculoskeletal pain and included an over-representation of patients with depression (50% by design). The more clinically diverse patient population of Study 2, including ambulatory VA patients with and without chronic pain, enhances the generalizability of our findings. However, Study 2 included fewer pain measures with which to assess construct validity and was cross-sectional; therefore, we were able to assess responsiveness only in the first sample. Forty percent of Study 1 patients and 100% of Study 2 patients were recruited from VA clinics, so our findings may be less generalizable to non-VA settings.
We found that the PEG differentiated well between patients who improved and those who did not. According to responsiveness metrics, patients in the improved category had a large improvement in PEG score, whereas those in the unchanged category had a minimal change. How to interpret the magnitude of change according to SRM and one-group pre-post ES is not entirely clear, although authors have suggested that Cohen’s definition of small (0.2), moderate (0.5), and large (0.8) effects can be applied to interpretation of both responsiveness measures.36,37 We did not find evidence of PEG responsiveness in the worse direction (i.e., change scores between those who were unchanged and those who were worse did not significantly differ). We are limited in our ability to adequately assess sensitivity to worsening because we evaluated responsiveness in a single study population that likely had a ceiling effect for worsening due to high baseline pain severity.
The competing demands of primary care, in which visits are short and pain is only one of several problems warranting attention, make efficiency of assessment a paramount concern.38,39 A balance must be found between feasibility and key characteristics such as reliability, validity, and responsiveness. For example, ultra-brief depression measures containing two to three items perform better than single item depression measures.40 We also evaluated a four-item abbreviated scale, but found that adding an item contributed little. An abbreviated version of the BPI that eliminated a few items has been previously published,41 but the PEG is the first ultra-brief scale based on the BPI.
New assessment strategies are needed to support improved chronic pain management in primary care. We believe the PEG, which includes items assessing pain intensity, emotional function, and physical function, is an important step forward. However, further studies are needed to confirm our findings and validate the PEG in additional patient populations. Prospective research should determine whether serial pain measurement can improve the quality of clinical decision-making and pain outcomes in primary care. Given the huge clinical and societal burden associated with pain, developing efficient and effective strategies to enhance care is an important priority.
We thank Charles Cleeland, PhD, Tito Mendoza, PhD, Lisa Shugarman, PhD, and Cathy Sherbourne, PhD for their insightful contributions to this manuscript and Andy Lanto, MS, for assistance with data analysis.
We presented a preliminary abstract of this work as a poster at the 31st annual meeting of the Society of General Internal Medicine, April 2008.
SCAMP was supported by a grant from the National Institute of Mental Health to Dr. Kroenke (MH-071268). HELP-Vets was supported by the Health Services Research & Development (HSR&D) service of the US Department of Veterans Affairs (IIR-03–150). Drs. Krebs, Lorenz, and Bair were supported by VA HSR&D Research Career Development Awards.
Conflict of Interest Statement
We disclose the following financial relationships: 1) Drs. Lorenz and Asch have received research funding from the Amgen Corporation; 2) Dr. Bair has received research funding and honoraria from Eli Lilly and served on an advisory board for Abbott.