Methods for interpreting change over time in patient-reported outcome measures
Interpretation guidelines are needed for change scores on patient-reported outcome (PRO) measures to evaluate the efficacy of an intervention and to communicate PRO results to regulators, patients, physicians, and providers. The 2009 Food and Drug Administration (FDA) Guidance for Industry Patient-Reported Outcomes (PRO) Measures: Use in Medical Product Development to Support Labeling Claims (hereafter referred to as the final FDA PRO Guidance) provides some recommendations for the interpretation of change in PRO scores as evidence of treatment efficacy.
This article reviews the evolution of the methods and the terminology used to describe and aid in the communication of meaningful PRO change score thresholds.
Anchor- and distribution-based methods have played important roles, and the FDA has recently stressed the importance of cross-sectional patient global assessments of concept as anchor-based methods for estimation of the responder definition, which describes an individual-level treatment benefit. The final FDA PRO Guidance proposes the cumulative distribution function (CDF) of responses as a useful method to depict the effect of treatments across the study population.
While CDFs serve an important role, they should not be a replacement for the careful investigation of a PRO’s relevant responder definition using anchor-based methods and providing stakeholders with a relevant threshold for the interpretation of change over time.
Keywords: Patient-reported outcome · Interpretation · Anchor-based · Distribution-based · Change over time · Quality of life · Cumulative distribution function · Minimal important difference · Responder definition
Abbreviations
AQLQ Asthma Quality of Life Questionnaire
CDF Cumulative distribution function
CHQ Chronic Heart Failure Questionnaire
CRQ Chronic Respiratory Questionnaire
ECOG Eastern Cooperative Oncology Group
FDA Food and Drug Administration
IAC Industry Advisory Committee
ISOQOL International Society for Quality of Life Research
MCID Minimal clinically important difference
MID Minimal important difference
QoL Quality of life
The use of patient-reported outcome (PRO) measures in research studies, clinical trials, and clinical practice has risen dramatically over the last 30 years and will continue to rise as health care assessments become increasingly patient centered. The development or selection of an appropriate PRO instrument for measuring the most relevant endpoints requires attention to the conceptualization of the measure and the PRO's content validity, as well as to the measurement properties of reliability, validity, and ability to detect change. However, once PRO measures with established and acceptable measurement properties have demonstrated statistically significant changes, further research is necessary to establish benchmarks for the interpretation of results. Interpretation guidelines are required for change scores to evaluate the efficacy of an intervention and to communicate PRO results to regulators, patients, physicians, and providers. In 2009, the U.S. Food and Drug Administration (FDA) published the final Guidance for Industry Patient-Reported Outcomes (PRO) Measures: Use in Medical Product Development to Support Labeling Claims (hereafter referred to as the final FDA PRO Guidance) that included specific recommendations for the interpretation of change in PRO scores as evidence of treatment efficacy. Despite the usefulness of this document, the Agency's recommendations continue to evolve over time.
This article provides a historical perspective and a focused review of key publications in the evolution of methods used for interpreting treatment effects from endpoints designed to provide evidence for FDA medical product labeling claims based on PRO measures. These interpretation methods for longitudinal clinical trial results have been developed and debated over several decades and include anchor-based and distribution-based methods for interpreting change over time and establishing interpretation guidelines, as well as the use of cumulative change distribution curves. The evolution of terminology for describing a meaningful improvement in a PRO endpoint is reviewed to provide historical context for the many terms used to describe an important change threshold for PRO endpoints. Finally, the current challenges and recommendations to improve the understanding of PRO trial results that balance the interpretation needs of many stakeholders in the medical product development process are discussed, while recognizing that two other important aspects of PRO interpretation—response shift and proxy respondents/measurements—are outside the scope of this article.
Historically, for many clinical measurements or examinations, extensive patient experience was usually a feasible and valid way for physicians to assess the significance of instrument score changes over time. However, since most PRO measures are predominantly used as research tools, not clinical practice instruments, there may be a lack of such experience to assess the meaningfulness of a change. In addition, changes in PRO scores are usually expressed as units on an abstract scale that need to be correlated with something more interpretable in order to acquire meaning. Moreover, because statistical significance does not guarantee that observed differences between treatments or within an individual over time are important or meaningful to patients, there is a need for a systematic approach to document what level of change in a PRO measure is important to patients.
To address these concerns, Jaeschke et al. were the first to introduce the term minimal clinically important difference (MCID) for PRO instruments in 1989 to indicate the “smallest difference in score in the domain of interest which patients perceive as beneficial and which would mandate, in the absence of troublesome side effects and excessive cost, a change in the patient’s management.” To determine an MCID, the authors used data from three studies [6, 7, 8] that included either the Chronic Respiratory Questionnaire (CRQ) or the Chronic Heart Failure Questionnaire (CHQ). These questionnaires differ on a single item, and both assess the domains of dyspnea, fatigue, and emotions using a 7-point Likert scale. In addition, in all three studies, at follow-up visits, patients were asked a “global rating of change” question (with possible ratings from −7, “a great deal worse,” to +7, “a great deal better”) for each of the three domains to assess whether they had experienced change since the start of their treatment. By using the CRQ and CHQ data and the global rating of change responses as the anchors to define small, medium, and large changes (see details on anchor-based methods in section “Review of methods” below), Jaeschke and colleagues concluded that an MCID for both instruments was approximately 0.5 per item, which was also consistent with consensus expert opinion at that time.
In the 1990s, several researchers applied the important-change methodology suggested by Jaeschke and colleagues, including other McMaster University colleagues investigating these thresholds for the Asthma Quality of Life Questionnaire (AQLQ). This 1994 study improved on the 1989 study’s methodology by: (1) investigating whether the magnitude of important improvement is similar to that of important deterioration and (2) recognizing that the process did not incorporate any clinical assessments or judgments and therefore represented the minimal important difference (MID), not the MCID. In their conclusion, Juniper and colleagues demonstrated that the 0.5 MID is important when the AQLQ is used to examine within-patient changes (both improvements and declines), but the same threshold does not necessarily apply when examining differences between patients and, presumably, between patient groups.
Table 1 Key meetings and communications related to the interpretation of change over time for PRO measures
Meeting title and location
Paper topic and reference
Symposium on the Clinical Significance of QoL Measures in Cancer Patients
Mayo Clinic, Rochester, MN (USA)
1. Methods to explain the clinical significance of health status measures 
Distribution-based estimates may not suffice on their own but are useful if consistent with anchor-based results
Anchor-based results generated from only one anchor may need to be supplemented with validation from alternative anchors
More work is needed on interpretation approaches if they are to be used by clinicians in their day-to-day practice
2. Group versus individual clinical significance differences 
Group-level data may be used to guide decisions about individual changes, but not without accounting for measurement error
A multi-anchor approach is suggested to establish individual clinical significance, in which patient self-reports, individual preferences, clinical expectations, and empirical behaviors are included in the interpretation of change, with the patient self-rating given the highest weight
3. Single items versus summated scores 
If a detailed description of the construct of interest is needed, multi-item indices may be required. On the other hand, if only a global impression of QoL is needed, then a single measure or item score may be sufficient
The clinical significance of QoL assessments does not depend on how the score is constructed, and the same methodologies and interpretations can be applied to both single-item global measures and multi-item indices
4. Patients, clinicians and population perspectives on clinical significance of HRQL data 
More research is needed to create clear QoL interpretation guidelines for clinicians and health service providers
More patient input is required on what constitutes a clinically important difference. Clinicians can then use this as a guide in their clinical practice
5. Assessing change over time 
A checklist was created as a guide for clinicians to critically assess and interpret longitudinal QoL data and use in the treatment decision process
6. Interpreting the clinical significance of HRQL results from 2 perspectives: clinical trials and clinical practice 
No universal approach can determine the clinical significance of HRQL data for both research and practice settings. A difference can be meaningful for one context, but not for the other
FDA Guidance on Patient-Reported Outcomes: Discussion, Dissemination, and Operationalization
Chantilly, VA (USA)
FDA Perspective on PROs to support medical product labeling claims 
In the paper by Patrick et al., a conceptual distinction is made in the interpretation of PRO data depending on how the patient's response to treatment is measured by the PRO: (a) comparison of the average change from baseline across all patients in the treatment and control groups according to the between-group criterion, or MID; (b) comparison of the proportion of patients in each group who meet the prespecified criterion for response, or “responder criteria”
Interpreting and Reporting Results Based on Patient-Reported Outcomes 
This paper focuses on issues associated with assessing clinical significance and common pitfalls to avoid in presenting results related to PROs. Specifically, the questions addressed by this manuscript are: What are the best methods to assess clinical significance for PROs? How should investigators present PRO data most effectively in a Food and Drug Administration (FDA) application, in labeling, or in a scientific publication?
The FDA draft PRO Guidance, issued in February 2006, provided some text on the complicated issue of methods for interpreting results generated from PRO measures for medical product labeling claims. In the draft PRO Guidance, the MID was presented as an approach to facilitate interpretation of clinical trial results of PRO endpoints. For widely used measures such as treadmill distance or the Hamilton Depression Rating Scale, the ability to show any difference had been treated as evidence of a relevant treatment effect. However, the draft PRO Guidance suggested that PRO instruments may be more sensitive than past measures, and thus, an MID benchmark can serve as a guide for interpreting mean differences. The concept of “mean effect” (i.e., differences between group means, or MID) was distinguished from a responder effect criterion, defined as “change in an individual that would be considered important” (line 542). The draft PRO Guidance reviewed methods to derive MIDs, including mapping changes in a PRO score to non-PRO measures or to other PRO scores such as global impressions of change, distribution-based approaches (see details on distribution-based methods in section “Review of methods” below), and empirical rules (e.g., a fixed percentage of a theoretical range). At the same time, the draft PRO Guidance noted that distribution-based approaches and empirical rules were problematic because they do not directly reflect patient preferences or assessments of meaningful change and do not address what magnitude of treatment differences may be clinically meaningful.
Finally, the draft PRO Guidance stated that “there may be situations where it is more reasonable to characterize the meaningfulness of an individual’s response to treatment than a group’s response” (lines 571–572). Therefore, this document suggested that it would be acceptable to categorize a patient as a responder based on prespecified criteria backed by empirically derived evidence of the responder definition. It is important to note that in the draft PRO Guidance, the FDA specifically asked for comments on the need for, and appropriate review standards for, the MID and responder definitions applied to PRO instruments in the context of drug development.
In February 2006, a meeting titled “FDA Guidance on Patient-Reported Outcomes: Discussion, Dissemination, and Operationalization” was held in Chantilly, VA, USA. The intent of the meeting, organized jointly by the Mayo Clinic and the FDA, was to: (1) facilitate review and discussion of the FDA draft PRO Guidance among diverse stakeholders and (2) generate a supplement to the FDA draft PRO Guidance that would provide detail and exposition that was not possible to communicate in the relatively brief 36-page document. Based on discussions during this meeting, two key articles that addressed interpreting and reporting PRO data were published in Value in Health (Table 1) [21, 22]. The meeting and related publications provided an elaboration of the contents of the FDA draft PRO Guidance, including interpretation of PRO data, from the viewpoint of experts in academia, industry, clinical research, clinical practice, and FDA reviewers.
With the 2009 release of the final FDA PRO Guidance, FDA clarified concerns about interpretation of individual versus group PRO score change over time. This most recent PRO Guidance focused on individual responses using an a priori responder definition representing “the individual patient PRO score change over a predetermined time period that should be interpreted as a treatment benefit” (p. 24). Use of empirical evidence derived from anchor-based methods was proposed as the appropriate methodology to explore associations between the targeted concept of the PRO instrument and the concept measured by the anchors. Different types of anchors, such as clinical measures and patient global ratings of change, were suggested. The final FDA PRO Guidance indicated that distribution-based approaches should be considered supportive in determining the clinical significance of particular score changes and “are not appropriate as the sole basis for determining a responder definition” (p. 25).
The final FDA PRO Guidance also proposed the alternative of presenting the entire distribution of a clinical trial’s change scores for both treatment and control groups as a cumulative distribution graph (see details on cumulative distribution function of responses in section “Review of methods” below). The cumulative distribution function avoids the need to define a responder and allows for the evaluation of the entire distribution of change scores for both treatment and control groups.
Thus, the FDA’s draft and final PRO Guidance distinguished between individual change and group differences and emphasized the need for an empirically determined definition of a meaningful individual-level change threshold as the desired approach to interpreting change over time in PRO scores. Moreover, in the final PRO Guidance, any reference to the term MID as the interpretation of group change score differences was removed. The evolution of the specific methods that elucidate the best estimate for a responder definition within a specific clinical trial setting is described below.
Review of methods
The categorization of anchor-based and distribution-based methods for interpreting PRO scores was a taxonomy first proposed in 1993 by Lydick and Epstein . Anchor-based methods were considered those that explore the association between the targeted concept of a PRO instrument and the same or closely related concept measured by an independent anchor or anchors. Hence, changes seen in the PRO instrument are compared, or anchored, to changes on the anchoring item or measure. Potential anchors fall into three broad categories: patient ratings, clinician ratings, and direct clinical anchors. In all instances, it is imperative that any selected anchor should have intuitive meaning, be easier to interpret than the PRO instrument itself, and have an appreciable association with the PRO instrument. The minimum magnitude of this association between the PRO change scores and the anchor, however, is currently under debate, with recommended correlations of at least r = 0.3  or r = 0.5  in absolute value.
The most commonly reported anchor-based method is that first suggested by Jaeschke et al. , based on within-patient global ratings of change. This method involves asking patients to rate overall how much change they experienced on a PRO concept between two relevant time points (e.g., baseline and end of study) as “about the same” or on a gradient of “better” or “worse.” The gradients of global rating of change assessments typically range from a 15-point scale  to a 7-point scale , with greater preference for the latter due to a clearer distinction between response options. The PRO change scores of those patients choosing “minimal” or “small” change responses can then be averaged to calculate the MID [5, 11]. Likewise, the average change scores of those selecting the moderate and large change gradients can be used to derive important difference thresholds and to establish a responder definition  if responses greater than minimal or small changes are better descriptions of a treatment benefit. It is also important to note that for some health conditions with a known history of progressive deteriorations, no change over time may represent a treatment benefit.
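The averaging step described above can be sketched in a few lines of code. All change scores and anchor categories below are invented purely for illustration; in practice, the anchor categories would come from the study's global rating of change item:

```python
# Hypothetical sketch of the anchor-based approach: average the PRO change
# scores of patients whose global rating of change falls in a given band.
def anchor_based_threshold(pro_changes, anchor_ratings, categories):
    """Mean PRO change score among patients whose anchor rating
    is in the given set of categories."""
    selected = [c for c, a in zip(pro_changes, anchor_ratings) if a in categories]
    return sum(selected) / len(selected)

# Invented per-item PRO change scores (e.g., on a 7-point response scale)
pro_changes = [0.4, 0.6, 0.5, 1.2, 1.0, 0.1, 1.8, 0.7]
# Invented global-rating-of-change responses, collapsed into bands
anchor_ratings = ["small", "small", "small", "moderate", "moderate",
                  "none", "large", "small"]

# MID-style estimate: mean change among patients reporting "small" change
mid_estimate = anchor_based_threshold(pro_changes, anchor_ratings, {"small"})
# A stricter responder-definition estimate: moderate or large anchor change
responder_estimate = anchor_based_threshold(pro_changes, anchor_ratings,
                                            {"moderate", "large"})
```

In a real analysis, this is done arm by arm and supplemented with checks of the anchor's correlation with the PRO change scores, as discussed above.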
These patient global ratings of change are easily interpreted; however, they have been criticized for relying on patients’ reconstructive memories, which can be poor and result in a systematic underestimation of the initial state and a recall bias for the present state. This is apparent when the change response has a high positive correlation with the end-of-study measurement and a near-zero correlation with the baseline measurement [28, 29]. Furthermore, the FDA has recently suggested that patient ratings of change are inappropriate for certain diseases, such as irritable bowel syndrome, due to the high level of symptom variability across short periods of time, in addition to the error associated with retrospective assessments over long time periods. To address these issues, the use of a patient global rating of concept was suggested by the FDA in 2010. This method involves asking patients to rate their current state on the concept of interest at each key time point (e.g., “How would you rate your IBS symptoms overall over the past seven days?”). Changes in the global rating of concept across time points (e.g., from baseline to end of treatment) can then be calculated to create responder definitions in much the same way as global ratings of change. However, any investigation of the global rating of concept method should give careful consideration to whether the anchor item: (1) accurately measures the PRO’s concept; (2) includes meaningful and useful response options; and (3) can inform when an important change over time occurs from the patients’ perspective.
To capture important changes that require clinical judgment, clinician ratings and direct clinical anchors can be used. Clinician global ratings of change employ the same methodology as patient global ratings of change, asking clinicians to rate a patient’s magnitude of change or improvement over time, and have been used to identify a PRO responder definition. Clinician ratings of meaningful differences have, however, also been criticized due to the incongruity between patients’ and clinicians’ perceptions of important change, and are best applied when patient judgment may be impaired (e.g., mental health conditions).
Direct clinical anchors are thus also commonly used to interpret change in a PRO. These anchors link change in PRO concepts or domains with change in an external criterion. For example, Kosinski et al.  assessed change among patients with rheumatoid arthritis on the SF-36 and Health Assessment Questionnaire based on minimal, moderate, and large categorical improvements in joint tenderness and swelling counts, as well as patient and physician global assessments, and a global pain assessment. Eton et al.  assessed change in four breast cancer endpoints based on analgesia use, change in Eastern Cooperative Oncology Group (ECOG) performance status, and response to treatment (complete response, partial response, stable disease, and progression).
As explained earlier, to be useful in understanding a PRO’s responder definition, clinical anchors need to be relevant to, and correlated with, the QOL concept under consideration. Thus, the joint tenderness or swelling count anchors used by Kosinski et al. may have been appropriate to interpret change on the SF-36 bodily pain subscale, but perhaps less appropriate for other subscales. Furthermore, to assess change in a PRO score associated with different levels of important improvement (minimal, moderate, and large), clinical assessments that use cross-sectional categorical ratings (e.g., none, mild, moderate, and severe) may be problematic if the clinical category selection is subjective and/or inconsistently applied across different clinicians. Even when these rating scale categories are precisely defined, the wide range of patient status captured within each category can make movement from one category to another difficult, leaving the assessments unable to detect potentially important treatment benefits.
Distribution-based methods are another set of approaches for estimating the magnitude of meaningful PRO change scores using statistical parameters from the clinical trial population. Although there are a variety of different methods for interpretation based on the statistical distribution, all express a change score difference relative to some measure of variability.
The effect size (ES) is often employed to compare two or more groups to benchmark the magnitude of the group difference. Distribution-based ESs have the advantage of placing mean differences into a unit-less metric, thus allowing comparisons between different phenomena, as well as comparisons between groups in a treatment study. There are well-accepted standards for judging an ES: 0.2 is considered small, 0.5 is medium, and 0.8 or greater is large. These conventions were introduced by Jacob Cohen based on his experience with educational and psychological tests, and some empirical evidence in the health sciences supports Cohen’s effect size conventions. Nonetheless, Cohen warned that these effect size standards should be used only “in the belief that more is to be gained than lost by supplying a common conventional frame of reference which is recommended for use only when no better basis for estimating the ES index is available” (p. 25).
There are several different methods for calculating group ES using the mean change score differences in the numerator, and a variety of calculations of the standard deviation in the denominator . Most commonly, ES is the ratio of the group differences in mean change scores to the baseline standard deviation score . Norman et al.  noted that the MID for PRO measures was frequently very close to a half standard deviation, or an ES of 0.5. This observation was based on a systematic examination of 38 PRO studies, using different instruments and across different diseases .
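The most common variant just described can be sketched as follows. The change scores and baseline values are invented for illustration, and only one of the many possible standard-deviation denominators (the baseline SD) is shown:

```python
import statistics

# Illustrative sketch: between-group effect size as the difference in mean
# change scores divided by the baseline standard deviation (one common
# variant of the calculations described in the text).
def effect_size(change_treatment, change_control, baseline_scores):
    mean_diff = (statistics.mean(change_treatment)
                 - statistics.mean(change_control))
    baseline_sd = statistics.stdev(baseline_scores)  # sample SD at baseline
    return mean_diff / baseline_sd

change_treatment = [12, 15, 9, 14, 10]   # invented change scores, treatment arm
change_control = [4, 6, 5, 3, 7]         # invented change scores, control arm
baseline_scores = [50, 62, 47, 55, 58, 49, 60, 53, 46, 57]  # both arms pooled

es = effect_size(change_treatment, change_control, baseline_scores)
# Interpret against Cohen's conventions: 0.2 small, 0.5 medium, 0.8+ large
```

Under the half-standard-deviation observation of Norman et al., an ES of 0.5 would correspond to the typical MID for many PRO measures.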
The standard error of measurement (SEM) has also been a useful distribution-based method for gauging an important PRO change for an individual, incorporating a statistical parameter with origins in classical test theory. The SEM is calculated by multiplying the standard deviation by the square root of 1 minus the PRO’s reliability coefficient: SEM = SD × √(1 − rxx). One SEM has demonstrated a strong correspondence to anchor-based individual change thresholds for many PRO measures [40, 41, 42, 43, 44, 45].
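The SEM formula above is simple enough to illustrate directly; the SD and reliability values here are invented, not drawn from any particular instrument:

```python
import math

# SEM = SD * sqrt(1 - r_xx), where r_xx is the reliability coefficient
# (e.g., test-retest or internal consistency reliability).
def standard_error_of_measurement(sd, reliability):
    return sd * math.sqrt(1.0 - reliability)

# Invented example: a PRO scale with SD = 10 points and reliability 0.91
sem = standard_error_of_measurement(sd=10.0, reliability=0.91)
# One SEM here is 10 * sqrt(0.09), i.e., approximately 3 points
```

Note that the more reliable the instrument, the smaller the SEM, and hence the smaller the change that can be distinguished from measurement error.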
These distribution-based methods for PRO interpretation provide an alternative to anchor-based methods when an appropriate anchor is not available. Their limitations are closely related to the source of their usefulness; that is, because they are not derived using a relevant external criterion, interpretation must be based on prior conventional benchmarks (e.g., small, medium, or large ES, half a standard deviation, or 1 SEM). Moreover, the final PRO Guidance states the FDA’s view that the interpretation of PRO change should be based on anchor-based methods, with distribution-based approaches playing only a supportive role.
Cumulative distribution function of responses
The December 2009 final FDA PRO Guidance represents a useful step forward in guiding the interpretation of PRO change scores beyond achieving statistical significance to support medical product labeling claims. This document removed reference to the concept of the minimal important difference (MID) as a primary aid to the interpretation of trial results. However, given the role that the MID concept has played in the development and use of PROs over the past two decades, the acronym will certainly live on in discussions and the scientific literature whenever the interpretation of change over time in PROs is addressed. The reason for the disappearance of the MID term from the Guidance presumably lies in the manner in which the MID change threshold was being applied. That is, in the 2006 draft PRO Guidance, the MID was viewed as the minimum difference in mean change from baseline between treatment groups that can be interpreted as an important difference. This is also often referred to as the between-group difference. The inconsistency of using a single change threshold, derived from data designed to estimate important individual changes, to inform the required magnitude of difference in group changes will hopefully end with the elimination of the MID terminology in the final PRO Guidance. The FDA has named the responder definition as the appropriate individual or within-subject change threshold, while at the same time recognizing that selecting a specific level for a meaningful response to treatment can be quite subjective. Therefore, the final FDA PRO Guidance also recommends displaying PRO change results using a cumulative distribution curve of responses, where the percentage of patients in each study group achieving a spectrum of change thresholds can be easily compared.
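As a minimal sketch of this cumulative display (with invented change scores), one can tabulate, for each candidate threshold, the percentage of each arm achieving at least that much improvement, so that no single responder cut-off needs to be chosen in advance:

```python
# Cumulative display of responses: percentage of each arm achieving at
# least a given improvement, evaluated across a spectrum of thresholds.
def percent_at_or_above(change_scores, threshold):
    n = sum(1 for c in change_scores if c >= threshold)
    return 100.0 * n / len(change_scores)

treatment = [2, 5, 8, 10, 12, 15, 3, 7, 9, 11]  # invented change scores
control = [0, 2, 4, 5, 1, 6, 3, 2, 4, 5]        # invented change scores

for threshold in (2, 5, 8, 11):
    t_pct = percent_at_or_above(treatment, threshold)
    c_pct = percent_at_or_above(control, threshold)
    print(f">= {threshold:2d}: treatment {t_pct:5.1f}%  control {c_pct:5.1f}%")
```

Plotted across all possible thresholds, these percentages trace the cumulative distribution curves the Guidance describes; a vertical gap between the treatment and control curves at a clinically relevant threshold is what supports a treatment-benefit interpretation.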
It is important to note that, although a thorough discussion is outside the scope of this article, response shift [48, 49, 50] and the use of proxy respondents or proxy measurements [51, 52, 53] are also threats to the PRO interpretation process described here and represent a limitation of this article’s focus.
The authors of this article support the modification in interpretation mentioned above. We agree that anchor-based methods for finding the responder definition best describe the estimated change of an individual experiencing a treatment benefit, not a group’s change over time. We also support the use of cumulative response distribution curves so that the percentage of clinical trial subjects achieving all possible change levels can be easily displayed in a single diagram. However, we do not see cumulative distribution functions as a replacement for the careful investigation of a PRO’s relevant responder threshold using anchor-based methods, supported by distribution-based methods. Identifying the best estimate for the level where individual patients demonstrate a meaningful treatment benefit provides important information to patients, physicians, payors, and policy makers. As more studies accumulate, triangulation across results may be needed to obtain the best estimate of the responder definition and stakeholder comfort in this threshold. The FDA’s 2010 request for cross-sectional patient global assessments of concept versus patient global assessments of change as patient-reported anchors has challenged the long-standing approach exemplified by Jaeschke et al.
We also recognize the usefulness of identifying the minimum difference in mean change from baseline between treatment groups that can be interpreted as an important difference (MID). Clinical researchers and others in the medical product development process continue to turn to PRO specialists for this estimate to plan clinical trials, where the MID influences sample size calculations. Although the MID point estimate is no longer required for the interpretation of PRO results and incorporation into the FDA PRO dossier, it remains an important threshold that also deserves careful consideration in planning all clinical trials that include PRO assessments.
Members of the Industry Advisory Committee (IAC), the Board of Directors of the International Society for Quality of Life Research (ISOQOL), and two anonymous reviewers offered valuable suggestions that were incorporated into this paper.
- 1. Patient-Centered Outcomes Research Institute (PCORI). Available at: http://www.pcori.org/home.html.
- 3. Food and Drug Administration. (2009). Guidance for industry on patient-reported outcome measures: Use in medical product development to support labeling claims. Federal Register, 74(235), 65132–65133.
- 4. Burke, L. B., & Trentacosti, A. M. (2010). Interpretation of PRO trial results to support FDA labeling claims: The regulator perspective. International Society for Pharmacoeconomics and Outcomes Research 15th Annual International Meeting. Atlanta, GA.
- 12. Sloan, J. A., Cella, D., Frost, M., Guyatt, G. H., Sprangers, M., & Symonds, T. (2002). Assessing clinical significance in measuring oncology patient quality of life: Introduction to the symposium, content overview, and definition of terms. Mayo Clinic Proceedings, 77(4), 367–370.
- 19. Food and Drug Administration. (2006). Draft guidance for industry on patient-reported outcome measures: Use in medical product development to support labeling claims. Federal Register, 71(23), 5862–5863.
- 30. Wyrwich, K., Harnam, N., Revicki, D. A., Locklear, J. C., Svedsater, H., & Endicott, J. (2009). Assessing health-related quality of life in generalized anxiety disorder using the Quality of Life Enjoyment and Satisfaction Questionnaire. International Clinical Psychopharmacology, 24(6), 289–295.
- 32. Kosinski, M., Zhao, S. Z., Dedhiya, S., Osterhaus, J. T., & Ware, J. E., Jr. (2000). Determining minimally important changes in generic and disease-specific health-related quality of life questionnaires in clinical trials of rheumatoid arthritis. Arthritis and Rheumatism, 43(7), 1478–1487.
- 33. Eton, D. T., Cella, D., Yost, K. J., Yount, S. E., Peterman, A. H., Neuberg, D. S., et al. (2004). A combination of distribution- and anchor-based approaches determined minimally important differences (MIDs) for four endpoints in a breast cancer scale. Journal of Clinical Epidemiology, 57(9), 898–910.
- 34. Cohen, J. (1988). Statistical power analysis for the behavioral sciences. Hillsdale, NJ: Lawrence Erlbaum Associates.
- 39. Nunnally, J. C., & Bernstein, I. H. (1994). Psychometric theory. New York: McGraw-Hill.
- 43. Cella, D., Eton, D. T., Fairclough, D. L., Bonomi, P., Heyes, A. E., Silberman, C., et al. (2002). What is a clinically meaningful change on the Functional Assessment of Cancer Therapy-Lung (FACT-L) Questionnaire? Results from Eastern Cooperative Oncology Group (ECOG) Study 5592. Journal of Clinical Epidemiology, 55(3), 285–295.
- 45. Yost, K. J., Cella, D., Chawla, A., Holmgren, E., Eton, D. T., Ayanian, J. Z., et al. (2005). Minimally important differences were estimated for the Functional Assessment of Cancer Therapy-Colorectal (FACT-C) instrument using a combination of distribution- and anchor-based approaches. Journal of Clinical Epidemiology, 58(12), 1241–1251.
- 46. ARICEPT Oral Solution (Donepezil Hydrochloride) [approval label]. Available at: http://www.accessdata.fda.gov/drugsatfda_docs/label/2004/21719lbl.pdf.
- 53. van der Linden, F. A., Kragt, J. J., van Bon, M., Klein, M., Thompson, A. J., van der Ploeg, H. M., et al. (2008). Longitudinal proxy measurements in multiple sclerosis: Patient-proxy agreement on the impact of MS on daily life over a period of two years. BMC Neurology, 8, 2.