Background

Proxies are individuals who answer questions for survey respondents. Ideally, survey respondents themselves are the best ones to answer the survey. In order to reduce non-response bias and make the survey representative of the study population, proxies are allowed for survey respondents who are not available (e.g., hospitalized or institutionalized) or unable (e.g., physical or cognitive impairments) to answer on their own behalf. Proxy responses are very common when surveys are conducted among the elderly or disabled population [1, 2]. In major Medicare surveys, proxy responses constituted 10 % to 30 % of all responses [3]. However, outcomes reported by proxies may be systematically different from those obtained from patients directly. Proxy response bias is the difference between the responses from proxy and survey respondents. The impact of proxy response bias on the validity of the estimates is a significant concern for researchers when surveys are conducted among the elderly or disabled.

Research has shown that health and functional status have a direct impact on the demand for health care services and serve as key predictors of health care expenses [4, 5]. To better allocate scarce health care resources, health and functional status must be estimated accurately. The literature, however, reports significant disagreement between patient-reported and proxy-reported values for some health and functional status measures [6, 7]. To make better use of patient-reported health and functional status data, it is imperative to understand the potential proxy response bias in this important area.

Extensive literature has assessed the extent of proxy response bias. Much of this work has focused on controlling for observed subject characteristics [3, 8, 9]. Because unobserved subject characteristics (mainly physical or cognitive impairments) are not identified, many of the existing studies are subject to omitted variable bias. Some published estimates may therefore be invalid and may not give enough information about the extent of proxy response bias.

To date, literature has focused on the extent of proxy response bias among survey respondents who are able to provide responses but are not available at the time of the interview [3, 8]. There has been little attention given to survey respondents who cannot provide responses for themselves. Among the elderly, 28.5 % have physical disabilities and 9.5 % have cognitive disabilities; among the disabled, 53.6 % have physical disabilities and 37.8 % have cognitive disabilities [10]. Given the high proportion of the elderly and disabled who may be unable to respond, it is important to understand the extent of proxy response bias among these groups.

This study has two objectives: (1) to examine the presence, direction, and magnitude of possible differences between proxy-reported and patient-reported outcomes in health and functional status measures among Medicare beneficiaries surveyed in the Medicare Current Beneficiary Survey (MCBS); (2) to assess whether the extent of proxy response bias varies by the relationship between the subject and the proxy (spouse, non-spouse relative, and non-relative). We hypothesize that differences exist between proxy-reported and patient-reported outcomes in some domains of health and functional status and the extent of differences does not vary by the relationship between the subject and the proxy.

Methods

Data and study sample

We used data from the MCBS for this study. The MCBS is a longitudinal panel survey of Medicare beneficiaries, including community-dwelling and facility-dwelling beneficiaries, conducted by the Centers for Medicare and Medicaid Services (CMS). Survey respondents were interviewed 3 times a year over 4 years. The survey collects information about demographic and socioeconomic characteristics, health and functional status, service use, and health care spending for persons covered by Medicare. The MCBS is an appropriate dataset for this study in that it contains Medicare enrollment and claims data in addition to survey data. Because Medicare enrollment and claims data are independent of the survey, they are not subject to proxy response bias.

This study is a pooled cross-sectional study for a nationally representative sample of community-dwelling Medicare beneficiaries from 2006 to 2011. Because facility-dwelling beneficiaries do not have patient-reported data, we only included community-dwelling beneficiaries in the study.

Measures

Survey respondents can respond either for themselves or via proxies. The best one to respond to the MCBS is the survey respondent. An effort is made to interview the survey respondent directly. In case the survey respondent is unable or not available to respond, he or she needs to name a proxy. In many cases, a spouse or child will serve as a proxy. But the proxy is not required to be a relative of the survey respondent. The outcome of interest in the study is health and functional status. It was assessed across five domains: physical, affective, cognitive, social, and sensory status. Health and functional status was measured by the percentage of limitations reported by survey respondents or proxies.

Statistical analysis

The study used propensity score matching to balance the distribution of measured covariates between the patient-report and proxy-report groups. Specifically, five steps were used to estimate the proxy response bias. In step 1, we used univariate relative risk regression to calculate the unadjusted relative risk (RR) and 95 % confidence interval (CI) for each health and functional limitation between patient-reports and proxy-reports. In step 2, we assessed the differences in socio-demographic characteristics and chronic conditions between the two groups by using the chi-square test. Socio-demographic characteristics and chronic conditions may confound the association between types of responses and health and functional limitations. In step 3, we conducted a multivariate logistic regression. In the model, the dependent variable was the log odds of proxy response and the independent variables were a set of conditioning variables. Conditioning variables were restricted to those from Medicare enrollment and claims data; those variables included age, gender, race, education, marital status, household size, income, Medicare status, Charlson Comorbidity Index (CCI), and dementia. Based on the values of the conditioning variables, each subject had an estimated propensity score, which is the predicted probability of using a proxy. In step 4, Greedy 5-to-1 digit matching was used to create matched samples [11]. Patient-reports were matched to proxy-reports in a 1:1 ratio (without replacement). With this matching method, patient-reports were first matched to proxy-reports whose propensity scores agreed to 5 digits. Those that did not match were then matched to proxy-reports whose propensity scores agreed to 4 digits. The process continued in the same way until the remaining patient-reports were matched to the remaining proxy-reports whose propensity scores agreed to 1 digit. In step 5, the matched patient-report and proxy-report samples were compared to assess the extent of proxy response bias. Conditional Poisson regression was used to analyze the matched-pair data and calculate the adjusted RR and 95 % CI for each health and functional limitation. Alternative matching techniques, including Kernel matching, radius matching with caliper 0.001, and Mahalanobis metric matching, were conducted as a sensitivity analysis to test the robustness of the results.
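The digit-matching step can be illustrated with a minimal Python sketch. It is not the code used in the study (the analyses were run in SAS and Stata); the function name `greedy_digit_match`, the identifiers, and the propensity scores are hypothetical, and rounding scores to a fixed number of decimal places is an assumption about how "the same digits" is determined.

```python
from collections import defaultdict


def greedy_digit_match(proxy_scores, patient_scores, max_digits=5):
    """Greedy 1:1 matching without replacement on propensity scores.

    Scores are compared after rounding to `max_digits` decimal places,
    then `max_digits - 1`, and so on down to 1 decimal place.
    Returns a list of (proxy_id, patient_id, digits_matched) tuples.
    """
    unmatched_proxy = dict(proxy_scores)      # id -> propensity score
    unmatched_patient = dict(patient_scores)  # id -> propensity score
    pairs = []

    for digits in range(max_digits, 0, -1):
        # Group the still-unmatched patient-reports by rounded score.
        buckets = defaultdict(list)
        for patient_id, score in unmatched_patient.items():
            buckets[f"{score:.{digits}f}"].append(patient_id)

        # Give each still-unmatched proxy-report the first available
        # patient-report whose rounded score is identical.
        for proxy_id in list(unmatched_proxy):
            key = f"{unmatched_proxy[proxy_id]:.{digits}f}"
            if buckets[key]:
                patient_id = buckets[key].pop()
                pairs.append((proxy_id, patient_id, digits))
                del unmatched_proxy[proxy_id]
                del unmatched_patient[patient_id]

    return pairs


# Hypothetical propensity scores, for illustration only.
proxy = {"x1": 0.41237, "x2": 0.30519, "x3": 0.77001}
patient = {"p1": 0.41237, "p2": 0.30522, "p3": 0.90000, "p4": 0.41239}

for proxy_id, patient_id, digits in greedy_digit_match(proxy, patient):
    print(proxy_id, patient_id, digits)
```

In this toy example one pair agrees at 5 digits, one at 4 digits, and one proxy-report stays unmatched because no patient-report shares even its first digit. Unmatched reports are dropped, which is why the matched sample is smaller than the original proxy-report sample.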

In Objective 2, we conducted a stratified analysis of Objective 1. The same matching technique was used in the stratified analysis. The only difference between the two aims was the number of comparisons being made. In Objective 2, proxy-reports were divided into three subgroups by the relationship between the subject and the proxy. Hence, we have three comparison groups: (1) spouse proxy-reports vs. patient-reports; (2) non-spouse relative proxy-reports vs. patient-reports; and (3) non-relative proxy-reports vs. patient-reports.
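As a rough illustration of the stratified comparison, the sketch below computes a relative risk point estimate for each relationship subgroup from 1:1 matched pairs. It rests on a simplifying assumption: because both arms of a 1:1 matched sample contain the same number of person-years, the ratio of risks equals the ratio of limitation counts. The confidence intervals from the conditional Poisson model are not reproduced, and the function name `matched_pair_rr` and the data are hypothetical.

```python
from collections import defaultdict


def matched_pair_rr(pairs):
    """Relative risk of a reported limitation, proxy vs. patient, by subgroup.

    `pairs` is an iterable of (relationship, proxy_limited, patient_limited),
    one entry per 1:1 matched pair, with the limitation flags coded 0 or 1.
    """
    counts = defaultdict(lambda: [0, 0])  # relationship -> [proxy, patient]
    for relationship, proxy_limited, patient_limited in pairs:
        counts[relationship][0] += int(proxy_limited)
        counts[relationship][1] += int(patient_limited)

    # Equal arm sizes within each subgroup cancel out of the risk ratio.
    return {
        relationship: (proxy_n / patient_n if patient_n else float("nan"))
        for relationship, (proxy_n, patient_n) in counts.items()
    }


# Hypothetical matched pairs, for illustration only.
example_pairs = [
    ("spouse", 1, 1),
    ("spouse", 1, 0),
    ("spouse", 0, 0),
    ("spouse", 1, 1),
    ("non-relative", 1, 0),
    ("non-relative", 1, 1),
]

print(matched_pair_rr(example_pairs))
```

A value above 1 for a subgroup means that proxies of that type reported the limitation more often than the matched survey respondents did.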

The study protocol was approved by the University of South Carolina Institutional Review Board. The study adhered to the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) checklists for cross-sectional studies. All analyses were performed using SAS Software version 9.4 (Statistical Analysis Systems, Cary, NC) and STATA version 13 (STATA Corp, College Station, TX).

Results

The study identified a total of 76,115 person-years of patient-reports and 8,822 person-years of proxy-reports. Most proxy-reports came from non-spouse relatives (n = 5,126), followed by spouses (n = 3,011) and non-relatives (n = 684).

Socio-demographic characteristics differed significantly between survey respondents who self-reported and those who were proxy-reported (Table 1). Survey respondents who reported via proxies were more likely to be male, non-white, single, disabled, and older than 85 years, and to have less than a high school education, a larger household size, and a lower annual income. Patient-reports also differed significantly from proxy-reports in some self-reported chronic conditions. In particular, proxy-reports were associated with significantly more physical impairments (measured by CCI) and cognitive impairments (measured by dementia). Distributions of some socio-demographic characteristics and chronic conditions were therefore uneven between patient-reports and proxy-reports.

Table 1 Characteristics of patient-reports and proxy-reports among Medicare beneficiaries

Except for difficulties in stooping/crouching/kneeling, proxy-reports were associated with significantly higher percentages of health and functional limitations compared with patient-reports (Table 2). The magnitude of differences between two types of responses varied by domains and specific questions within domains. The observed differences can be attributed to non-random allocation or proxy response bias.

Table 2 Unadjusted and adjusted comparisons between patient-reports and proxy-reports in health and functional limitations

After applying propensity score matching, we identified 7,780 person-years of patient-reports paired with 7,780 person-years of proxy-reports. Most characteristics were similar between the two types of responses (Table 1). Proxy response bias was not observed in seeing (RR: 1.01, 95 % CI: 0.97-1.06) and eating solid foods (RR: 1.05, 95 % CI: 0.99-1.12) and was very small in hearing (RR: 1.09, 95 % CI: 1.05-1.13) (Table 2). Four other domains were found to have proxy response bias even after propensity score matching. Proxies tended to report more health and functional limitations than survey respondents themselves. Two domains had small proxy response biases: affective (RR: 1.03-1.12) and social status (RR: 1.20). The cognitive status domain (RR: 1.80-2.85) had moderate proxy response bias. Within the physical status domain, small proxy response bias was found in mobility (RR: 0.97-1.28) and moderate to large proxy response biases were found in activities of daily living (ADL) (RR: 1.16-3.10) and instrumental activities of daily living (IADL) (RR: 1.28-3.83). Significant items and proxy response bias in each domain are summarized in Table 3. A question regarding survey respondents’ difficulties in managing money was associated with the largest proxy response bias (RR = 3.83). The results were robust in the sensitivity analysis (Table 4).

Table 3 Summary of significant items and proxy response bias in each domain
Table 4 Sensitivity analysis to compare Greedy 5-to-1 matching and other propensity score matching techniques

Characteristics were balanced between the two types of responses in the stratified analysis (data not shown). With few exceptions, the presence, direction, and magnitude of differences between proxy-reported and patient-reported outcomes did not vary much in the subgroup analysis (Table 5). That is, spouse, non-spouse relative, and non-relative proxies responded to the survey in very similar ways.

Table 5 Subgroup analysis of comparisons between patient-reports and proxy-reports in health and functional limitations

Discussion

The study successfully controlled for major confounding variables of proxy response bias. The distributions of socio-demographic characteristics and chronic conditions between two types of responses were not even in the study. The presence of proxy response bias observed in the unadjusted analysis might be attributed to socio-demographic characteristics and chronic conditions differences. For example, survey respondents older than 85 years were more likely to respond to the survey via proxies. They were also more likely to have health and functional limitations. Another example was dementia patients. They were more likely to use proxies to respond and have more health and functional limitations. Existing studies also found that socio-demographic characteristics confounded the association between proxy response and health and functional limitations [3, 8, 9]. So unevenly distributed socio-demographic characteristics and chronic conditions might serve as confounding variables in the study. These major confounders were controlled using the propensity score matching approach.

Given the high base rate of health and functional limitations among the elderly and disabled, it seems proxies are more likely to assume that an elderly or disabled survey respondent has a limitation unless they have sufficient information about the limitation. Under this assumption, proxies tended to report more health and functional limitations for elderly or disabled survey respondents.

The extent of proxy response bias depended on the domain being tested. Survey respondents with sensory limitations can be easily observed by proxies. However, physical, affective, cognitive, or social limitations are sometimes hard to observe. So the sensory status domain was less likely to suffer proxy response bias.

The nature of the question being asked can also impact the extent of proxy response bias. Proxies are good reporters for objective, observable, or easy questions but usually do not have enough information on private, unobservable, or complex questions. For example, a question regarding survey respondents’ difficulties in walking 1/4 mile or 2–3 blocks is objective, observable to proxies, and easy to answer. Even though proxies still report more limitations, the magnitude of proxy response bias was very small. On the contrary, difficulty in managing money is very complex. As a result, large proxy response bias was observed for this question. Another example is a question regarding survey respondents’ difficulties in using the toilet. This question is about an activity that is private and unobservable to proxies. So we observed large proxy response bias for this question.

The MCBS is a widely used dataset in health services research among Medicare beneficiaries. The issue of non-response bias in the MCBS was previously investigated [12]. That study found that the MCBS was not subject to non-response bias. The current study is the first to investigate proxy response bias in the MCBS. When using proxy-reported data in the MCBS, most existing studies assumed that responses from survey respondents and proxies can be used interchangeably. According to the findings of this study, however, such an assumption is invalid. To make better use of the MCBS, researchers should be aware of proxy response bias.

The study has limitations. First, even though we included some unobservable variables identified in previous studies, it is highly possible that we did not include other unobservable confounding variables. If that is the case, we were not able to simulate random allocation with respect to all confounding variables. As a result, the study may be subject to omitted variable bias, which could lead to type I error. Secondly, this study only investigated proxy response bias. The extent of self-report bias is unknown. Thirdly, this study matched patient-reports to proxy-reports in a 1:1 ratio. Using 1:n matching could lead to higher bias, although it might increase the precision of the estimates [13].

The study has the following four strengths. First, survey respondents who cannot respond for themselves were included in the analysis. Secondly, cognitive impairments were included in the group of conditioning variables. According to the literature, cognitive impairments are the major reason for non-random allocation of survey respondents between proxy-reports and patient-reports. Thirdly, all the conditioning variables are independent of the survey. As a result, they are free from proxy response bias. Finally, in addition to Greedy 5-to-1 digit matching, this study used alternative matching techniques as a sensitivity analysis [14-16]. The presence, direction, and magnitude of proxy response bias under these techniques were similar to those obtained with the Greedy 5-to-1 digit matching used in the main analysis. Therefore, the results are robust to alternative matching techniques.

Conclusions

Proxy response bias was present in the physical, affective, cognitive, and social status domains but not in the sensory status domain. Specifically, proxies tended to report more health and functional limitations among the elderly or disabled population compared with patient-reports. The magnitude of proxy response bias was large in questions involving private information, unobservable factors, or complex answers. The presence, direction, and magnitude of proxy response bias remained largely the same across the different relationships between the subject and the proxy. When patient-reported outcomes are not available, researchers should accept proxy reports for sensory status and for objective, observable, or easy questions, but should use proxy reports with caution for physical, affective, cognitive, or social status and for private, unobservable, or complex questions.

The current study provides useful findings for survey organizations that wish to minimize proxy response bias. At the questionnaire development stage, objective, observable, or easy questions that do not call for judgments by proxies are preferred. At the survey execution stage, when the subject is unable to respond, interviewers should identify a proxy who is familiar with the questions being asked. The results of this study will also help researchers better use survey data. When using survey data obtained from proxies, researchers should describe possible effects of proxy response bias on study results.