The Association Between Adiposity and Inpatient Hospital Costs in the UK Biobank Cohort

Background High adiposity is associated with higher risks for a variety of adverse health outcomes, including higher rates of age-adjusted mortality and increased morbidity. This has important implications for the management of healthcare systems, since the endocrinal, cardiometabolic and other changes associated with increased adiposity may be associated with substantial healthcare costs. Methods We studied the association between various measures of adiposity and inpatient hospital costs through record linkage between UK Biobank and records of inpatient care in England and Wales. UK Biobank is a large prospective cohort study that aimed to recruit men and women aged between 40 and 69 from 2006 to 2010. We applied generalised linear models to cost per person year to estimate the marginal effect of adiposity, and average adjusted predicted costs of adiposity. Results Valid cost and body mass index (BMI) data from 457,689 participants were available for inferential analysis. Some 54.4% of individuals included in the analysis sample had positive inpatient healthcare costs over the period of follow-up. Median hospital costs per person-year of follow-up were £89, compared to mean costs of £481. Mean BMI overall was 27.4 kg/m2 (standard deviation 4.8). The marginal effect of a unit increase in BMI was £13.61 (99% confidence interval £12.60–£14.63) per person-year of follow up. The marginal effect of a standard deviation increase in BMI was £69.20 (99% confidence interval £64.98–£73.42). The marginal effect of becoming obese was £136.35 (99% confidence interval £124.62–£148.08). Average adjusted predicted inpatient hospital costs increased almost linearly when modelled using continuous measure of adiposity. Sensitivity analysis of different scenarios did not substantially change these conclusions, although there was some evidence of attenuation of the effects of adiposity when controlling for waist-hip ratios, and when individuals who self-reported any pre-existing conditions were excluded from analysis. Conclusions Higher adiposity is associated with higher inpatient hospital costs. Further scrutiny using causal inferential methods is warranted to establish if further public health investments are required to manage the large healthcare costs observationally associated with overweight and obesity. Electronic supplementary material The online version of this article (10.1007/s40258-018-0450-2) contains supplementary material, which is available to authorized users.

prevention and treatment of obesity. As Wang et al. [13] note, "A systematic understanding of the potential morbidity and cost implications of specified hypothetical changes in body-mass index trajectories…is crucial for formation of effective and cost-effective strategies, establishment of research and funding priorities, and creation of the political will to address the obesity epidemic."

Introduction
Body mass index (BMI)-weight divided by the square of standing height-is a widely used indicator of nutritional status, adiposity and overall health [1][2][3][4][5][6][7]. Longitudinal, crosscountry data reveal a recent increase in mean BMI levels for men and women, and increased variance in BMI [6]. Globally, more individuals are becoming obese (BMI ≥ 30 kg/m 2 ) than are transitioning out of underweight status (BMI ≤ 18.5 kg/m 2 ) [6,8], leading to marked increases in the absolute numbers of individuals in the former category. Some 2.1 billion individuals throughout the world were overweight (BMI > 25 kg/ m 2 ) or obese in 2013, an increase of 1.4 billion from 1980 [4]. This reflects a global prevalence of overweight or obesity amongst men of 28.8% and 29.8% amongst women [4].
The underlying aetiology between BMI and health is complex [9], but robust associations have nevertheless been found between higher BMI and risks for a variety of adverse health and social outcomes, including higher rates of age-adjusted mortality [10], increased morbidity [6,11] and poor labour market outcomes [12]. This has important implications for the management of healthcare systems [13], since the endocrinal [14], cardiometabolic [15,16] and other changes [13] associated with increased adiposity are associated with substantial healthcare resource requirements [17].
Cawley et al. [18] estimate that the direct medical care costs of obesity of US adults amounted to 27.5% of health expenditures for non-institutionalised US adults, and noted adult obesity has become more expensive over time in the USA due to both increased prevalence and higher treatment costs. Kent et al. [19] attribute 14.6% of total annual hospital costs in women aged 55-79 in England to overweight and obesity, with this figure rising to 52% for individuals with BMI > 40 kg/m 2 .
This article assesses the observational association between BMI and other measures of adiposity with admitted patient hospital costs using individual-level data from UK Biobank, a large prospective cohort study. This is the first study to develop cost estimates for UK Biobank, which will be returned to the cohort and freely available for other researchers to use. Observational analyses of the type deployed in this paper cannot answer causal questions, or identify the mechanisms that mediate the association between adiposity and healthcare costs, but do have utility in documenting policy-relevant associations, offering context for previous observational and causal studies, and providing evidence for meta-analysis.
A further advantage of this analysis is that the results may be triangulated [20] with other sources of evidence with potentially orthogonal sources of bias to come to an overall view about the association between BMI and healthcare costs. Together, diverse sources of evidence, of which the present analysis is one source, can be used to inform the cost-effectiveness of interventions and policies targeting the calculated in increments of 0.1 kg. A separate measure of adiposity was calculated from these impedance measures by dividing by the standing height of participants to create another index of body mass, also in terms of kg/m 2 .
These two measures of adiposity were highly concordant (Lin's rho = 0.99996, p < 0.00001) with a mean difference between traditional and impedance measures of − 0.01 (99% confidence interval − 0.12 to 0.10). Impedance-based BMI was therefore used where the traditional BMI measure was missing. A total of 685 observations had a mean difference between traditional and impedance-based measures of BMI of more than 5 standard deviations from the mean difference, and were excluded from the analysis.
Waist circumference (at the umbilicus) and hip circumference were measured in centimetres using a Wessex sprung tape measure. A waist-hip ratio (WHR) was calculated from these measurements to reflect central adiposity [23,24]. We estimated separate models including WHR as a continuous variable and as a binary variable. The binary variable was constructed to reflect measures of regional adiposity using a ratio > 0.85 for women and > 0.9 for men. It is important to note that while these binary cut-off points are used in a number of countries, there is no widely accepted definition of these cut-offs, despite the frequent attribution of these figures to WHO reports (see [25] and particularly the discussion in [26]). Nevertheless, this classification has utility as a comparison to the binary cut-off of obesity defined for the conventional BMI measures, as well to published literature using these or similar classifications for WHR measures.
As a sensitivity analysis, we estimated models of the continuous WHR outcome using continuous BMI as a covariate, and vice versa. These models estimated, respectively, BMIadjusted WHR, and WHR-adjusted BMI outcomes conditional on all baseline covariates, and each model assumes that WHR is not on the causal pathway between BMI and hospital costs and vice versa.

Measurement of Costs
Admitted patient care episodes, sometimes referred to as inpatient care episodes, were obtained from linked Hospital Episode Statistics (HES) for English care providers and from Patient Episode Database for Wales (PEDW) for Welsh providers. These two data sources are very similar in structure and purpose, and were analysed as a single dataset following reconciliation where required. Inpatients are those who are admitted to hospital and occupy a hospital bed but need not necessarily stay overnight (i.e. day case care).
Episodes of care refer to a period of care in which a patient is under the care of a single consultant working at a single hospital provider. An admission may comprise just one "Finished Consultant Episode" (FCE), or many such episodes if the patient receives care from more than one consultant.
Each FCE is associated with information on the patient, and the consultant and hospital overseeing their care. For example, FCEs are associated with procedures undertaken (based on OPCS-4 procedure codes) and diagnoses made (based on ICD-10 diagnosis codes [27]). These FCEs were converted into Healthcare Resource Groups (HRGs) by using data on patient characteristics, length of stay, procedures, diagnoses, and other information by using NHS "Grouper" software [28].
HRGs reflect groups of similar patient activity, and are used in England and Wales as the basis for case-mixadjusted remuneration of publicly funded NHS hospitals. In turn, these HRGs were cross-classified against NHS Reference unit costs, allowing costs to be assigned to each FCE. These costs were calculated to reflect differences between categories of care included in inpatient care, such as elective and non-elective episodes, and accounted for the additional costs associated with patients undergoing long hospital stays. Costs were separately calculated for so-called "unbundled" elements of care, such as diagnostic imaging, for which elements of cost and care activity are represented separately from other elements of inpatient care.
Due to differences in the collection and valuation of care in Scottish hospitals compared to hospitals in England and Wales, only costs from the latter two jurisdictions are included in this analysis. Costs were calculated for episodes of care occurring on or after 1 April 2006 to reflect major changes to the hospital payment system that came into effect on that date [29]. Participants who attended UK Biobank baseline appointments before this date were removed from the analysis.
Remaining cohort members therefore report person-years from recruitment, and contribute data to HES until death, emigration, or 31 March 2015, the current censoring date for linked HES data. Note that the event of death does not constitute censoring in this analysis, since the number of allcare episodes and associated costs is known following death. Emigration out of the UK amongst the cohort is reported to be a modest 0.3% [21] and is not accounted for in our analysis. We do not have information on patients who may have received inpatient care in Scottish hospitals while living in England or Wales. Costs were converted into a per-patient cost by summing across all episode costs per person-year of follow-up in constant 2016/17 prices.

Covariates
All models adjusted for covariates obtained from participant responses at the baseline appointment at UK Biobank assessment centres. These covariates were sex, age at recruitment, days per week of vigorous physical activity (categorical ranging from 0-7), frequency of alcohol intake (categorical ranging from "daily or almost daily" to "never"), educational and professional qualifications (categorical), employment status (categorical, ranging from "in paid employment or self-employed" to "full or part-time student"), and a measure of deprivation calculated using the Townsend score converted into quintiles.
To assess the effects of residual confounding and of reverse causality from pre-diagnostic or prodromal disease [7], we conducted sensitivity analysis restricting the analysis to include only "never smokers" (n = 249,423) and including only individuals reporting no pre-existing illness at baseline (n = 106,388). Smoking status is self-reported and may exhibit measurement error as a consequence [30]. It is possible that focusing only on never smokers (who are probably less likely to misreport their smoking status than current or former status) may avoid some of the confounding that may otherwise be present in the BMI-hospital cost association. However, the causal relationship between BMI, smoking and healthcare costs is complicated and very likely bidirectional [31,32]. It is important to emphasise that accounting for residual confounding and reverse causality in this way is necessarily incomplete given the observational study design.
Including only those individuals with no health conditions at baseline may partially capture the impact of reverse causality if individuals with many pre-existing conditions are on treatment regimens that raise or lower BMI, so that the direction of causality runs from costs to BMI. Again, this is a partial attempt at accounting for reverse causality and reflects only baseline information that was accurately self-reported.
Two covariates (the exercise variable and the qualifications variable) had slightly higher rates of missing data than others, and we conducted a post hoc sensitivity analysis excluding these two variables from the adjusted regression models.

Statistical Methods
The primary objective of our analysis was to predict total admitted patient hospital costs as a function of the conditional marginal effect of body mass on healthcare costs. Marginal effects for categorical and binary adiposity-related variables refer to the effects of a change in category, e.g. from "overweight" to "obese". Marginal effects for continuous outcomes were calculated for specific changes-unit changes and standard deviation changes-in the adiposityrelated variable. The magnitude of these effects depends on the values of other covariates included in the model-these values of other covariates were left at their observed values to calculate an average marginal effect.
The average adjusted predictions that follow from this calculation adjust for variation in these other covariates. These predicted costs therefore do not necessarily reflect the hypothetical costs of an "average" individual, nor necessarily the expected costs for any specific individual in the sample. Average adjusted predicted costs were calculated for the entire analysis sample, for the samples defined in sensitivity analyses, and at representative ages stratified by sex.
Cost is always non-negative, the modal value is often zero, and the distribution of healthcare costs tends to be skewed with a long right tail reflecting very high expenditures incurred by a small number of individuals. Linear models may predict negative costs for some individuals, may not estimate mean effects without bias in the presence of long tails, and are likely to be inefficient in the presence of heteroskedasticity. Logarithmic transformations can address skewness to some extent but are not well suited for handling zero cost observations, and require transformations of regression output back to the original scale, a process which itself may cause bias in the presence of heteroskedasticity [33].
Generalised linear models (GLM) were therefore used to estimated average adjusted predicted costs: rather than modelling the conditional expectation of the logarithm of cost, the logarithm of the conditional expectation of cost was estimated, and the relationship of the mean to variance in the outcome data was assessed and modelled. A so called "single-index model" was used, in which the zero-cost and positive-cost outcomes were modelled with a single density function. As Deb et al. [34], this approach is appropriate when the object of inferential interest is related to the mean function of the outcome, conditional on covariates, such as the marginal effect of an adiposity-related covariate on inpatient hospital cost outcomes. However, in sensitivity analyses, we also estimated models under a two-part model which decomposed the density of the outcome into a mixture of two parts: one that models the zero cost observations (estimated with a logit model), and one that models the non-zero cost observations using a GLM structure.
Two tests, neither necessarily definitive, were used to identify an appropriate link function for all models. Box-Cox tests on positive cost observations were used to identify scalar powers that resulted in the most symmetric transformed distribution. A Pregibon-type [35] link test was performed to check whether including the squared outcome variable as a covariate had explanatory value above models excluding this term. To inform the choice of the family function, a modified version of the Park test was used to characterise the relationship between the variance of the error term and the expected value of the cost outcome variable. All models use robust standard errors to allow for potential mis-specification of the link and family functions.
All analyses were conducted in Stata version 15.1 (Stata-Corp, College Station, Texas).

Results
Following exclusion of data as described above, records from up to 457,689 participants were available for the inferential analysis (Fig. 1), of whom 54.4% at baseline were female (Table 1). Person-years of follow-up ranged from 4.5 to 9 years (mean and median = 6.1 years). Mean age at baseline for women was 56.3 years, and 56.7 years for men. Median hospital costs per person-year of followup were £89, compared to mean costs of £481, reflecting the highly skewed distribution that is characteristic of this type of cost data.
Some 54.4% of individuals included in the analysis sample had positive healthcare costs over the period of follow-up. Mean BMI overall was 27.4 kg/m 2 [standard deviation (SD) 4.8]; BMI amongst men was 27.8 kg/ m 2 (SD 4.2) and amongst women 27.1 kg/m 2 (SD 5.2). The proportion of participants with BMI > 30 kg/m 2 was 24.4%, with more men (25.3%) than women (23.5%) meeting the definition of obesity. This is slightly lower than the prevalence of obesity in England in 2010 (the last year of recruitment to the study) amongst all adults of 26.1% (26.2% males, 26.1% females) [36].
There was limited missingness (< 0.1%) amongst baseline covariates used in the adjusted analysis, with the exception of the exercise variable (5.4% coded as missing) and the qualifications variable (1.9% missing). Tests of family and link functions for the GLM supported the use of a Gamma family with logarithmic link when modelling the cost outcome. Further details on the results of these tests are provided in supplementary material.
All models found a positive association between measures of adiposity and hospital costs ( Table 2)  Resource Group. This is typically where the primary diagnosis was not specific and was coded to an unspecified injury (or similar) classification cost outcome for specific sets of values for the predictor variables, for example, the effect of a standard deviation increase in BMI.
Marginal effects for continuous outcomes require the specification of a particular "margin" of change, but otherwise have a similar interpretation of the marginal effects of categorical outcomes. For example, the effect of a one standard deviation increase in BMI is to increase in inpatient hospital costs by £69.20 (99% CI £64.98-£73.42).
The relative effect of a unit change in continuous BMI amounts to approximately 2.9% (99% CI 2.7-3.2) in increased costs per person-year of follow-up. Figure 2 displays average adjusted predicted costs at representative values of continuous BMI, capturing predictions at unit increments of BMI from small to the large BMI values. Figure 3 presents the same outcome according to age, and stratified by sex to examine whether these variables may influence inpatient costs. The point estimates and CIs are similar, and no difference in sex is apparent at any age. However, there is a clear gradient with respect to age, with older members of the cohort predicted to have higher inpatient costs. Figure 4 summarises average marginal effects for the categorical BMI outcome. There is some evidence of a J-shaped relationship in point estimates for lower levels of categorical BMI, although this evidence is weak given the widths of the CIs around these estimates.
Three of the five specifications assessed in sensitivity analysis were very similar to the base-case predictions (Fig. 5). The effect of BMI attenuates slightly when controlling for WHR. Including only never-smokers slightly reduced the estimated marginal effect to £13.10 (99% CI £11.77-£14.43), and predicted costs are similar to the base model near mean BMI. Excluding individuals who reported no pre-existing health conditions had the largest impact of all scenarios assessed, reducing the marginal effect of a unit change of BMI to £5.51 (99% CI £4.13-£6.89). Supplementary material describes average adjusted predictions over the range of BMI for these sensitivity analyses. Results from sensitivity analyses of the other outcomes reported in Table 2 were similar.

Discussion
Higher rates of adiposity are associated with higher inpatient hospital costs in this cohort of middle-aged and older individuals. This is broadly consistent with findings from studies with similar designs. A systematic review [37] of the association between total annual healthcare costs and BMI in studies using individual participant data found progressive increases in annual hospital costs for increasing levels of obesity, a finding demonstrated here in all models and for all outcomes. The systematic review also reported that overweight and obesity increased median inpatient hospital costs relative to costs associated with BMI of 18.5 kg/m 2 to < 25 kg/m 2 by 12% and 34%, respectively. These figures are similar to those reported here, for which the comparable relative increases in cost for BMI 25 kg/m 2 to < 27.5 kg/m 2 relative to 22.5 to 25 kg/ m 2 of 16%, and 31% for BMI ≥ 30 kg/m 2 relative to 22.5 to < 25 kg/m 2 . Korda et al. [38] studied hospital costs amongst 224,254 Australian adults aged at least 45 years, and found an increase in costs of between 14 and 24% (depending on category of age) for BMI between 27.5-30 kg/m 2 relative to 22.5-< 25 kg/m 2 , rising to 77% to 115% for BMI between 40-50 kg/m 2 . The comparable increase in costs for individuals of all ages in UK Biobank ≥ 40 kg/m 2 is 78%. The findings are also broadly similar to the US study of Andreyeva et al. [39].
Marginal effects for the continuous BMI outcome are similar to the observational estimates in Cawley et al. [18] for US adults of $49 in 2005 US dollars per unit change in BMI, although their estimate of the marginal effect of obesity ($656) is much higher than the effect reported in Table 2 for the UK Biobank cohort. Predicted costs and marginal effects are estimated to be somewhat lower than in the large prospective women-only study of Kent et al. [19], perhaps reflecting the relatively "healthier and wealthier" composition of participants in UK Biobank.

Strengths and Limitations
This analysis benefitted from access to a large volume of patient level data obtained from a prospective cohort study with comprehensive information on a variety of baseline characteristics. Measures of BMI and related variables were obtained by research staff, and the errors and biases associated with self-report of weight and height were avoided [40]. The use of traditional and impedance-based measures of BMI offered a degree of further validation for the core exposure. The analysis included both continuous and categorical/ binary outcomes of weight-related measures, in contrast to much of the literature on these associations, which is generally restricted to categorical or binary classifications of BMI. The analysis has a number of limitations. The analysis presented in this paper is observational and cannot reveal the causal association between measures of adiposity and inpatient costs. For example, the association between costs and BMI may be confounded by the effect of unmeasured baseline health conditions that affect both BMI and hospital costs. Adjustment for baseline pre-existing conditions in sensitivity analysis revealed a relatively large moderating effect, but only captures known health conditions that were accurately reported at baseline.
The results could therefore be affected by some degree of reverse causality, the precise degree of which cannot be uncovered in this type of observational analysis. The results from excluding individuals without pre-existing conditions was suggestive-effect sizes attenuated but did not encompass the null. Nevertheless, this is necessarily a partial and incomplete test of reverse causality.
A further channel for costs to influence BMI may be treatment regimens that affect weight (such as selective serotonin reuptake inhibitors for depression), or treatment decisions that used BMI thresholds (such as for bariatric surgery) as a basis for intervention. However, observing these associations may give the appearance of the direction of causality running from healthcare costs to BMI, but in reality, are better conceived of as further instances of confounding (by depression in the first example) rather than an actual causal effect on the level of BMI from the level of costs per se.
Methods for causal inference, such as Mendelian Randomization-the use of germline genetic variants as instrument variables for exposures such as BMI-or the use of BMI of a biological relative as an instrument for own BMI, offer promise in this area as a means to overcome confounding [41].
Complementary study designs and triangulation [20] between these different designs will almost certainly be necessary even after causal analyses of the adiposity/cost relationship.
Biobank cohort participants differ from the general population: they are more likely to be older, female and university graduates. They are less likely to be deprived, obese, to smoke, to drink alcohol on a daily-basis, or to have selfreported health conditions. There is evidence of a selection bias from a "healthy volunteer" effect [21]. The rate of all-cause mortality is 46% lower in the cohort than in the   population from which it was drawn [22]. This may induce associations between outcomes and participant characteristics when none exist in truth due to the operation of collider bias caused by the self-selection of a healthier and more educated population into the cohort [42]. This is likely to impair the representativeness and generalisability of the findings of this study to the wider UK population from which the cohort is drawn. The direction of the bias that this imparts is unknown. However, it is plausible that the direction of bias may be downwards, suggesting that the estimates presented here are likely to be an underestimate of the effect of BMI on healthcare costs. This would occur if, for example, the healthy behaviours and favourable circumstances of cohort participants (relative to the general population) are related to unmeasured variables that themselves tend to mitigate the consequences of higher BMI. Robust causal methods are required to resolve this uncertainty.
Modelling cost outcomes is challenging, not least because methods to identify appropriate link and family functions in GLM have limitations and require a degree of contextual interpretation. Mis-specification is possible despite robust standard errors being used for all models. The costs assigned to each individual episode of care are based on HRGs that are essentially averages of costs of types of care, and may conceal individual variation in actual resources used.
Analysis did not encompass all healthcare costs. Primary care data are not linked to UK Biobank, and hospital data are restricted to inpatient care, although Kent et al. [19] report that 30-50% of total overweight and obesity attributable costs relate to inpatient care. It is also probable that many costs will be correlated with the same direction of effect across different categories of provider care, albeit that the scale of effects is likely to differ. This question requires linked data that are not available for the UK Biobank cohort. The remuneration of Scottish hospitals differs from that in England and Wales, and Scottish hospitalisation data could not be combined with those from the latter two jurisdictions.
The dataset includes private patients treated in NHS hospitals, but not private patients treated elsewhere. Approximately 37% of the cohort answered a question on use of private health insurance, of whom 2.4% self-report exclusive use of private healthcare. Nevertheless, 39% of individuals reporting exclusive use of private healthcare have non-zero NHS inpatient costs. Further data on reported use of private healthcare in the cohort are presented in Supplementary material.

Conclusion
BMI is a potentially modifiable risk factor for a variety of healthcare conditions. Evidence from analysis of the UK Biobank is consistent with findings from other cohorts in demonstrating a robust association between BMI and inpatient hospital costs. However, evidence from randomised study designs and/or valid causal inference on observational data is required to validate these findings and to inform public policy toward excess adiposity.

Data Availability Statement
The terms of access to UK Biobank do not permit data sharing by third parties. However, data used in this study are available via application to UK Biobank. Code used in the analysis is available from: github.com/pdixon-econ.