Introduction

The rapidly increasing prevalence of obesity, and the design of interventions to prevent it, have become major health challenges worldwide (Reilly et al. 2018; Sassi 2010; OECD 2017, 2018). The ways that health outcomes, including the risk of being obese, are physically and socially shaped by places are usually described as “neighbourhood or contextual effects” (Bambra et al. 2019; Elliott 2018). Evidence of these effects is used to establish that context matters. If health outcomes are, at least in some regard, location dependent, then “one-size-fits-all” prevention strategies should be avoided in favour of strategies tailored to individuals and their context. According to the second Law of Geography (Goodchild 2004), “geographic variables exhibit uncontrolled variance”, which is the principle of spatial heterogeneity. Associations may therefore vary over space, particularly over large or complex geographical areas. Given that context is place-based, the expectation is that the predictors of obesity may differ, or differ in the size of their effects, from one place to another, suggesting the need for geographically tailored interventions.

However, these geographical variations are not always evident. Schuurman et al. (2009) found no evidence of significant spatial clustering of obesity at either the city or neighbourhood level of their analysis across eight suburban neighbourhoods in Metro Vancouver. Moreover, finding geographical variation in a health outcome does not necessarily mean that the predictors of that outcome also vary from place to place. Geographically weighted regression, which allows the functional relationship between dependent and independent variables to change with location, failed to improve significantly on the global model in Schuurman et al.’s study, suggesting that the relationship is spatially stationary (the same model could be used with the same broad predictive accuracy anywhere within the study region).

Many studies of neighbourhood effects on obesity risk are limited by potential ecological fallacies because they consider only aggregated, neighbourhood-level variables. Sun et al. (2020b) adopted spatial regression models to explore to what extent spatial inequalities in childhood obesity are attributable to spatial inequalities in socioeconomic characteristics in England using aggregated data. Their analysis indicates positive spatial dependence for childhood obesity prevalence as well as significant associations with aggregated socioeconomic variables across England (Sun et al. 2020b). Similarly, Neelon et al. (2017) used the national prevalence of childhood obesity rather than individual-level health outcomes, adopting a multinomial logistic regression and a geographically weighted logistic regression to examine the association between food insecurity and obesity by area-level deprivation in children across England. Area deprivation was associated with the perceptions of obesity and food insecurity. Importantly, the influence of area deprivation on obesity and food insecurity varied over space in England (Jia et al. 2017).

In contrast to the literature that emphasizes the essential influence of geographical context on obesity (Chi et al. 2013; Ha and Xu 2022; Huang 2021; Jia et al. 2017; Lee et al. 2019; Oshan et al. 2020; Qin et al. 2019; Shahid and Bertazzon 2015), this paper reaches the opposite conclusion: there is no evidence either of any substantial geographical patterning to where more and less obese participants in the UK Biobank data live, or of geographical variation in the predictors of obesity at a more regional scale. We employ data from individuals aged 40–69 years between 2006 and 2010 who participated in a large-scale health survey in Great Britain (UK Biobank 2022). This conclusion is contrary to our expectations and is reached despite using methods that we assumed would reveal geographical variation. Most studies examining the neighbourhood effect on obesity use various types of regression methods, including Poisson regression (Lovasi et al. 2013; Nguyen et al. 2021), logistic regression (Alexander et al. 2013; Dadvand et al. 2016; Rossi et al. 2019) and multilevel models (Gilliland et al. 2012; Robinson et al. 2021).

We use multilevel models in the first stage of the analysis to decompose the variation in body mass index (BMI) values and quantify how much of it is linked to the various geographical scales of the model, a step that was neglected in another study that also looks for geographical heterogeneity using data from the UK Biobank (Mason et al. 2021). It is essential to explore the potential geographical clustering of obesity before exploring the potential spatially varying effects of obesity predictors, and this study does so. Identifying spatial clustering patterns is a first step towards understanding the spatial autocorrelation of obesity. Both spatial dependence and spatial heterogeneity require attention when modelling dynamic spatial data (Anselin 1988). Understanding the patterns of spatial variation helps to incorporate the variability of related characteristics into model construction (Jacquez 2008), which benefits the further investigation of spatial heterogeneity.

At the second stage, the paper turns to whether the main predictors of obesity, such as dietary habits and physical activities, vary geographically for those measured at the different assessment centres at which the Biobank data are collected. Other studies that focus on the possibility of spatial dependencies and/or variations have adopted geographically weighted regression (Fraser et al. 2012; Oshan et al. 2020) or spatial regression models (Leonard et al. 2014; Bonnet et al. 2022). Here we adopt a different approach that is rooted in the growth of geographic data science and its interest in machine learning (Andrienko et al. 2017; Calafiore et al. 2021; Gahegan 2000; Singleton and Arribas-Bel 2021). Compared with traditional regression approaches, machine learning approaches are efficient and powerful in handling large, high-dimensional datasets, as they are data-driven and recognize potential patterns within the data themselves (Cracknell and Reading 2014; Feizizadeh et al. 2023). It is essential in spatial analysis to develop data-adaptive, non-linear and multivariate models based on high-dimensional spatial datasets (Kanevski et al. 2008, 2009), which fits well with the scope of machine learning approaches (Du et al. 2020). Machine learning approaches usually involve training a model on a random subset of the complete data and then assessing how it performs on the remaining data. The potential problem with such an approach is that it is not geographical if the training strategy does not allow for geographical variation in what the model is being trained for.

Additionally, traditional machine learning methods rely on statistical probability principles, assuming data are independently and identically distributed (L’heureux et al. 2017); this assumption does not hold for datasets where there is spatial heterogeneity. We therefore adopt a more directly spatial training and prediction strategy: random forest models are trained in one city to predict individual BMI values in other cities to examine the possibility of spatially varying predictors of obesity. In detail, the degree of heterogeneity of the predictors is assessed through general evaluations, including R-squared, variable importance and coefficient estimates, of the models’ predictive performance from city to city. Previous studies that also employed the UK Biobank data mainly used regression methods to explore the associations between risk factors and obesity. This paper is, to our knowledge, the first study to use machine learning methods to examine predictors of obesity and whether they vary regionally across England. Finally, we used multilevel models, including random intercept and random slope models, as a further check on whether the effects of socio-economic status, at both the individual and neighbourhood deprivation levels, vary across England. The adoption of multilevel models allows neighbourhood measures of deprivation to be taken into consideration.

Data and Method

Study Area and Dataset

The UK Biobank survey is a large prospective cross-sectional, observational cohort study (UK Biobank 2022). Between 2006 and 2010 it recruited 502,656 volunteer participants, mainly aged 40–69 years, who visited one of the 22 assessment centres throughout the UK. It has collected a wide range of individual-level data about those participants, including demographic characteristics, lifestyle habits, socio-economic status, mental health status and neighbourhood environments. These data come from the baseline assessment, which is used in this study. Those registered with the National Health Service and living within 25 miles of the 22 UK assessment centres were invited to participate.

To explore the geography of participation, we used kernel density estimation (KDE) to evaluate the participation density across Great Britain (Fig. 1). The exact coordinates of each participant’s home address are not known to us, as these have been degraded to a 1 km by 1 km geography to protect participants’ privacy and avoid individual data disclosure (UK Biobank 2022). It should, however, be noted that although KDE reveals “the density” of where participants live, it does not, in its simple form, control for the size of the underlying population. In other words, it reveals incidences, not rates. Residents living in the areas where buffers overlap have more than one chance of receiving an invitation to participate. Generally, high densities of respondents are concentrated in major cities and towns, that is, closer to the assessment centres.

Fig. 1 Kernel density estimation of participation density with buffers drawn at distances of 25 miles from the UK Biobank assessment centres
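The distinction between incidences and rates can be made concrete with a minimal sketch of a two-dimensional Gaussian KDE. This is an illustration only, not the procedure used to produce Fig. 1: the function name, the bandwidth and the coordinates are all hypothetical. The estimate sums one kernel per participant location, so it counts where participants live without dividing by the resident population.

```python
import math

def gaussian_kde_2d(points, query, bandwidth_km):
    """Gaussian kernel density at `query`, in points per square km."""
    qx, qy = query
    h2 = bandwidth_km ** 2
    total = 0.0
    for x, y in points:
        d2 = (x - qx) ** 2 + (y - qy) ** 2
        total += math.exp(-d2 / (2.0 * h2))
    # Normalising constant of a 2-D Gaussian kernel: 1 / (2 * pi * h^2)
    return total / (2.0 * math.pi * h2)

# Hypothetical 1 km grid-cell centroids in km (not UK Biobank data)
homes = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (30.0, 30.0)]
near_cluster = gaussian_kde_2d(homes, (0.5, 0.5), bandwidth_km=2.0)
far_away = gaussian_kde_2d(homes, (15.0, 15.0), bandwidth_km=2.0)
```

Turning such a surface into a rate would require dividing it by a comparable density surface for the underlying population, which is not attempted here.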

A total of six cities in England (Bristol, Birmingham, London, Newcastle, Nottingham and Manchester) were chosen as the study areas because each is located in a different region of England and they have relatively varied contexts, such as population demographics, urban planning, transportation and behavioural habits. Conducting the spatial analysis, including investigating whether predictive models of obesity are spatially transferable and whether the associations between risk factors and obesity vary spatially across these cities, is therefore feasible and meaningful. The high density of UK Biobank participation in these cities (Fig. 1) helps to increase the reliability of our findings.

A range of data were considered when applying the machine learning models. These data can be grouped into four domains of predictors (local environment exposures, interpersonal lifestyle habits, socio-economic status and demographic characteristics), with continuous log BMI values as the health outcome. The multilevel models focused on the spatially varying effect of socio-economic status and incorporated demographic characteristics to predict the risk of obesity.

Local Environment Exposures

UK Biobank provides neighbourhood environment measures of greenspace exposure, distance to the coast and traffic scores. Linked to the UK Biobank is the UK Biobank Urban Morphometric Platform (UKBUMP), which provides a high-resolution spatial database based on morphometric analysis of the built environment (Sarkar et al. 2015). The proximity to various health-relevant destinations (GP practices, hospitals and fast-food outlets) was adopted in the study. Another complementary dataset, the index of Access to Health Assets and Hazards (AHAH), is a multi-dimensional index developed to measure how “healthy” neighbourhoods are (Green et al. 2018). Three variables in AHAH, namely accessibility to fast food outlets, accessibility to pubs and accessibility to tobacco stores, were linked to the risk of obesity.

Interpersonal Lifestyle Habits

Interpersonal lifestyle habits involve predictors of dietary habits, physical activities, drinking habits, smoking status, sleeping patterns and TV-watching habits. Predictors in the dietary habit domain were collected by questionnaires and interviews about the intake and frequency of eating various types of food as well as dietary patterns. Physical activity is represented by usual walking pace, which was reported through the short-form international physical activity questionnaire. Sleeping patterns include whether participants snore and the duration of sleep. Drinking habits were first quantified in drinking units and then classified into five groups according to previous literature (Perreault et al. 2017): (1) never drinking; (2) previous drinking; (3) within the UK low-risk drinking guidelines (< 14 (women); < 21 (men)); (4) hazardous drinking (14–35 (women); 21–49 (men)) and (5) harmful drinking (> 35 (women); > 49 (men)). Smoking status was coded as non-smoker, previous smoker and current smoker. The time spent watching TV per day represented sedentary habits.
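The five drinking groups can be written as a small classification rule. The sketch below is an illustration under stated assumptions rather than the authors' code: the function name and the `status` and `sex` codings are hypothetical, and the thresholds are assumed to be units per week, which the text does not state explicitly.

```python
def classify_drinking(status, weekly_units, sex):
    """Return one of the five drinking groups described in the text.

    status: "never", "previous" or "current" (hypothetical coding).
    weekly_units: alcohol units per week (assumed time base).
    sex: "female" or "male" (hypothetical coding).
    """
    if status == "never":
        return "never drinking"
    if status == "previous":
        return "previous drinking"
    # Thresholds from the text: 14 and 35 for women, 21 and 49 for men
    low, high = (14, 35) if sex == "female" else (21, 49)
    if weekly_units < low:
        return "within guidelines"
    if weekly_units <= high:
        return "hazardous drinking"
    return "harmful drinking"
```

For example, a woman reporting exactly 14 units falls in the hazardous group, because the low-risk band is defined as strictly fewer than 14 units.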

Socio-economic Status

Deprivation levels were divided into individual deprivation levels and neighbourhood deprivation levels. Individual deprivation was represented directly by household income, which was reported through touchscreen questionnaires. Neighbourhood deprivation levels included the Index of Multiple Deprivation (IMD) and the Townsend deprivation index, using the release closest to the year of each participant’s recruitment (UK Biobank 2022). The higher the IMD value, the more deprived the area. For the Townsend deprivation index, positive values indicate areas with high material deprivation, while negative values indicate relative affluence. A score of 0 represents an area with an overall average deprivation value.

Demographic Characteristics

Age, gender, ethnicity and family status provide the broad characteristics of these participants, which were obtained through touchscreen questionnaires in the UK Biobank Survey. UK Biobank gathered information about family status, including cohabitation status, marital divorce status and a spouse’s death. Participants who answered “Husband, wife or partner” in response to the question “How are the other people who live with you related to you?” were classified as “living with a spouse or partner”. Participants also answered questions about whether they had experienced the marital separation/divorce and death of a spouse/partner in the last 2 years.

Outcomes: Body Mass Index (BMI, kg/m²)

Body mass index (BMI, kg/m²) values were provided directly by the UK Biobank baseline data and were calculated from height and weight measured during the baseline assessment centre visit. Log BMI values rather than absolute BMI values are treated as the continuous health outcome to offset the observed positive skewness of BMI values.
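The effect of the log transformation on skewness can be illustrated with a short sketch. The BMI values below are hypothetical and chosen only to have a long right tail, and the natural logarithm is an assumption, since the text does not state the base.

```python
import math

def bmi(weight_kg, height_m):
    """Body mass index in kg per square metre."""
    return weight_kg / height_m ** 2

def skewness(values):
    """Population (moment) skewness: m3 / m2 ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical BMI values with a long right tail (not UK Biobank data)
bmi_values = [20.0, 22.0, 24.0, 26.0, 28.0, 30.0, 45.0]
log_bmi = [math.log(v) for v in bmi_values]  # natural log, assumed

raw_skew = skewness(bmi_values)
log_skew = skewness(log_bmi)
```

In this toy sample the skewness of the logged values is smaller than that of the raw values, which is the behaviour the transformation is meant to achieve.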

Descriptive Sample Characteristics

This is a demographically specific subgroup analysis of 117,108 UK Biobank participants aged 45 to 72 years living in or near one of the six cities (Table 1). Their mean age is 56 years, 51% are male and 87.5% are of white ethnic background. The average BMI is 27.16 kg/m². The NHS follows the World Health Organization’s guidance in defining obesity as a BMI of 30 or above (NHS 2017; WHO 2018). Adult obesity prevalence in England ranged from approximately 24 to 26% between 2006 and 2010, the baseline data collection period of UK Biobank. In this study, the obesity prevalence across cities ranges from 13.1 to 29.9%, with a mean of 22.93%.

Table 1 Summary of sample characteristics (N = 117,108)

Data Analysis

This study follows a multi-stage analysis strategy, with the detailed methodological flow chart shown in Fig. 2. There are three main stages. The first stage examined the existence of spatial clustering patterns of BMI values using multilevel null models. In the second stage, random forest models were trained in one city to predict other cities to investigate whether the models are spatially transferable and whether key predictors of obesity have spatially varying effects. Finally, we focused on the spatially varying effect of socio-economic status by constructing multilevel models across the six cities.

Fig. 2 The methodological flow chart

At the first stage, multilevel null models are used to examine the existence of spatial patterns of log BMI values at various geographical scales. In the UK Biobank, participants are nested into coordinates, which are nested into Middle Layer Super Output Areas (MSOA), which are nested into assessment centres (cities). For all cities apart from London (three assessment centres), there is a one-to-one match between cities and assessment centres. This study considered a total of six cities (eight assessment centres): Bristol, Birmingham, London, Newcastle, Nottingham and Manchester. Different two-level, three-level and four-level multilevel null models were constructed (some combination of city-MSOA-coordinate-id). The variance partition coefficient (VPC), calculated in null models without any predictor variables (just the response variable, log BMI), was used to explore to what extent the variations in individual BMI values cluster geographically. Higher VPC values reveal stronger clustering of log BMI values at the geographical level concerned. For a two-level model of individuals nested into geographical coordinates, the standard formula for calculating VPC is formula (1) below.

$$\mathrm{VPC}_{\mathrm{u}}=\frac{\sigma_{\mathrm{u}}^{2}}{\sigma_{\mathrm{e}}^{2}+\sigma_{\mathrm{u}}^{2}}$$
(1)

where \({\sigma }_{\mathrm{u}}^{2}\) is the level 2 (coordinate level) variance, \({\sigma }_{\mathrm{e}}^{2}\) is the level 1 (individual level) variance and \({\mathrm{VPC}}_{\mathrm{u}}\) is the coordinate level VPC (Merlo et al. 2005)

For a three-level model that considers the hierarchical geographical structure of individuals nested into coordinates, which are then nested into MSOA, the standard formula for calculating VPC at the MSOA level is formula (2) below.

$$\mathrm{VPC}_{\mathrm{MSOA}}=\frac{\sigma_{\mathrm{MSOA}}^{2}}{\sigma_{\mathrm{e}}^{2}+\sigma_{\mathrm{u}}^{2}+\sigma_{\mathrm{MSOA}}^{2}}$$
(2)

where \({\sigma }_{\mathrm{MSOA}}^{2}\) is the level 3 (MSOA level) variance, \({\sigma }_{\mathrm{u}}^{2}\) is the level 2 (coordinate level) variance, \({\sigma }_{\mathrm{e}}^{2}\) is the level 1 (individual level) variance and \({\mathrm{VPC}}_{\mathrm{MSOA}}\) is the MSOA-level VPC (Merlo et al. 2005)

For a four-level model of individuals nested into coordinates, MSOA and cities, the standard formula for calculating VPC at the city level is formula (3) below.

$$\mathrm{VPC}_{\mathrm{city}}=\frac{\sigma_{\mathrm{city}}^{2}}{\sigma_{\mathrm{e}}^{2}+\sigma_{\mathrm{u}}^{2}+\sigma_{\mathrm{MSOA}}^{2}+\sigma_{\mathrm{city}}^{2}}$$
(3)

where \({\sigma }_{\mathrm{city}}^{2}\) is the level 4 (city level) variance, \({\sigma }_{\mathrm{MSOA}}^{2}\) is the level 3 (MSOA level) variance, \({\sigma }_{\mathrm{u}}^{2}\) is the level 2 (coordinate level) variance, \({\sigma }_{\mathrm{e}}^{2}\) is the level 1 (individual level) variance and \({\mathrm{VPC}}_{\mathrm{city}}\) is the city-level VPC (Merlo et al. 2005)
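Formulas (1) to (3) share one pattern: the variance at the level of interest divided by the sum of all variance components in the model. A minimal sketch of that arithmetic follows; the function name and the variance values are hypothetical and do not reproduce any estimate reported in this paper.

```python
def vpc(variances, level):
    """Share of total variance attributable to `level`.

    `variances` maps each level name to its estimated variance
    component, including the individual-level residual variance.
    """
    total = sum(variances.values())
    return variances[level] / total

# Hypothetical variance components, chosen only to show the arithmetic
components = {
    "individual": 0.0270,  # sigma_e^2
    "coordinate": 0.0020,  # sigma_u^2
    "msoa": 0.0008,        # sigma_MSOA^2
    "city": 0.0002,        # sigma_city^2
}
vpc_city = vpc(components, "city")
all_vpcs = {level: vpc(components, level) for level in components}
```

Because every VPC uses the same denominator, the values across all levels of one model sum to one, so a large individual-level share necessarily leaves little variance for the geographical levels.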

Having used the multilevel null models to look for geographical patterning in the log BMI values, we next turn to the machine learning approach to explore any spatially varying influence of predictors and whether trained models are spatially transferable. The flow chart of the entire procedure for constructing random forest models in all six cities is shown in the Appendix as Fig. S2. Because the procedure is similar and repeated across cities, the training of unweighted random forest models in Birmingham and Nottingham is shown as an example in Fig. 3. First, a random forest model was trained in one city and then used to predict continuous log BMI values in another city. For instance, we trained a model using the London data and then used it to make predictions for Newcastle, Manchester, Birmingham, Bristol and Nottingham. The logic is that if the predictors and their effects vary according to city context, then the models should not be spatially transferable. Our interest is in whether they are. Subsequently, models were trained in each of the other cities in turn to predict log BMI values in the remaining cities.

Fig. 3 The flow chart of training unweighted random forest models in Birmingham and Nottingham
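The train-in-one-city, predict-in-every-other-city loop can be sketched independently of the model used. In the sketch below a one-predictor least-squares fit stands in for the random forest purely to keep the example self-contained; it is not the model used in this study, and the function names, city labels and data are hypothetical.

```python
def fit_line(xs, ys):
    """Least squares with one predictor; a stand-in for the random forest."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def r_squared(ys, preds):
    """R-squared against the mean of the observed values."""
    my = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

def transfer_matrix(cities):
    """Train on each city, score on every other: {(train, test): R-squared}."""
    scores = {}
    for train_name, (train_x, train_y) in cities.items():
        slope, intercept = fit_line(train_x, train_y)
        for test_name, (test_x, test_y) in cities.items():
            if test_name == train_name:
                continue  # only out-of-city predictions are of interest
            preds = [intercept + slope * x for x in test_x]
            scores[(train_name, test_name)] = r_squared(test_y, preds)
    return scores

# Hypothetical toy data: (predictor, log BMI) per city, same relation in both
cities = {
    "CityA": ([1.0, 2.0, 3.0, 4.0], [3.1, 3.2, 3.3, 3.4]),
    "CityB": ([1.5, 2.5, 3.5, 4.5], [3.15, 3.25, 3.35, 3.45]),
}
scores = transfer_matrix(cities)
```

When the underlying relation is the same in both cities, as in this toy data, the off-diagonal scores are close to one another; markedly lower scores for particular train and test pairs would point towards context-specific relations.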

The prediction performance of the random forest models was evaluated by four measures: mean absolute error (MAE), mean squared error (MSE), root mean squared error (RMSE) and R-squared. Their calculations are shown in formulas (4) to (7). The predictive power of the random forest models was compared using the values of these four measures. Low values of MAE, MSE and RMSE represent high prediction accuracy. The higher the R-squared, the better the prediction accuracy.

$$\mathrm{MAE}=\frac{1}{n}\sum_{i=1}^{n}\left|y_{i}-\widehat{y}_{i}\right|$$
(4)
$$\mathrm{MSE}=\frac{1}{n}\sum_{i=1}^{n}{\left(y_{i}-\widehat{y}_{i}\right)}^{2}$$
(5)
$$\mathrm{RMSE}=\sqrt{\frac{1}{n}\sum_{i=1}^{n}{\left(y_{i}-\widehat{y}_{i}\right)}^{2}}$$
(6)
$$R^{2}=1-\frac{\sum_{i=1}^{n}{\left(y_{i}-\widehat{y}_{i}\right)}^{2}}{\sum_{i=1}^{n}{\left(y_{i}-\overline{y}\right)}^{2}}$$
(7)

where \({y}_{i}\) are the observed continuous log BMI values, \({\widehat{y}}_{i}\) are the log BMI values predicted by the random forest models, \(n\) is the number of observations and \(\overline{y }\) is the mean of the observed log BMI values (Rustam et al. 2020).
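The four measures can be computed together from the same residuals. The sketch below is a minimal illustration with hypothetical numbers; it takes the mean in the R-squared denominator to be the mean of the observed values, which is the usual convention.

```python
import math

def error_measures(y_true, y_pred):
    """Return MAE, MSE, RMSE and R-squared for paired observations."""
    n = len(y_true)
    residuals = [y - p for y, p in zip(y_true, y_pred)]
    mae = sum(abs(r) for r in residuals) / n
    mse = sum(r * r for r in residuals) / n
    mean_y = sum(y_true) / n  # mean of the observed values
    ss_res = sum(r * r for r in residuals)
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return {
        "MAE": mae,
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "R2": 1.0 - ss_res / ss_tot,
    }

# Hypothetical observed and predicted values
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]
measures = error_measures(observed, predicted)
```

For these values the MAE is 0.25, the MSE is 0.125 and the R-squared is 0.9. RMSE is simply the square root of MSE, so the two always rank models identically.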

Furthermore, variable importance plots were built to examine whether and how the importance of predictors for predicting log BMI values varies across space. Variable importance is used to show spatially varying effects because machine learning methods do not provide coefficient estimates that can be compared directly. A permutation feature importance method was adopted to assess variable importance, which is regarded as the most advanced variable importance measure for random forest models (Strobl et al. 2007). The mechanism is to randomly permute a variable and then compare the prediction accuracy before and after the permutation (Strobl et al. 2007). The larger the permutation importance of a variable, the more predictive the variable (Breiman 2001; Chen and Ishwaran 2012).
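The permutation mechanism can be sketched without reference to a particular model: shuffle one predictor, re-predict, and record how much the error grows. The code below is a simplified illustration with a hypothetical `predict` function and toy data; it measures the mean increase in MSE over repeated shuffles and is not the implementation used for the plots in this paper.

```python
import random

def mse(y_true, y_pred):
    return sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / len(y_true)

def permutation_importance(predict, rows, y, column, repeats=20, seed=0):
    """Mean increase in MSE after shuffling one column of `rows`."""
    rng = random.Random(seed)
    baseline = mse(y, [predict(r) for r in rows])
    increases = []
    for _ in range(repeats):
        shuffled = [r[column] for r in rows]
        rng.shuffle(shuffled)
        # Rebuild each row with only the chosen column replaced
        permuted = [r[:column] + [v] + r[column + 1:] for r, v in zip(rows, shuffled)]
        increases.append(mse(y, [predict(r) for r in permuted]) - baseline)
    return sum(increases) / repeats

# Hypothetical fitted model that uses only the first predictor
def predict(row):
    return 2.0 * row[0]

rows = [[float(i), float(i % 3)] for i in range(8)]
y = [predict(r) for r in rows]
imp_used = permutation_importance(predict, rows, y, column=0)
imp_unused = permutation_importance(predict, rows, y, column=1)
```

A predictor the model ignores receives an importance of exactly zero, because shuffling it leaves every prediction unchanged, whereas shuffling a predictor the model relies on increases the error.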

UK Biobank is not representative of the UK population, with a selection bias arising from voluntary participation. Van Alten et al. (2022) constructed a weight for each participant to offset the volunteer bias, using the UK Census (2011) as the reference, and verified that adopting these weights reduces the volunteer bias by 78% on average. We adopted these weights (Van Alten et al. 2022) in the machine learning models, but not in the multilevel models, to investigate whether the significance of geographical context would change due to volunteer bias. Weighted and unweighted random forest models were compared in terms of their prediction ability and the importance of variables. Multilevel models require weights at all levels to obtain unbiased parameter estimates (West et al. 2015). However, Van Alten et al. (2022) provide weights only at the individual level. Thus, the multilevel models in this study did not use weights.

The decision-making processes of machine learning methods are somewhat of a black box, meaning that users typically know the inputs and outputs but the procedures inside are not clear (Savage 2022; Watson et al. 2019). In machine learning models, spatial comparisons of the effect sizes of variables are not directly available. Thus, we turned to variables that we know are of direct interest in the wider literature on obesity and combined them with multilevel models to examine their geographical variation from city to city. In the UK, the prevalence of obesity in low-income neighbourhoods is more than twice that in wealthy neighbourhoods (NHS 2018). Scholars have argued that poor people are limited by economic resources and threatened by poor health conditions (Averett and Smith 2014; Black et al. 2010; Fan et al. 2020; Macintyre et al. 1993). Therefore, socio-economic status, including both individual and neighbourhood deprivation levels, was chosen to investigate whether its association with obesity varies across cities and, ultimately, to examine spatial heterogeneity.

The preceding machine learning models include a mixture of variables, some of which may indeed contribute to obesity, but some of which are symptoms of obesity. For example, the causal relationship between snoring and obesity is contested as to which is the cause (Fraire et al. 2021; Taylor et al. 2021). Furthermore, the influence of socio-economic status on obesity could be masked by the influence of physical activities, dietary habits and neighbourhood environments. The substitution property of food is affected by income elasticity (Cawley et al. 2010; Monteiro et al. 2004; Moreno-Franco et al. 2018). Additionally, poor populations may live in low-income neighbourhoods with more barriers to exercise due to restricted resources and facilities (Romero 2005; Robert and Reither 2004; Ruel et al. 2010; Salmasi and Celidoni 2017). Thus, in order to compare the effect size of socio-economic status clearly and directly from city to city, demographic characteristics were adjusted for, while other predictors that may to some degree mask associations between poverty and obesity were not considered.

As participants in the same neighbourhood share similar levels of neighbourhood deprivation, we adopted multilevel models that consider the nested hierarchy of individuals sharing coordinates, which are in turn nested in cities. For each city, we built four two-level (coordinate-id) models in sequence. Model Income included the individual deprivation level (household income) and demographic characteristics (age, gender and ethnicity). Model Townsend Deprivation included the neighbourhood deprivation level (Townsend deprivation index) and the same demographic characteristics. Model Income & Townsend Deprivation accounted for both individual and neighbourhood deprivation levels (household income and Townsend deprivation index) together with the demographic characteristics. Finally, Model Socio-Economic Status added other individual socio-economic variables, including the number of vehicles, highest educational attainment and employment status, to Model Income & Townsend Deprivation. The effect sizes of these socio-economic variables were compared to investigate whether their influence on obesity was constant or varied across cities. After constructing separate random intercept models in each city, the final step was to build three-level (city-coordinate-id) random slope models to check whether the slopes of socio-economic status change among cities, and thus whether deprivation levels have spatially varying influences.
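The difference between the random intercept and random slope specifications can be written in generic form. The notation below is a sketch under our own assumptions, not the exact specification fitted in this study: \(x\) stands for a single socio-economic predictor, and the demographic covariates are omitted for brevity. For individual \(i\) at coordinate \(j\) within one city, a two-level random intercept model is

$$y_{ij}=\beta_{0}+\beta_{1}x_{ij}+u_{0j}+e_{ij}$$

where \(u_{0j}\) is the coordinate-level random intercept and \(e_{ij}\) is the individual-level residual. For individual \(i\) at coordinate \(j\) in city \(k\), a three-level model with a city-level random slope is

$$y_{ijk}=\beta_{0}+\left(\beta_{1}+v_{1k}\right)x_{ijk}+v_{0k}+u_{0jk}+e_{ijk}$$

where \(v_{0k}\) and \(v_{1k}\) are the city-level random intercept and random slope. A variance of \(v_{1k}\) close to zero would indicate that the effect of \(x\) is essentially the same in every city.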

Results

Null Multilevel Results (Evaluations on VPC Values)

Having fitted the multilevel null models to explore the variation in the log BMI values at the various levels, we found little evidence of any geographical variation at any scale above coordinates, as indicated by the extremely low VPC values for those scales in the selected six cities (Null Multilevel Model 1 (Model NM1) to Null Multilevel Model 5 (Model NM5)). The calculated VPC values for the null multilevel models (Model UK1 to Model UK5) involving all 22 assessment centres are listed in the Appendix (Table S1). Regardless of the coverage of assessment centres considered, the VPC values at any geographical level are very low, suggesting that there is no substantial clustering pattern in the log BMI values. It is clear that participants who shared coordinates had different BMI values, because over 95% of the overall variation from the null multilevel models is found within the coordinate level. Compared to the variance between the city level or the coordinate level, the MSOA-level variance was marginally larger. The calculated VPC values for BMI values in the selected six cities (Model B1 to Model B5) are also listed in the Appendix (Table S2). Similarly, no matter how many or which geographical scales were included in the multilevel models, the variances between all geographic scales were much smaller than the variance within the coordinate level (Table 2 and Table S2).

Table 2 VPC of Null multilevel models in the selected six cities (N = 117,108) for predicting log BMI values

Prediction Abilities of Random Forest Models Across Cities (Comparisons of MAE, MSE, RMSE and R-squared)

We first fitted random forest models in one city using the set of predictors, and then used the trained models to predict continuous log BMI values in other cities. The whole prediction procedure was conducted from city to city in sequence. A variance inflation factor (VIF) diagnosis with a threshold of 5 was carried out to guard against collinearity among predictors (Akinwande et al. 2015; Salmerón et al. 2018; Vatcheva et al. 2016). MAE, MSE, RMSE and R-squared were used to compare the prediction ability across cities, and it was assumed that similar MAE, MSE, RMSE and R-squared values across cities based on the same trained model indicate that the model transfers across spatial domains with similar predictive ability.
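The VIF for a predictor is 1 / (1 - R²), where R² comes from regressing that predictor on all the others, so a threshold of 5 corresponds to an R² of 0.8. With exactly two predictors, that R² equals their squared Pearson correlation, which allows a compact sketch. The function names and data below are hypothetical, and the two-predictor shortcut is a simplification of the general case.

```python
def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def vif_two_predictors(xs, ys):
    """VIF when the model has exactly two predictors: 1 / (1 - r^2)."""
    r = pearson_r(xs, ys)
    return 1.0 / (1.0 - r * r)

# Hypothetical predictor values with a correlation of 0.8
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 5.0]
vif = vif_two_predictors(x1, x2)
exceeds_threshold = vif > 5.0
```

Here the correlation of 0.8 gives a VIF of about 2.8, below the threshold of 5, so neither predictor would be flagged in this toy case.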

Additionally, the use of IP-weights did not bring substantial changes in prediction ability, as evaluated by MAE, MSE, RMSE and R-squared, for modelling log BMI values across cities. Thus, only the measures derived from unweighted models (Model RF1 to Model RF6) are listed in Table 3. The prediction abilities of the weighted random forest models (Model W1 to Model W6) are shown in the Appendix (Table S3). The R-squared values across cities for the different random forest models are quite similar, between 18 and 22% (Table 3). MAE, MSE and RMSE values are also fairly close across cities, at around 0.110, 0.020 and 0.143 respectively. The differences among MSE values are the smallest of all the error indicators. MAE measures the average magnitude of the residuals, while MSE measures the variance of the residuals. The MSE values for predicting London and Manchester are relatively high compared with other cities. It is worth noting that MSE is sensitive to outliers, and a few large errors between predicted and actual log BMI values may result in high MSE values. It is therefore useful to consider both MSE and R-squared to evaluate the prediction accuracy, as R-squared is less sensitive to outliers. The R-squared for predicting log BMI values in London remained relatively low, whereas the R-squared for predicting log BMI values in Manchester was relatively high.

Table 3 MAE, MSE, RMSE and R-squared for predicting log BMI values in unweighted random forest models

Regardless of which city is selected to train the prediction models, the R-squared for predicting log BMI values for Birmingham and Manchester is approximately 20%. There are slight differences in predicting log BMI values in Newcastle and Nottingham based on different models. The model based on London has the highest prediction performance in predicting Bristol and, vice versa, the model based on Bristol has the highest prediction ability in predicting London. The model based on Manchester invariably has a relatively low explanatory ability compared with the other models. One possible explanation is that the sample size of Manchester is the smallest.

Model transferability is the application of parameter estimates from a previously fitted model to a new context (Karasmaa 2001). Spatial transferability has been studied for habitat suitability models and travel demand models to explore the stability and performance of existing models when applied to new geographical areas (Lauria et al. 2015; Yasmin et al. 2015). However, spatial transferability has rarely been explored in the field of health disparities. Our analysis addresses this by comparing MAE, MSE, RMSE and R-squared for predicting log BMI values from city to city. We are interested in whether predictive power varies with the city context, in order to examine whether the models are spatially transferable. Overall, the prediction abilities of the models vary little, regardless of the error measure considered, suggesting that the models can be considered spatially transferable across cities. In other words, predictions of log BMI values remain stable from one context to another.
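The city-to-city comparison can be pictured as a matrix in which each row is the city a model was trained on and each column is the city it is applied to. The sketch below illustrates only that bookkeeping. It assumes a one-predictor least-squares line as a stand-in for the random forest, and the city labels, data and function names are all hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x (a simple stand-in for a trained model)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def r_squared(model, xs, ys):
    """R-squared of a fitted (intercept, slope) pair on a given data set."""
    intercept, slope = model
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

def transfer_matrix(city_data):
    """R-squared for every (training city, prediction city) pair."""
    models = {city: fit_line(xs, ys) for city, (xs, ys) in city_data.items()}
    return {
        train: {test: r_squared(models[train], *city_data[test]) for test in city_data}
        for train in city_data
    }

# Hypothetical predictor and log BMI values for two made-up cities.
city_data = {
    "CityA": ([0, 1, 2, 3], [3.00, 3.10, 3.20, 3.30]),
    "CityB": ([0, 1, 2, 3], [3.00, 3.12, 3.18, 3.30]),
}
matrix = transfer_matrix(city_data)
for train, row in matrix.items():
    print(train, {test: round(value, 3) for test, value in row.items()})
```

When the off-diagonal entries (a model applied to another city) are close to the diagonal entries (a model applied to its own city), the models can be described as transferable in the sense used here.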

Variable Importance Plots of Random Forest Models Across Cities

Note that similar MAE, MSE, RMSE and R-squared values for models trained on different cities do not necessarily mean that the models agree on which variables are important. It is, therefore, important to examine the possibility of spatially varying effects. Exploring the variable importance of predictors across cities helps to examine whether such effects exist. Even though the use of weights affects the variable importance of neighbourhood environments more than that of other types of predictors, the influence of neighbourhood environments is still limited compared to lifestyle habits and socio-economic status. In summary, the overall effect of weighting is limited across cities. Thus, only the variable importance plots from unweighted models (Model RF1 to Model RF6) are displayed in Fig. 4, and the variable importance plots from weighted models are displayed in the Appendix (Fig. S1). Variables with the symbol “_” represent one category of a categorical variable. For instance, Usual walking pace_Brisk pace represents the category “Brisk pace” of the Usual walking pace variable, for which the reference category is Usual walking pace_Slow pace. The choice of reference category depends on the type of categorical variable. The first type is classified by frequency of occurrence (none, once a week, twice a week etc.) or by intensity or magnitude (low, medium, high); for these, the lowest intensity, magnitude or frequency was used as the reference level. The second type is classified by property (British White, Black, and other ethnic groups); for these, the category with the largest population was chosen as the reference category, for example British White for ethnicity. Variables without the symbol “_” are continuous variables.
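The naming and reference-category rules described above can be sketched as a small dummy-coding routine. This is an illustration under stated assumptions rather than the authors' preprocessing: the helper names `dummy_code` and `largest_category` and the toy values are hypothetical.

```python
from collections import Counter

def dummy_code(name, values, reference):
    """Create one 0/1 column per non-reference level, named '<name>_<level>'."""
    levels = sorted(set(values))
    if reference not in levels:
        raise ValueError(f"reference level {reference!r} not found in {name}")
    columns = {}
    for level in levels:
        if level == reference:
            continue  # the reference level is absorbed into the baseline
        columns[f"{name}_{level}"] = [1 if v == level else 0 for v in values]
    return columns

def largest_category(values):
    """Reference rule for unordered variables: the most frequent level."""
    return Counter(values).most_common(1)[0][0]

# Ordered variable: the lowest level ("Slow pace") is the reference.
pace = ["Slow pace", "Brisk pace", "Brisk pace", "Slow pace"]
pace_columns = dummy_code("Usual walking pace", pace, reference="Slow pace")

# Unordered variable: the largest group is the reference (toy values).
ethnicity = ["British White", "British White", "Black", "Other"]
ethnicity_columns = dummy_code("Ethnicity", ethnicity, largest_category(ethnicity))

print(pace_columns)
print(sorted(ethnicity_columns))
```

The reference level never appears as a column, which is why only the non-reference categories show up in a variable importance plot.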

Fig. 4 Variable importance from unweighted random forest models for predicting log BMI values

For all six cities, usual walking pace, snoring, gender and time spent watching TV are the most important predictors of continuous log BMI values. Except in London, the education deprivation index has relatively high variable importance. This may be because neighbourhoods in London have lower educational deprivation than those in other cities. Individual food intake variables are less important than the frequency of overall dietary changes. The influence of neighbourhood variables varies across cities, but their impact is limited compared to lifestyle habits. For example, accessibility to the nearest fast-food outlets has a relatively high variable importance (ranked among the 10 most important variables), while it has only a marginal effect in London (Fig. 4), possibly because fast-food outlets are densely distributed almost everywhere in London. The variable importance of drinking and smoking habits is limited compared with physical activities and sleeping patterns, but they are more important than some food intake variables.

The overall variable importance across different cities has more similarities than differences. Even though neighbourhood environments perform differently among cities, their influence is relatively limited compared with lifestyle habits. Across the six cities, the most important predictors are lifestyle habits related to physical activities and sleep patterns. The absolute variable importance values of these predictors may change from city to city; however, their rankings remain unchanged. In summary, important variables consistently rank high in predicting log BMI values, while relatively unimportant variables have varying rankings in different cities. The similar importance of these key variables in the random forest models across cities corroborates the transferability observed in the previous section on the basis of similar prediction abilities.
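The comparison of importance rankings is made visually from the plots. One possible way to quantify it, shown here only as a hedged illustration and not as the method used in this study, is a Spearman rank correlation together with the overlap of the top-ranked variables. The variable names and importance values below are hypothetical.

```python
def ranks(importance):
    """Rank variables from most (1) to least important; assumes no tied values."""
    ordered = sorted(importance, key=importance.get, reverse=True)
    return {name: position + 1 for position, name in enumerate(ordered)}

def spearman_rho(importance_a, importance_b):
    """Spearman rank correlation for two dicts with identical keys and no ties."""
    rank_a, rank_b = ranks(importance_a), ranks(importance_b)
    n = len(rank_a)
    d_squared = sum((rank_a[name] - rank_b[name]) ** 2 for name in rank_a)
    return 1.0 - 6.0 * d_squared / (n * (n * n - 1))

def top_k_overlap(importance_a, importance_b, k):
    """Number of variables that both rankings place in their top k."""
    rank_a, rank_b = ranks(importance_a), ranks(importance_b)
    top_a = {name for name, r in rank_a.items() if r <= k}
    top_b = {name for name, r in rank_b.items() if r <= k}
    return len(top_a & top_b)

# Hypothetical importance scores for two made-up cities.
city_x = {"walking_pace": 0.30, "snoring": 0.20, "gender": 0.15,
          "tv_time": 0.10, "fast_food_access": 0.05, "pub_access": 0.02}
city_y = {"walking_pace": 0.28, "snoring": 0.22, "gender": 0.12,
          "tv_time": 0.11, "fast_food_access": 0.01, "pub_access": 0.03}

rho = spearman_rho(city_x, city_y)
shared_top4 = top_k_overlap(city_x, city_y, k=4)
print(round(rho, 3), shared_top4)
```

In this toy case the top four variables coincide and only the two least important variables swap places, which mirrors the pattern described in the text.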

Results from the Multilevel Models Containing Socio-economic Status and Demographic Characteristics

In the wider literature, there is interest in the relationship between deprivation and obesity; thus, we explore whether the effect of deprivation on obesity varies spatially. Participants who live in deprived areas have higher BMI values regardless of individual socio-economic status across the six cities. The effect sizes of the Townsend deprivation index are nearly the same for Birmingham, Manchester, Newcastle and Nottingham with or without the consideration of individual deprivation (household income), suggesting a fairly constant effect of neighbourhood deprivation on obesity (Table 4, Fig. 5). In the Model Townsend Deprivation, an increase of one unit in the Townsend deprivation index is associated with a 0.006 kg/\({\mathrm{m}}^{2}\) increase in log BMI values across these four cities. In the Model Income & Townsend Deprivation and Model Socio-Economic Status, which consider individual socio-economic status, there are slight differences in the coefficient estimates for the Townsend deprivation index. However, they are still between 0.005 and 0.006 kg/\({\mathrm{m}}^{2}\) for Birmingham, Manchester, Newcastle and Nottingham, close to the estimates in the Model Townsend Deprivation. The effect sizes can therefore still be regarded as spatially stationary for these four cities.
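As a reading aid, a coefficient on a log-transformed outcome can be converted into a relative change in the untransformed outcome. The short derivation below assumes that log BMI is the natural logarithm of BMI and that the coefficient \(\beta\) refers to a one-unit increase in the Townsend deprivation index \(x\); both are assumptions made for illustration.

\[
\log\left(\mathrm{BMI}_{x+1}\right) - \log\left(\mathrm{BMI}_{x}\right) = \beta
\quad\Longrightarrow\quad
\frac{\mathrm{BMI}_{x+1}}{\mathrm{BMI}_{x}} = e^{\beta},
\qquad e^{0.006} \approx 1.006
\]

Under these assumptions, a coefficient of 0.006 corresponds to roughly a 0.6% higher BMI per unit increase in the index.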

Table 4 Associations of the Townsend deprivation index with log BMI values (estimated using two-level (coordinate-id) regression models) in Birmingham (n = 13,195), Bristol (n = 25,035), London (n = 31,536), Manchester (n = 7331), Newcastle (n = 20,821) and Nottingham (n = 19,190)

Fig. 5 Associations of the Townsend deprivation index with log BMI values in Model Townsend Deprivation (a), Model Income & Townsend Deprivation (b), and Model Socio-Economic Status (c) across the six cities. Each city area is represented by a 25-mile buffer around its corresponding assessment centre location (darker colour represents higher coefficient estimates). To ease comparison of a to c, an additional colour bar was added to each subfigure. However, no cities have coefficient estimates for the Townsend deprivation index between 0.0061 and 0.055 kg/\({\mathrm{m}}^{2}\) in a and c, or between 0.0551 and 0.073 kg/\({\mathrm{m}}^{2}\) in b

For Bristol, the Townsend deprivation index coefficient estimates are between 0.002 and 0.003 kg/\({\mathrm{m}}^{2}\) in the three models, which is approximately half the effect found for the four cities above (Table 4, Fig. 5). For London, the effect of neighbourhood deprivation on obesity is an order of magnitude greater than for the other cities. The effect size of the Townsend deprivation index is slightly larger in the model that contains only neighbourhood deprivation than in the models with both neighbourhood and individual socio-economic status. In the Model Townsend Deprivation, a one-unit increase in the Townsend deprivation index is associated with a 0.073 kg/\({\mathrm{m}}^{2}\) rise in log BMI values. In the Model Income & Townsend Deprivation and Model Socio-Economic Status, the coefficient estimates for the Townsend deprivation index are 0.055 and 0.063 kg/\({\mathrm{m}}^{2}\), respectively.

Three-level (city-coordinate-id) random slope models, which allow the slopes to vary randomly across cities, were fitted to explore whether the effect sizes of the Townsend deprivation index were constant across these cities. The results of the random slope models are shown in the Appendix (Table S4). The slopes did not differ between individual cities: the estimated slope variance was smaller than 0.001 (standard errors < 0.001) whether or not individual socio-economic status was considered, suggesting no differential effect of the Townsend deprivation index across space. Additionally, compared with the single slope models, the corresponding random slope models did not bring statistically significant changes in Deviance, indicating that the varying slopes did not provide a better fit for predicting log BMI values.
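The Deviance comparison between a single slope model and its random slope counterpart can be sketched as a likelihood-ratio test. The example below is a hedged illustration only. It assumes that Deviance is minus twice the log-likelihood, that the random slope adds two parameters (a slope variance and a slope-intercept covariance), and that a plain chi-square reference distribution is used; the Deviance values and function names are hypothetical.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable with an even number of df.

    For df = 2m the survival function has the closed form
    exp(-x/2) * sum_{i=0}^{m-1} (x/2)^i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only covers positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0          # i = 0 term of the sum
    for i in range(1, df // 2):
        term *= half / i            # builds (x/2)^i / i! incrementally
        total += term
    return math.exp(-half) * total

def deviance_test(deviance_single, deviance_random, extra_params=2):
    """Return the Deviance drop and its p-value for nested models."""
    drop = deviance_single - deviance_random
    if drop < 0:
        raise ValueError("the larger model should not have a higher Deviance")
    return drop, chi2_sf_even_df(drop, extra_params)

# Hypothetical Deviance values, not taken from Table S4.
drop, p_value = deviance_test(deviance_single=10250.0, deviance_random=10248.8)
print(round(drop, 2), round(p_value, 3))
```

Because a variance cannot be negative, the null value lies on the boundary of the parameter space and this plain chi-square p-value tends to be conservative; a non-significant result here is therefore weak evidence for varying slopes in either case.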

As household income increased, the regression coefficients decreased monotonically for all three models in all six cities, suggesting that lower household income is associated with higher BMI values in these six cities (Table 5). For Birmingham, in the Model Income, log BMI values were 0.039 kg/\({\mathrm{m}}^{2}\) lower for those with the highest household income (> £100,000) than for those with the lowest household income (< £18,000; reference category). In the Model Income & Townsend Deprivation, which also considers neighbourhood-level deprivation (Townsend deprivation index), those with the highest household income have lower log BMI values (β =  − 0.026 kg/\({\mathrm{m}}^{2}\), P < 0.001) than those with the lowest household income. Taking other individual socio-economic characteristics into consideration, those with the highest household income have 0.012 kg/\({\mathrm{m}}^{2}\) lower log BMI values than those with the lowest household income. The effect of household income is thus reduced as other socio-economic conditions are accounted for. The other individual socio-economic characteristics have a more evident influence than the neighbourhood deprivation level.

Table 5 Associations of household income with log BMI values (estimated using two-level (coordinate-id) regression models) in Birmingham (n = 13,195), Bristol (n = 25,035), London (n = 31,536), Manchester (n = 7331), Newcastle (n = 20,821) and Nottingham (n = 19,190)

For London, in the Model Income, those with the highest household incomes (> £100,000) had lower log BMI values (β = − 1.349 kg/\({\mathrm{m}}^{2}\); 95% CI: − 1.56264, − 1.13536; p < 0.001) than those with the lowest incomes (< £18,000). For Newcastle, Nottingham and Manchester, those with the highest household income had log BMI values approximately 0.04 kg/\({\mathrm{m}}^{2}\) lower (0.047, 0.041 and 0.046, respectively) than those with the lowest incomes, without consideration of other socio-economic characteristics. The pattern is similar for the Model Income & Townsend Deprivation and Model Socio-Economic Status: the differences in log BMI between income categories were most evident in London compared with the other five cities, which is consistent with London having the largest effect sizes for the Townsend deprivation index. When three-level (city-coordinate-id) random slope models were fitted that allowed household income to have a different effect in each city, the estimated variance and standard error for the slopes of household income levels were both smaller than 0.001 and close to 0, suggesting that the effect of household income is broadly stationary, whether or not other socio-economic characteristics are taken into consideration. Furthermore, the Deviance differences between single slope models and matched random slope models were not statistically significant, suggesting there were no differences between cities in the relationships between household income levels and log BMI values.

Discussion

Summary of Findings

Despite the many studies linking place and health to explain health disparities across space, we found no evidence of an effect of geographic context, or of substantial spatially varying effects of predictors, on obesity for middle-aged and older adults across six cities. We found no geographical clustering of log BMI values, as the VPC was extremely small at every geographical scale. Furthermore, we found no significant differences in the predictive power of machine learning models across cities in predicting BMI values, suggesting that the models were spatially transferable from one context to another in England. Compared with lifestyle habits, including dietary habits and physical activities, the spatially varying neighbourhood effect on obesity is marginal. We also found that even when severe volunteer bias was reduced by using IP-weights as proposed by van Alten et al. (2022), models were still transferable from one city context to another, the effects of risk factors remained constant and neighbourhood effects remained marginal. Furthermore, multilevel models incorporating socio-economic status and demographic characteristics indicated that associations between deprivation, at both neighbourhood and individual level, and obesity were consistent across space.

Interpretations and Discussion of Results

The extremely low VPC values from the multilevel null models show that most of the variance in individual BMI values lies within coordinates rather than between them. In other words, the BMI values of participants who share the same coordinate vary greatly. Mason et al. (2021) also noted that most of the variation in BMI would be expected to be explained by individual-level factors. Scholarly debate about the significance of geography for obesity has focused more on aggregate-level prevalence than on the prediction of individual BMI values, as undertaken in this study. In this study, because coordinates were rounded to 1 km, over 300 participants may share the same coordinates, and their individual BMI values may vary considerably because being obese is the consequence of a complex and multifactorial social gradient in the interactions of contextual factors and interpersonal behaviours (Barton and Grant 2006; Black et al. 2010; Colberg et al. 2016; Hamasaki 2016; Healy et al. 2015; Kriska et al. 2003; Nguyen et al. 2017; Pesta et al. 2018; Reilly et al. 2018). It is reasonable to assume that it is harder to detect the significance of geographical context when predicting individual BMI values than when predicting the regional prevalence of obesity.
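The VPC for a level is that level's variance component divided by the total variance from a null model. A minimal sketch follows; the function name `vpc` and the variance components are hypothetical and are not the estimates obtained in this study.

```python
def vpc(variance_components):
    """Share of total variance attributable to each level of a null model."""
    total = sum(variance_components.values())
    if total <= 0:
        raise ValueError("total variance must be positive")
    return {level: value / total for level, value in variance_components.items()}

# Hypothetical variance components of log BMI from a three-level null model.
components = {"city": 0.0002, "coordinate": 0.0006, "individual": 0.0242}
shares = vpc(components)
for level, share in shares.items():
    print(f"{level}: {share:.1%}")
```

With components like these, almost all of the variance sits at the individual level, which is the situation described above: people who share a coordinate differ from one another far more than coordinates or cities differ on average.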

Given the evidence of no apparent clustering of individual BMI values at any geographical level, the similar prediction abilities of random forest models in modelling individual BMI values across different cities suggest that the models are spatially transferable from one city context to another. Furthermore, there are more similarities than differences in the variable importance of predictors across cities, indicating constant associations between predictors and the risk of obesity over space. The R-squared values of the models are not very high, but they are acceptable for social science research when most predictors are statistically significant (Ozili 2023). Predicting continuous BMI values is also more difficult than predicting binary outcomes (obese or not). The overall analysis was at the individual level with a large number of participants, and R-squared values get smaller when the sample size increases (Reisinger 1997; Ozili 2023). High R-squared values are more common in predicting aggregated obesity outcomes, such as the prevalence of obesity (Shrestha et al. 2013; Sun et al. 2020a). Additionally, the major objective of this work is to examine whether geographical context influences obesity, based on model prediction performance and variable importance across cities, rather than to predict BMI values accurately.

Mason et al. (2021), who also used the UK Biobank, proposed that the relationships between the availability of neighbourhood physical activity facilities and BMI, and between fast-food proximity and BMI, varied from place to place across urban England. Nevertheless, they did not attempt to account for the influence of lifestyle habits, such as physical activity and dietary habits, which were considered in our study. In this study, the magnitude of the effect of neighbourhood variables varies across cities, but their impact is limited compared with lifestyle habits, such as the usual walking pace. Overall, our results indicate that the spatially varying effect of neighbourhood factors cannot fully explain individual disparities in obesity and is limited compared with lifestyle habits. Rossi et al. (2019) also implied that the influence of neighbourhood effects on obesity may be marginal compared with personal habits such as physical activity and diet.

Spatially varying performance of key factors would support geographic heterogeneity; however, it is not found in this study. The separate two-level random intercept models further illustrated the constant influence of both individual and neighbourhood deprivation levels in Birmingham, Manchester, Newcastle and Nottingham, based on the constant effect sizes. We acknowledge that the coefficient estimates for socio-economic status reveal a much more pronounced effect on obesity for London than for the other cities. One possible explanation is that London is more racially diverse, with more highly educated and higher-income participants. If geographic context did cause differences in the performance of variables, the random slope model would improve predictive power, with evidently varying slopes among cities. However, the three-level random slope models indicated that, when all six cities were considered as a whole, the effect sizes for deprivation remained constant: both the variance and standard errors were close to 0, and the Deviance differences between the single slope models and random slope models were not statistically significant. This is not to conclude that geography cannot make any difference to the effect of socio-economic status; rather, the finding depends primarily on the study region and sample. In summary, the effect of geography on the associations with deprivation is limited for the selected cities from the UK Biobank data.

Strengths and Limitations

In contrast to other studies that stressed the significance of spatial heterogeneity and geographical patterns, such as clustering (hot spots and cold spots) of health outcomes, this paper proposes that there is no substantial geographical patterning between areas with more and less obese participants, because most of the variation the models find is at the individual level. Accordingly, this paper questions the significance of geographical context and suggests stationary associations between the main risk factors and obesity. Furthermore, this paper suggests that random forest models are spatially transferable, based on similar prediction abilities across cities. This study used a machine learning method, random forest models, to examine whether there are spatially varying effects based on prediction results among different cities.

Multilevel models were also used to offset the “black-box” disadvantages of machine learning methods and to look for spatial variations in the effects of socio-economic status. Previous UK-based studies on contextual factors either are ecological analyses (Neelon et al. 2017; Sun et al. 2020a) or focus only on small areas with limited observations (Fraser et al. 2010; Fraser et al. 2012). This paper employs individual-level data from the UK Biobank (UK Biobank 2022) and the linked UKBUMP to conduct both micro- and meso-level analyses in the UK, exploring whether there are any geographical patterns in obesity and in the factors explaining it.

Although scholars have questioned the representativeness and sample bias of the UK Biobank, few studies have examined whether and how volunteer bias affects the observed findings. This paper compared weighted and unweighted associations, using the weights proposed by van Alten et al. (2022), to explore whether volunteer bias influences the findings on geographical context. We found that the weights did not bring evident changes in the prediction ability and variable importance of the random forest models.

There are limitations to this study. The results should be viewed as a case study on the importance or otherwise of geographical context in the studied areas, rather than as suggesting that the limited effects of geographical context would necessarily hold in other circumstances. The analysis is cross-sectional rather than longitudinal. UK Biobank did not update the data for each participant after the baseline assessment visits during 2006 to 2010; thus, the overall analysis is restricted to 2006 to 2010. Other scholars who have used the UK Biobank have also indicated that its datasets are cross-sectional and that their use is limited by the collection method and time (Burgoine et al. 2018; Cassidy et al. 2017; Dadvand et al. 2014; Healy et al. 2015; Mason et al. 2018; Mason et al. 2020; Rauber et al. 2021). The study does not consider how macro-level processes affect obesity over time. Additional variables such as local taxes (on, for example, high sugar drinks) and local policies reflecting social and economic context may be beneficial in comprehending spatial variations in obesity in, for example, cross-national studies (Black 2014).

Another potential limitation is the imprecise georeferencing of participants. The rounded coordinates for home addresses may lead to misclassification of participants living near a Lower Layer Super Output Area (LSOA) boundary, and thus to matching with inaccurate neighbourhood variables measured at LSOA level, such as the AHAH variables. In addition, potential edge effects are ignored in this paper; that is, residents may be affected by their neighbours and by neighbouring environments (Van Meter et al. 2010). Furthermore, the explanatory power of the selected variables is limited, so more variables are required to explain the spatial variations in obesity.
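The scale of the georeferencing error can be illustrated with a short calculation. The sketch below assumes that home locations are stored as projected eastings and northings in metres and rounded to the nearest 1 km on each axis; the coordinates and function names are hypothetical.

```python
import math

def round_to_grid(easting, northing, cell=1000):
    """Snap a point (in metres) to the nearest multiple of `cell` on each axis."""
    return (round(easting / cell) * cell, round(northing / cell) * cell)

def displacement(true_point, rounded_point):
    """Straight-line distance in metres between the true and rounded locations."""
    return math.hypot(true_point[0] - rounded_point[0],
                      true_point[1] - rounded_point[1])

def max_displacement(cell=1000):
    """Worst case: the true point sits half a cell away on both axes."""
    return math.hypot(cell / 2, cell / 2)

# Hypothetical home location, not a real participant.
home = (358432.0, 172951.0)
rounded = round_to_grid(*home)
print(rounded, round(displacement(home, rounded), 1), round(max_displacement(), 1))
```

Under this assumption a recorded location can lie several hundred metres from the true address, which is enough to place a participant in a neighbouring small area when the home is close to a boundary.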

Despite the aforementioned limitations, a strength of this study is that the use of both a machine learning approach and multilevel models helps to show that there is no substantial geographical patterning in where more and less obese participants in the UK Biobank live, and that the predictors of obesity do not vary across space. The insignificance of geographical context for obesity remains stable even after applying IP-weights that offset particular sampling biases in the UK Biobank data. Future research is needed to extend the study coverage from England to the whole UK, to consider edge effects and to adopt more macro-level variables, such as local policies, local taxes and regional poverty.

Conclusion

Most studies on neighbourhood effects implicitly assume that such effects are critical in producing health disparities. However, using the UK Biobank Survey, we found no substantial evidence for the importance of geographical context in understanding differences in BMI values or their predictors. The possible exception is neighbourhood environments, including accessibility to fast-food outlets, health services, pubs and tobacco stores, whose relationships with BMI values vary from place to place in England. However, their importance in predicting individual BMI values is limited when compared with interpersonal lifestyle habits, such as physical activities. Furthermore, individual and neighbourhood socio-economic status have an almost constant influence on obesity in Birmingham, Manchester, Newcastle and Nottingham, while their influence is more pronounced in London. However, when the six cities are considered as a whole, the associations between socio-economic status and obesity remain constant and limited compared with interpersonal behaviours.

Additionally, there are no apparent clustering patterns of individual BMI values at any of the geographical scales considered, specifically city, MSOA and coordinates, suggesting that the variation in individual BMI mainly lies within geographical units, at the level of the individual, rather than between them. This is not to conclude that our results support a “one-size-fits-all” policy in relation to health interventions, because our study areas comprise relatively developed cities in England, and deprived cities and towns are not considered in our analysis owing to data scarcity. Furthermore, we find that ethnic minority populations and deprived areas are each associated with higher obesity risks, thereby requiring attention from policy makers. This paper contributes to the wider debate on the importance of geography for health disparities; in this study, evidence for neighbourhood or contextual effects is somewhat elusive, which suggests the need for a deeper understanding of the magnitude and scope of geographical and neighbourhood effects. Future work could extend to cover a wide range of deprived areas of the UK, taking into account edge effects and using more macro and time-varying variables to examine the temporal association of obesity risk across a wider area.