Background

Excess body weight is determined by multiple factors acting in combination, including genetic, metabolic and behavioral factors, as well as more upstream socio-economic influences and built environment characteristics [1]. Those that are modifiable provide important potential targets for preventive interventions [2]. Diet and physical activity are recognized as the most proximal determinants of energy balance [3], but there is growing recognition of the role of sedentary behaviors (e.g., sitting time), independent of physical activity [4–7]. The influences of smoking and alcohol intake on body weight are also well documented [8–10]. More recently, a role has also been suggested for sleep duration [11–13].

The inter-relationship of these obesity-related lifestyle behaviors has stimulated interest in co-occurrence patterns [14, 15]. Several studies have used explorative data-driven methods, such as cluster analysis or latent class analysis, to examine the relations between diet, physical activity, and sedentary behaviors, independently of the health outcome of interest [6, 16, 17]. Smoking status and alcohol consumption have been included in some analyses [18–20]. The variety of methodologies used makes it difficult to ascertain how these factors correlate with each other and what this means for body weight and health. Additionally, previous studies have not considered contextual factors such as socio-economic characteristics and the built environment, increasingly recognized as major upstream determinants of overweight [21].

A recursive partitioning method—the classification and regression tree (CART) approach [22]—makes it possible to examine how a set of risk factors jointly influence the risk of an outcome such as overweight. This approach has previously been used to assess the risk of overweight in children [23, 24] and the risk of reduced mobility in older obese adults [25].

This study sought to identify the hierarchy of lifestyle-related behaviors associated with overweight in European adults, and to examine how subgroups identified differed by socio-demographic and built environment characteristics.

Methods

Study design and sampling

This study, part of the EU-funded SPOTLIGHT project [26], was conducted in five European urban regions: Ghent and suburbs (Belgium), Paris and inner suburbs (France), Budapest and suburbs (Hungary), the Randstad (a conurbation including Amsterdam, Rotterdam, The Hague and Utrecht in the Netherlands) and Greater London (United Kingdom). Sampling of neighborhoods and recruitment of participants have been described in detail elsewhere [27]. Briefly, neighborhood sampling was based on a combination of residential density and socio-economic status (SES) data at the neighborhood level. This resulted in four pre-specified neighborhood types: low SES/low residential density, low SES/high residential density, high SES/low residential density and high SES/high residential density. In each country, three neighborhoods of each neighborhood type were randomly sampled (i.e. 12 neighborhoods per country, 60 neighborhoods in total). Subsequently, adult inhabitants (≥18 years) were invited to participate in a survey. A total of 6037 individuals participated in the study between February and September 2014. The study was approved by the corresponding local ethics committees of participating countries and all participants in the survey provided informed consent.

Measures

Body mass index

Body mass index (BMI) was calculated by dividing self-reported weight (kg) by the square of the self-reported height (m2). Adults were categorized as overweight if their BMI was ≥25 kg/m2 [1].
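The calculation described above can be sketched in a few lines. This is an illustrative sketch only, not the study's code: the function names and the example participant are hypothetical, and only the formula and the 25 kg/m2 cut-off come from the text.

```python
OVERWEIGHT_CUTOFF = 25.0  # kg/m2, the threshold stated in the text


def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index (kg/m2): weight divided by the square of height."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def is_overweight(weight_kg: float, height_m: float) -> bool:
    """True when BMI is at or above the overweight cut-off."""
    return bmi(weight_kg, height_m) >= OVERWEIGHT_CUTOFF


# Hypothetical participant: 80 kg, 1.75 m
example_bmi = bmi(80.0, 1.75)
example_status = is_overweight(80.0, 1.75)
```

Because both weight and height are self-reported, any reporting error propagates directly into the computed BMI and therefore into the binary overweight classification.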

Socio-demographic data

Socio-demographic variables included age, gender and educational level (defined as ‘lower’ [from less than primary to higher secondary education] and ‘higher’ [college or university level] to allow comparison between country-specific education systems).

Physical activity

Physical activity during the last 7 days was documented using questions from the long version of the validated International Physical Activity Questionnaire (IPAQ) [28]. Good reliability (Spearman correlation coefficients ranged from 0.46 to 0.96) and acceptable criterion validity (median ρ of about 0.30) have been found for this questionnaire in a 12-country study [28]. Transport-related and leisure time physical activity were estimated (in minutes per day, min/d) by multiplying the frequency (number of days in the last 7 days) and duration (average time/d).
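As a rough illustration of how a frequency and a duration can be combined into a daily average, the sketch below multiplies the two and then divides by 7. The division by 7 is an assumption made here so that the result is expressed in min/d; the text only states that frequency and duration were multiplied. The function name and example values are hypothetical.

```python
def minutes_per_day(days_active: int, minutes_per_active_day: float) -> float:
    """Average daily minutes of activity over a 7-day recall window."""
    if not 0 <= days_active <= 7:
        raise ValueError("days_active must be between 0 and 7")
    if minutes_per_active_day < 0:
        raise ValueError("duration cannot be negative")
    weekly_minutes = days_active * minutes_per_active_day  # frequency x duration
    return weekly_minutes / 7  # ASSUMPTION: averaged over the 7-day window


# Hypothetical respondent: cycling for transport on 3 days, 40 min each day
transport_min_per_day = minutes_per_day(3, 40)
```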

Sedentary behavior

The validated Marshall questionnaire was used to collect sedentary behavior data during the last 7 days [29]. Acceptable criterion validity (Spearman correlation coefficient greater than or equal to 0.50 for watching TV, and using a computer at home during weekdays) has been demonstrated. Lowest validity coefficients were found for other leisure-time activities and transport-related sedentary behaviors during weekend days (correlation coefficients ranged from 0.15 to 0.42) [29]. Time spent (min/d) sedentary for travel, television (TV), computer and other leisure time activities (e.g., socializing, movies but not including TV and computer use) was averaged over a week.

Eating habits

Current eating habits were assessed using common food frequency questions on consumption of fruit, vegetables, fish, sweets, fast-food, sugar-sweetened beverages, and alcohol. Response options were ‘once a week or less’, ‘2 times a week’, ‘3 times a week’, ‘4 times a week’, ‘5 times a week’, ‘6 times a week’, ‘7 times a week’, ‘twice a day’, and ‘more than twice a day’.

Smoking status

Participants reported their smoking status: current, former or never.

Sleep duration

Participants provided information on their hours of sleep during an average night. The response options ranged from 4 to 16 h/night (in half-hour intervals).

Neighborhood clusters

Four neighborhood clusters were previously identified based on data related to food and physical activity features of the built environment collected by a Google Street View-based virtual audit performed in 59 study neighborhoods [30]. The clusters were labeled: cluster 1 (n = 33) ‘green neighborhoods with low residential density’, cluster 2 (n = 16) ‘neighborhoods supportive of active mobility’, cluster 3 (n = 7) ‘high residential density neighborhoods with food and recreational facilities’, and cluster 4 (n = 3) ‘high residential density neighborhoods with low level of aesthetics’.

Data analysis

CART approach

Recursive partitioning was used to identify the hierarchy and combinations of all lifestyle-related behaviors described in the Measures section that best differentiated overweight (≥25 kg/m2) vs. non-overweight (<25 kg/m2) participants.

Recursive partitioning is the algorithm underlying the nonparametric CART method [22]. This approach has been used in different research fields, such as genetic epidemiology [31], and has produced greater homogeneity in subgroups than has been achieved with other approaches, such as regression models [32]. Recursive partitioning is a step-by-step process by which a decision tree is built by either splitting or not splitting each node of the tree into two daughter nodes. Each possible split among all variables present at each node is considered. The tree is constructed by the algorithm asking a sequence of hierarchical Boolean (yes/no) questions (e.g., is Xi ≤ θj?, where Xi is a candidate variable, and θj is a cut-off) generating descendant nodes [33]. The cut-off in the candidate variable that produces the maximal differentiation between individuals is retained, and used to split the sample into two subgroups (i.e. two daughter nodes). This process is repeated for each new subgroup found. Every variable is a potential candidate at each stage in growing the tree, so some variables may appear several times, using different cut-offs.

The best way to split the data is determined by the Gini impurity index. For a binary outcome such as overweight status, this index ranges from 0 (pure node, i.e. all observations within the node assigned to a single target class, e.g., a node with a class distribution [0;1]) to 0.5 (maximally impure node, i.e. evenly mixed target classes, e.g., a node with a class distribution [0.5;0.5]).

The complete tree grown by this sequential node-splitting process is then pruned to avoid over-fitting the data; a sequence of sub-trees is generated and compared. The optimum tree is obtained using both cross-validation and the cost-complexity pruning method. The cost-complexity pruning method assesses the balance between misclassification costs and complexity of the sub-tree. Additionally, each terminal node was set to require a minimum of 200 subjects.
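The split search described above can be illustrated with a minimal sketch for a single candidate variable. It is not the rpart implementation used in the study. It assumes the common unnormalized definition of Gini impurity, 1 minus the sum of squared class proportions, under which a pure node scores 0 and an evenly mixed two-class node scores 0.5 (rescaled variants that peak at 1 also exist). The function names and the toy data are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 - sum of squared class proportions (0 for a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(values, labels, min_node_size=1):
    """Find the cut-off for the question 'is x <= cut-off?' that minimizes the
    size-weighted Gini impurity of the two daughter nodes.

    Returns (cutoff, weighted_impurity), or None when no split leaves at
    least min_node_size observations in each daughter node.
    """
    n = len(values)
    best = None
    # Every observed value except the largest is a candidate cut-off.
    for cutoff in sorted(set(values))[:-1]:
        left = [y for x, y in zip(values, labels) if x <= cutoff]
        right = [y for x, y in zip(values, labels) if x > cutoff]
        if len(left) < min_node_size or len(right) < min_node_size:
            continue
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if best is None or score < best[1]:
            best = (cutoff, score)
    return best


# Toy data (hypothetical): TV time in min/d and overweight status (1 = yes)
tv_minutes = [30, 60, 90, 150, 200, 240]
overweight = [0, 0, 0, 1, 1, 1]
split = best_split(tv_minutes, overweight)
```

A full tree repeats this search over every candidate variable at every node and then recurses into each daughter node. The `min_node_size` argument plays the role of the minimum terminal node size mentioned in the text (200 subjects in the study).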

Lifestyle subgroups

Characteristics of the subgroups identified through the CART analysis were compared. All variables included in the CART analysis were considered, in addition to socio-demographic and built environment characteristics (i.e. urban region, neighborhood type—pre-specified neighborhood type, and residential density and SES levels examined separately—and neighborhood cluster).

Chi-squared tests and Kruskal-Wallis tests with post-hoc Bonferroni-Dunn tests were used to examine differences between subgroups.

Multilevel regression analyses

Because participants were nested within neighborhoods, the likelihood of being overweight for each partitioning variable was estimated by a multilevel logistic regression model (neighborhood identifier included as a random effect) adjusted for potential confounders (gender, age, education level, and neighborhood type).

Statistical analyses were performed using R version 3.2 [34] (‘rpart’ package [35]), and Stata software (release 13.0; Stata Corporation, College Station, TX, USA).

Results

Characteristics of the study population

Results are given for 5295 individuals for whom BMI was available. The study population comprised 55.8 % females, with a mean (standard deviation, SD) age of 51.7 (16.4) years; 54 % were highly educated. Mean BMI was 25.2 (4.5) kg/m2, and 46.0 % of adults were overweight. Compared to non-overweight subjects, overweight adults were more likely to be male, older, less educated, former smokers, short sleepers, less physically active, eating less fruit and vegetables, and spending more time sitting, especially when viewing TV. The prevalence of overweight ranged from 38.3 % in Greater Paris to 53.2 % in Greater Budapest (Table 1).

Table 1 Characteristics of the overall study population and according to weight status in the SPOTLIGHT study

CART analysis

The final tree contained 10 terminal nodes (i.e. 10 subgroups) and had a classification error of 35.4 %. The 5 variables retained as the most important for discriminating overweight status were, in order: sedentary time while watching TV, smoking status, sleep duration, leisure time physical activity, and vegetable intake (Fig. 1).
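To put the reported classification error in context, it can be compared with a naive rule that assigns every participant to the majority class. The sketch below is simple arithmetic on figures quoted in the text (N = 5295, 46.0 % overweight, 35.4 % error). The baseline comparison is an illustration added here, not an analysis reported by the authors, and the derived values are approximate because the inputs are rounded.

```python
n_participants = 5295          # participants with BMI available
tree_error = 0.354             # classification error of the final tree
overweight_prevalence = 0.460  # share of overweight adults in the sample

# Majority-class rule: label everyone as the more frequent class
# (here non-overweight), which misclassifies the minority class.
baseline_error = min(overweight_prevalence, 1 - overweight_prevalence)

tree_accuracy = 1 - tree_error
approx_misclassified = round(n_participants * tree_error)
relative_error_reduction = (baseline_error - tree_error) / baseline_error
```

Under these assumptions the tree misclassifies roughly 1874 participants, and its error is about 23 % lower in relative terms than that of the majority-class rule.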

Fig. 1

Recursive partitioning analysis (CART) of lifestyle-related behaviors for overweight status in the SPOTLIGHT study (N = 5295). Subgroups with overweight prevalence above 50 % are shown in dark grey, and those with overweight prevalence below 50 % in light grey. OR [95 %]: odds ratios and 95 % confidence intervals for each partitioning variable, obtained by a multilevel logistic regression model (dependent variable: overweight [yes/no]; independent variables: partitioning variable identified by CART, gender, age, education, neighborhood type, and neighborhood identifier included as a random effect), are also provided. Abbreviations: h/n, hours per night; min/d, minutes per day; t/w, times per week

The odds of being overweight were 61 % (41–85 %) higher for those reporting longer time watching TV (≥142 min/d) than for others.
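A percentage increase in odds maps directly onto an odds ratio: 61 % (41–85 %) higher odds corresponds to an odds ratio of 1.61 with a 95 % confidence interval of 1.41 to 1.85. The sketch below shows this conversion and illustrates that an odds ratio is not a ratio of probabilities. The 40 % baseline prevalence is a hypothetical value chosen only for illustration, and the function names are not from the source.

```python
def percent_change_to_odds_ratio(percent_change: float) -> float:
    """Convert 'x % higher odds' into an odds ratio."""
    return 1 + percent_change / 100


def odds_ratio_to_percent_change(odds_ratio: float) -> float:
    """Convert an odds ratio into a percentage change in odds."""
    return (odds_ratio - 1) * 100


def apply_odds_ratio(baseline_probability: float, odds_ratio: float) -> float:
    """Probability in the exposed group implied by a baseline probability."""
    baseline_odds = baseline_probability / (1 - baseline_probability)
    exposed_odds = baseline_odds * odds_ratio
    return exposed_odds / (1 + exposed_odds)


or_point = percent_change_to_odds_ratio(61)  # point estimate from the text
or_low = percent_change_to_odds_ratio(41)    # lower 95 % limit
or_high = percent_change_to_odds_ratio(85)   # upper 95 % limit

# HYPOTHETICAL baseline: 40 % overweight among those watching less TV
exposed_probability = apply_odds_ratio(0.40, or_point)
```

With the hypothetical 40 % baseline, an odds ratio of 1.61 implies a prevalence of about 52 % in the exposed group, which is a relative increase in probability of roughly 29 % rather than 61 %.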

Longer time spent watching TV (≥142 min/d) and being a former smoker were important correlates of overweight. Current or non-smokers who spent a long time watching TV and were less physically active during leisure time were also at risk of being overweight.

Among former smokers who watched less TV (<142 min/d), those who were short sleepers (<7 h/night) were more likely to be overweight than long sleepers. Protective factors against being overweight among current and non-smokers included: short time watching TV, being physically active during leisure time, and eating vegetables every day.

Lifestyle subgroups

Table 2 shows the characteristics of the subgroups identified by CART. The proportion of overweight subjects ranged from 20 % (Subgroup 1) to 65.4 % (Subgroup 10). Overall, participants from the various subgroups differed in terms of lifestyle-related behaviors as well as socio-demographic and built environment characteristics.

Table 2 Profiles of the subgroups identified by recursive partitioning analysis (CART) in the SPOTLIGHT study

Subgroup 1 (n = 315, mean [SD] BMI: 22.7 [3.4] kg/m2) consisted of the youngest (40.8 [13.6] years old) and highly educated participants (78.4 %). This subgroup reported the lowest time spent watching TV (mean [SD]: 5.2 [7.9] min/day, median: 0 min/day) and the highest mean frequency of eating fruits and vegetables. The highest percentage of participants living in neighborhoods that were characterized by high SES and high residential density was observed in this subgroup, as was the lowest percentage of participants living in ‘green neighborhoods with low residential density’.

In 4 subgroups (7, 8, 9, and 10), overweight prevalence was >50 %. Members lived mainly in low SES neighborhoods. Subgroup 7 grouped less physically active individuals, who ate fruits, vegetables, and fish less frequently. Subgroup 8 members were short sleepers. The greatest percentage of individuals living in low residential density neighborhoods was reported in this subgroup. Subgroup 9 included the greatest percentage of current smokers, individuals who reported long mean time watching TV (mean [SD]: 306.0 [131.3] min/day, median: 257 min/day), and high mean consumption of sugar-sweetened beverages (4.9 [5.7] times/week, median: 3.0 times/week).

Subgroup 10 (n = 676, mean [SD] BMI = 27.2 [5.0] kg/m2) included mainly males, older (59.6 [14.4] years-old) and low educated adults (64.5 %), who reported high alcohol consumption and living in ‘green neighborhoods with low residential density’.

Discussion

This study investigated the hierarchy and combination of lifestyle-related behaviors in relation to the prevalence of overweight in European adults. Prolonged sitting while watching TV, being a former smoker, short sleep, lower levels of physical activity and lower vegetable consumption were the lifestyle-behaviors that identified the subgroups with highest likelihood of being overweight. High-risk subgroups included mainly males, older and less well educated adults living in greener neighborhoods with low residential density.

Although it is well recognized that overweight and obesity are multifactorial in origin [1, 2], few studies have examined the joint relation of lifestyle-related behaviors with overweight in adults. In this study, a hierarchy of lifestyle-related behaviors in identifying subgroups at risk was established through a visual chart showing how risk factors are inter-related. The tree indicated that the most important factor was sitting while watching TV. This variable appeared several times at different levels of the tree, underlining its importance. The variable that followed was smoking status, in both tree branches, and no additional variable appeared to explain the risk for overweight in former smokers (among those with longer duration of watching TV), suggesting its very high impact. Sleep duration, leisure time physical activity and vegetable intake appeared at later stages in the tree, suggesting they would have less importance compared to sedentary behavior and smoking status. Relations between the lifestyle-related behaviors and overweight status were confirmed in multilevel regression analyses taking into account potential confounding factors. The findings also suggested nonlinear relations between lifestyle-related behaviors and overweight. Indeed, subgroups who watched TV a lot (>180 min/d) had lower odds of being overweight than subgroups who watched less TV (between 24 min/d and 142 min/d).

Although it has been suggested that a combination of several sedentary behavior variables is appropriate to capture sedentary lifestyle [36], only TV viewing was retained among several variables related to sedentary time. The greater importance of TV viewing has been previously suggested in cross-sectional studies [37–39]. Given the lack of evidence from prospective studies, the issue of bidirectional or reverse causality has been raised [40]. In the Nurses’ Health Study, each 2 h/d increment in TV watching was associated with a 23 % [17–30 %] increased risk of obesity. However, the risk of developing obesity was attenuated after adjustment for baseline BMI [5]. These findings may suggest that, even at baseline, women who watched more TV were already on a trajectory to become obese [5]. Heavier individuals at baseline could have a preference for sedentary habits due to their higher body weight. TV viewing is not only an indicator of sedentary behavior but may also represent a potential surrogate of other behaviors affecting the energy balance, e.g., via increased snacking behavior [7, 41].

Former smokers were more likely to be overweight than both current and never smokers. These results are consistent with previous findings [10, 42–44]. Weight gain after quitting smoking has been related to the fact that nicotine acts as an appetite suppressant and quitting may be associated with increased energy intake [45, 46]. The average weight gain is about 4.5 kg, 1 year after quitting [46]. In the NHANES survey, weight gained over 10 years was significantly higher in former smokers compared to current smokers (8.4 kg vs. 3.5 kg, after adjustment for age, gender, ethnicity, education level) [44]. A recent study has estimated that smoking cessation leads to an average increase of 1.5–1.7 BMI units and that the drop in smoking may explain up to 14 % of the rise in obesity prevalence in recent decades [47]. Weight gain after smoking cessation was less pronounced when number of years since smoking cessation increased [43], and negatively associated with socio-economic status [48].

Short sleep duration was found to be associated with an increased risk of overweight. The hypothesized underlying mechanisms include thermoregulation, hunger hormone regulation changes, and/or an impact on physical activity and sedentary behaviors [49–52]. Short sleep duration was associated with other lifestyle-related behaviors, such as TV or computer use [52, 53], and a correlation between time spent sleeping, physical activity and sedentary behavior was documented [54]. High leisure time physical activity and intake of vegetables were associated with lower prevalence of overweight. These behaviors, which tend to co-occur, are both well recognized as healthy lifestyle behaviors [55, 56]. Interestingly, some cut-offs found are close to thresholds previously reported and/or recommended guidelines (e.g., 2 h/d watching TV [5, 38], 7–9 h of sleep [57]). In addition, at least one variable from each component of lifestyle (physical activity, sedentary behavior, sleep duration, eating habits, and smoking status) was identified as a correlate of overweight. Moreover, subgroups at high risk of overweight were characterized by at least one unhealthy lifestyle behavior. These findings emphasize how all components of lifestyle are important to consider and that a combination of unfavorable lifestyle factors may predict overweight in adults.

Lifestyle subgroups identified by CART differed in terms of socio-demographic factors. The subgroup with the highest prevalence of overweight comprised mainly males, older adults and lower educated adults. These findings are in line with previous studies [58, 59]. Individuals with higher educational background may be more informed about the health consequences of their lifestyles and have the resources to take action, leading to healthier lifestyle behaviors [60]. The subgroups identified also varied across urban regions: 72.2 % of French respondents were in the subgroups with lower overweight prevalence (subgroups 1–6). Looking at differences at neighborhood level, as previously documented [61], some neighborhoods seem more obesogenic than others, especially low SES and low residential density neighborhoods. Socio-spatial disparities in obesity prevalence at census-tract level have been previously documented with lower prevalence in neighborhoods with high median home values [61]. Low SES neighborhoods have been shown to have less supportive environmental conditions for active transportation [21]. Moreover, a greater percentage of ‘green neighborhoods with low residential density’ was observed in subgroups with high overweight prevalence. Greener neighborhoods with low residential density may be less supportive of active transport and more oriented towards motorized transport. Use of motorized transportation may be linked to weight gain [62]. Conversely, in high residential density neighborhoods, many destinations are easily accessed since they are located at shorter distances, and parking a car may be more difficult, therefore encouraging active transportation (e.g., walking, cycling, public transport) [21]. Thus, adults living in neighborhoods unsupportive of physical activity and far away from destinations may be more likely to remain indoors and watch TV.

This study has several strengths: a relatively large sample size, assessment of a number of lifestyle-related variables using standardized procedures, a survey performed in different geographical areas across Europe, and the use of a nonparametric method (CART) providing a visual representation of lifestyle-related behavior inter-relationships. This study also has some limitations; caution is thus needed when interpreting and generalizing the results. Due to its cross-sectional nature, temporal relations between overweight and lifestyle behaviors cannot be assessed. As data were self-reported, potential (recall) bias, and possible underestimation or overestimation of variables (e.g., weight/height [63], sedentary behaviors [29, 64], physical activity [64–66]) cannot be excluded. Although behaviors such as eating habits were not recorded in enough detail to assess the role of more detailed dietary aspects, such as macronutrient intake, many aspects of lifestyle currently thought to be associated with body weight were covered (sedentary behavior, physical activity, eating habits, alcohol consumption, smoking status, and sleep duration). The CART method is data-driven, and the misclassification error was about 35 %. In the literature, it is not uncommon to report a misclassification error around 30 % and this might be higher for health promotion-based intervention strategies [67].

Conclusions

Low levels of TV viewing, non-smoking, high leisure time physical activity, high vegetable consumption, and longer sleep duration were identified as components of a healthy lifestyle associated with decreased risk of excess weight in adults. The results specifically point to the importance of sedentary habits as a key component to focus on when addressing the multiple factors associated with excess weight in preventive interventions.