Background

Physical activity (PA) is an important health behaviour that is linked to cardiometabolic disease risk, including obesity [1]. Sedentary behaviour (SB), defined as any low-energy-expenditure activity (≤1.5 MET) in a sitting or reclining posture (e.g. computer use, watching television, driving a car) [2], has been linked to health risks, even among people who engage in some PA [3-5]. Research to date has treated PA and SB as distinct entities, although they are part of the same continuum and the above studies [3-5] highlighted some complex interrelationships between them.

In the literature, prediction equations for PA and SB have largely been guided by a key assumption that individuals engage in PA and SB at two discrete time points, and that when they choose to do one they do not think of the other. In other words, the two behaviours are independent of each other. This has led many analysts to consider single-equation models, for instance regressing time spent on PA and on SB separately on potential correlates [6]. As researchers do not always observe variables that may link the decisions to engage in PA and SB, such as environmental factors (e.g. neighbourhood safety), accounting only for observables and studying one behaviour at a time while ignoring the other may be erroneous. This assumption of independence may be limiting because one could argue that individuals’ allocation of time across activities is optimised subject to their satisfaction space and available time. People could know a priori the possible scenarios in which they want to do various activities and gain a defined level of satisfaction. Prediction equations for PA and SB may therefore be interrelated to some extent, and a failure to account for such inter-correlation could lead to inefficient estimates because it does not make optimal use of all available information [7]. Our earlier empirical work using population-based accelerometry data showed a low-to-moderate correlation (rho = −0.28) between moderate to vigorous PA and SB [8], suggesting that this area merits further empirical testing.

In the present study, we empirically tested the interdependent nature of PA and SB. We compared two modelling frameworks, namely independent equations (the current approach in the literature) and joint estimators (a promising and novel, but as yet unexplored, approach), using objectively assessed PA and SB.

Methods

Data

The data source was the 2008 Health Survey for England (HSE08), which had a special focus on PA and SB. HSE08 was a cross-sectional survey that drew a roughly nationally representative sample of people residing in private households in England. The sampling was based on a multi-stage stratified random sampling design that used the postcode address file as a sampling frame. A randomly selected sample of 4,507 adults (aged 16 years and over) wore the accelerometer (Actigraph model GT1M, Pensacola, FL, USA). Respondents were asked to wear the Actigraph at the waist during waking hours for 7 consecutive days. A day of wear was considered ‘valid’ if the Actigraph was worn for at least 10 hours. Kinesoft software (3.0.98) was used to analyse the raw accelerometry data to generate standardised measures. Further details on the survey and the use of accelerometers in the HSE 2008 can be found elsewhere [9].

As in our previous work [8], moderate to vigorous physical activity (MVPA) was defined as minutes with an accelerometer count of ≥2020 counts/minute, and SB was defined as the number of daily minutes with an accelerometer count of <200 counts/minute. We have previously found that, in this dataset, using a different cut-off point for sedentary time (e.g. 100 counts/minute) makes virtually no difference to analyses of SB [8].
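The cut-points and the valid-day rule described above can be illustrated with a minimal sketch. It is not the Kinesoft processing pipeline: the function name `summarise_day` and the boolean wear flag `worn` are hypothetical, and how non-wear time is detected is assumed to be handled upstream.

```python
import numpy as np

MVPA_CUT = 2020          # counts/minute threshold for MVPA (from the text)
SB_CUT = 200             # counts/minute threshold for SB (from the text)
MIN_WEAR_MINUTES = 600   # 10 hours of wear for a valid day (from the text)


def summarise_day(counts, worn):
    """Summarise one day of minute-level accelerometer data.

    counts: 1-D array of activity counts, one value per minute.
    worn:   boolean array of the same length; True where the device was
            worn (hypothetical flag, assumed to be computed elsewhere).
    Returns (mvpa_minutes, sb_minutes, is_valid_day).
    """
    counts = np.asarray(counts)
    worn = np.asarray(worn, dtype=bool)
    wear_minutes = int(worn.sum())
    # Only minutes of wear time contribute to either behaviour.
    mvpa_minutes = int(np.sum(worn & (counts >= MVPA_CUT)))
    sb_minutes = int(np.sum(worn & (counts < SB_CUT)))
    return mvpa_minutes, sb_minutes, wear_minutes >= MIN_WEAR_MINUTES
```

Restricting both sums to wear time matters because non-wear minutes register zero counts and would otherwise be counted as sedentary.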

The explanatory factors we included are variables that are commonly correlated with PA and SB [10, 11]. These included socio-demographic variables (age, income, sex, education, ethnicity, marital status, employment status, access to a vehicle), household characteristics (number of children and adults in the household), health indicators (having a limiting long-standing illness), health behaviours (smoking and alcohol drinking status) and the season of the year in which participants were measured. At the area level, the included correlates were region of residence (strategic health authority) and urbanisation.

Statistical analysis

Two multivariable modelling frameworks were used to estimate the factors that are associated with MVPA and SB. First, following the current empirical approach in the literature, separate OLS models were fitted for MVPA and SB, assuming that MVPA and SB are independent. The estimates from this approach served as a baseline against which to test the joint nature of the two lifestyle choices. The assumption underlying this approach was that the error terms of the two equations were not correlated [7]. The second approach fitted a seemingly unrelated regression (SUR) model, which simultaneously estimates the equations for MVPA and SB. This estimator accounts for the correlation between the two equations. In practice, the seemingly unrelated model is estimated using two joint linear models that describe an individual’s participation in MVPA (Y1) and SB (Y2) respectively:

\[ Y_1 = X_1 \beta_1 + \epsilon_1 \tag{1} \]
\[ Y_2 = X_2 \beta_2 + \epsilon_2 \tag{2} \]

where X1 and X2 are the regressors for MVPA and SB respectively. X1 ≠ X2, and we also require T > Ki (where T = total observations; Ki = total regressors in equation i). The disturbances are collected as:

\[ \epsilon = (\epsilon_1, \epsilon_2) \]

Regressors are assumed to be strictly exogenous:

\[ E[\epsilon \mid X_1, X_2] = 0 \]

For any given equation, the disturbance is homoscedastic, and the error terms are uncorrelated across observations but correlated across Eqs. (1) and (2). Therefore:

\[ E[\epsilon_1 \epsilon_2' \mid X_1, X_2] = \sigma_{12} I_T \]
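To make the two-equation system concrete, the following is a minimal sketch of a one-step feasible GLS estimator for a two-equation SUR. It is illustrative only and is not the Stata routine used in this study: the function names `ols` and `sur_fgls` are hypothetical, and sampling weights and household clustering are deliberately left out.

```python
import numpy as np


def ols(y, X):
    """Ordinary least squares coefficients for a single equation."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


def sur_fgls(y1, X1, y2, X2):
    """One-step feasible GLS for a two-equation SUR system.

    Returns (beta1, beta2, sigma, cov), where sigma is the estimated 2x2
    contemporaneous covariance of the disturbances and cov is the
    covariance matrix of the stacked coefficient vector.
    """
    T = len(y1)
    k1 = X1.shape[1]
    # Step 1: equation-by-equation OLS residuals.
    e1 = y1 - X1 @ ols(y1, X1)
    e2 = y2 - X2 @ ols(y2, X2)
    E = np.column_stack([e1, e2])
    # Step 2: estimate the cross-equation covariance of the disturbances.
    sigma = E.T @ E / T
    s = np.linalg.inv(sigma)
    # Step 3: GLS normal equations for the stacked system, written block
    # by block so the (2T x 2T) weighting matrix is never formed.
    A = np.block([
        [s[0, 0] * X1.T @ X1, s[0, 1] * X1.T @ X2],
        [s[1, 0] * X2.T @ X1, s[1, 1] * X2.T @ X2],
    ])
    c = np.concatenate([
        s[0, 0] * X1.T @ y1 + s[0, 1] * X1.T @ y2,
        s[1, 0] * X2.T @ y1 + s[1, 1] * X2.T @ y2,
    ])
    cov = np.linalg.inv(A)
    beta = cov @ c
    return beta[:k1], beta[k1:], sigma, cov
```

When both equations share exactly the same regressors, or when the estimated cross-equation covariance is zero, this estimator reduces to equation-by-equation OLS, which is why the efficiency gain depends on X1 ≠ X2 and on correlated disturbances.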

The SUR and OLS estimators were compared using a number of indicators. Using the Breusch-Pagan test for diagonality, we examined whether SUR leads to an efficiency gain compared with OLS. This test produces an LM statistic based on the sum of the squared correlations between the residual vectors of the equations (for MVPA and SB), with a null hypothesis of diagonality, i.e. zero contemporaneous covariance between the disturbances of the two equations [12]. The size of the standard errors of the estimated parameters of both equations was also compared in relative terms (specified as: ((StandardErrorOLS − StandardErrorSUR)/StandardErrorSUR) × 100). The Hausman specification test was used to check which estimator produced the more consistent and efficient estimates.
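The two comparison indicators described above can be sketched as follows. This is a hedged illustration rather than the Stata output: the function names are hypothetical, and the LM statistic is written in its textbook two-equation form (number of observations times the squared residual correlation, referred to a chi-square distribution with 1 degree of freedom).

```python
import math


def breusch_pagan_lm(resid1, resid2):
    """Breusch-Pagan LM test of diagonality for a two-equation system.

    resid1, resid2: OLS residuals of the two equations (same length).
    Returns (lm_statistic, p_value).
    """
    T = len(resid1)
    s12 = sum(a * b for a, b in zip(resid1, resid2))
    s11 = sum(a * a for a in resid1)
    s22 = sum(b * b for b in resid2)
    r = s12 / math.sqrt(s11 * s22)   # cross-equation residual correlation
    lm = T * r * r
    # Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(lm / 2.0))
    return lm, p_value


def relative_se_gain(se_ols, se_sur):
    """Percentage by which the OLS standard error exceeds the SUR one."""
    return (se_ols - se_sur) / se_sur * 100.0
```

A positive value from `relative_se_gain` indicates that the SUR standard error is the smaller of the two.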

The models were estimated with sampling weights, calculated as the inverse of the probability of being a respondent in a household multiplied by the household weight, which accounts for non-responding households [9]. As individuals are nested within households, the models were estimated with clustering on the household identifier to allow the errors of individuals residing in the same household to be correlated.

The PA data were log-transformed (base 10, u = log10) to improve the normality of their distribution. Marginal effects (or elasticities for continuous correlates) were computed to show the relative impact of a correlate on MVPA or SB. In the case of the MVPA equations, marginal effects were expressed as exponentiated values because the geometric means arising from a log-transformed dependent variable (as shown by the marginal effect) have to be converted back to the arithmetic mean for comparison with the original data. The t-test was used to examine whether missing data occurred completely at random. If not, missing values for explanatory variables were treated as separate categories and included in the models in order to avoid biased estimates [13]. The threshold for statistical significance was set at ≤5 % in all analyses. Multiple comparisons were adjusted for using the Bonferroni correction. All analyses were undertaken using Stata version 13.
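As a small worked illustration of the back-transformation and the multiple-comparison adjustment, consider the sketch below. It assumes that the back-transform uses the same base as the log10 transform, which the text does not state explicitly; the function names are hypothetical.

```python
def ratio_from_log10_coef(coef):
    """Back-transform a coefficient from a log10 outcome model.

    Assumption: base 10 matches the transform applied to the outcome.
    The result is a multiplicative effect on the original scale.
    """
    return 10.0 ** coef


def percent_change_from_log10_coef(coef):
    """Express the back-transformed effect as a percentage change."""
    return (ratio_from_log10_coef(coef) - 1.0) * 100.0


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under the Bonferroni correction."""
    return alpha / n_tests
```

Under this reading, a back-transformed value above 1 corresponds to more MVPA relative to the reference category, and a value below 1 to less.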

Results

Description of sample

Descriptive statistics for the variables are presented in Table 1. A total of 2,289 adults had valid accelerometry data and were included in the analyses. On average, people spent 28 minutes/day in MVPA and 472 minutes in SB per valid day. The mean age of the sample was 52 (SD = 18) years. Most were female (55%), married and living with their partners (55%) and employed (54%). Few were obese (BMI ≥ 30 kg/m2) or current smokers (20%), while the large majority were drinkers (89%), defined as drinking alcohol at least once or twice a year.

Table 1 Descriptive statistics of variables

Only three variables (income, smoking status, and drinking status) had missing observations. ‘Income’ had the highest number of missing observations (n = 366), whereas ‘drinking status’ had the lowest (n = 5). Daily MVPA differed between respondents with and without missing values for smoking status (60.33 (SD = 23.41) vs 28.15 (0.53), p value = 0.002) and drinking status (76.60 (SD = 28.18) vs 28.13 (0.53), p value <0.001), but not for income (26.28 (SD = 1.25) vs 28.61 (0.56), p value = 0.107).

Comparison of modelling frameworks

Tables 2 and 3 show the SUR and OLS estimates for MVPA and SB respectively. First, the residuals of the MVPA equation were correlated with those of the SB equation (r = −0.156; P <0.001), with the Breusch-Pagan test for independence of the residual vectors of both equations suggesting that this correlation is unlikely to be due to chance.

Table 2 Estimation results of seemingly unrelated regression model (SUR) and ordinary least squares (OLS): physical activity
Table 3 Estimation results of seemingly unrelated regression model (SUR) and ordinary least squares (OLS): sedentary behaviour

Second, further evidence of efficiency gains via the SUR can be seen by comparing standard errors. The reduction in standard errors achieved via SUR was greater for the estimated parameters of the SB equation than for those of the MVPA equation (see the sixth columns of Tables 2 and 3).

Third, the Hausman specification test suggested that, for SB, the parameters of the SUR model were systematically different from those of the OLS model (Chi2 = 92.75; P <0.001), with the SUR producing the more consistent and efficient estimates. Both models, however, yielded similar coefficients in the case of MVPA. Table 2 shows that, under both SUR and OLS, older individuals, females, and individuals with access to vehicles had lower levels of MVPA. Conversely, individuals who had a non-limiting illness (or no illness) undertook more PA per day. Compared with individuals who drank alcohol almost every day, people who drank on 5 or 6 days/week spent more time undertaking MVPA (ME = 1.415 to 1.438). The correlates of SB differed from those of MVPA in terms of the type of correlate as well as the direction of correlation (where the same correlates were found across both behaviours, except for gender). As shown in Table 3, SB was positively correlated with age and educational qualification. Females, individuals in better health, and smokers spent less time in SB.
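For reference, the Hausman comparison reported above is conventionally based on the following statistic. This is the general textbook form, given here as an assumption about the test's construction; the notation (subscript c for the estimator that is consistent under both hypotheses, subscript e for the estimator that is efficient under the null) is introduced for illustration and is not taken from the study.

```latex
\[
H = (\hat{\beta}_c - \hat{\beta}_e)'
    \left[ \widehat{V}(\hat{\beta}_c) - \widehat{V}(\hat{\beta}_e) \right]^{-1}
    (\hat{\beta}_c - \hat{\beta}_e)
\]
```

Under the null hypothesis that the two sets of coefficients do not differ systematically, H is asymptotically chi-square distributed with degrees of freedom equal to the number of coefficients compared, so a large value indicates a systematic difference between the two estimators.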

Discussion

This is the first study, to our knowledge, to conduct an empirical test as to which of the two conceptual constructs (i.e. whether PA and SB are jointly determined or independent) is likely to be the more efficient modelling framework. A search of SCOPUS (the largest bibliographic database) and PUBMED, conducted in May 2014, located no applications of joint estimators for PA and SB, although two previous studies have applied SUR to PA and sport or diet [14, 15]. The findings of this study showed a contemporaneous correlation between the error terms of the equations for SB and PA, suggesting that the two movement behaviours are interdependent and supporting the utility of the SUR, particularly for SB (albeit with small efficiency gains).

In addition, the SUR provided less uncertain coefficients than the OLS estimator, especially in the case of SB. Joint estimators were therefore found in this study to be a more efficient modelling framework than the current approach in the literature (single estimators). Notably, had there been no evidence of a contemporaneous correlation between the SB and MVPA equations, the SUR would have been equivalent to equation-by-equation OLS. Nonetheless, the SUR would still have been worthwhile because it has the attractive feature of allowing restrictions to be imposed, and appropriate tests conducted, across parameters in the different equations. For example, joint significance tests for variables that are common to both equations could be undertaken.

Our analysis is not free from limitations, but we judge that their implications for our conclusions, if any, are minimal. For example, the MVPA and SB equations were estimated on unequal numbers of observations (the former had 20 fewer observations). As noted by McDowell [16], fitting a SUR on equations with differing numbers of observations could lead to a loss of information because observations that are not available for both equations are discarded, potentially leading to more uncertain estimates. In our case, because the excluded observations were not systematically different from the included ones and fitting the equations on the full set of imputed observations resulted in similar findings, it is unlikely that the current conclusions would change if actual information were available on those 20 observations.

Notwithstanding this limitation, the findings from this study offer an important consideration for future research on SB and PA. The key message is that, to achieve more efficient and tighter estimates, analyses of the correlates of PA and, in particular, SB ought to use simultaneous joint equations that account for the intercorrelation between PA and SB. Further research using different samples is, however, required to fully understand the magnitude of the efficiency gains accruable from using joint estimators. Only by examining this will we be in a position to accurately determine the importance of joint estimators in this area.

Conclusion

This is the first study to examine the efficiency gains accruable to joint estimators (e.g. SUR-based analysis) in modelling the factors associated with PA and SB. However, it would be important to replicate these results in other datasets in order to provide firmer conclusions on the most appropriate modelling framework for analysing PA and SB.