It is recognised that people’s health is patterned by individual characteristics and also by area characteristics. There remains, however, debate as to whether people’s health behaviours and health outcomes are influenced by the social and physical environments of the place in which they live (Macintyre et al. 1993) or whether the different health outcomes and health behaviours observed across areas merely reflect the concentration of people living within those areas (Sloggett and Joshi 1994). Influences of the first kind, arising from the characteristics of the environment, are usually termed contextual effects; those of the second kind, arising from the characteristics of people within areas and the consequent concentration of these characteristics, are called compositional effects. Multilevel modelling presents a natural way of disentangling compositional and contextual effects and of determining their relative importance. This is easily generalised to other settings, such as hospitals, where we could think of a contextual effect as the influence of a hospital quality system on patient outcomes, and compositional effects might include the concentration of people with a particular stage of the disease. This chapter considers how multilevel modelling can be used to disentangle individual and contextual influences on individual health.

Context or Composition?

To be clear about what is meant by contextual and compositional effects, we can refer to the definitions provided in Diez-Roux’s (2002) glossary for multilevel analysis:

COMPOSITIONAL EFFECTS

When inter-group (or inter-context) differences in an outcome (for example, disease rates) are attributable to differences in group composition (that is, in the characteristics of the individuals of which the groups are comprised) they are said to result from compositional effects.

CONTEXTUAL EFFECTS

Term generally used to refer to the effects of variables defined at a higher level (usually at the group level) on outcomes defined at a lower level (usually at the individual level) after controlling for relevant individual level (lower level) confounders.

It is important that we should consider the meaning of any contextual variables in an analysis. Once an individual variable is aggregated to a context (e.g. by taking the mean), then its interpretation may change. For example,

mean neighborhood income may provide information that is not captured by individual-level income. The mean income of a neighborhood may be a marker for neighborhood-level factors potentially related to health (such as recreational facilities, school quality, road conditions, environmental conditions, and the types of foods that are available), and these factors may affect everyone in the community regardless of individual-level income. Similarly, community unemployment levels may affect all individuals living within a community, regardless of whether or not they are unemployed. (Diez-Roux 1998)

However, the distinction between compositional and contextual characteristics may not be straightforward since individuals may be constrained by their environment. Macintyre and Ellaway explain this idea:

Occupation may be determined by the local labor market; housing tenure by the local housing market; education by the available educational system and local provision; income by the prevailing labor market conditions; and car ownership by the density of population, distance to facilities, and local transport networks. Hence, rather than seeing [these] as properties of individuals, we could … see them as features of the local environment. (Macintyre and Ellaway 2003)

The fact that people with certain characteristics are concentrated in the same neighbourhoods is related to neighbourhood processes, such as selective migration and retention. These neighbourhood processes may be related to the outcome of interest, for example self-rated health, in a direct (less healthy people staying in the area) and indirect way (people with higher education or income moving out, but income and education in part determining individual health). Such neighbourhood processes are important for the interpretation of results of our analyses and pose interesting research questions in themselves (see Sampson 2012).

Using Multilevel Modelling to Investigate Compositional and Contextual Effects

We can illustrate the ways in which MLA makes it possible to investigate compositional and contextual impacts on health using an example based on an investigation of the influences of individual and neighbourhood social capital on self-rated health. This study used data from the Dutch Housing and Living Survey (Mohnen 2012; Mohnen et al. 2015). For this example, we concentrate on one measure of individual social capital that was used: whether or not the respondents had contact (including by telephone) at least weekly with friends, people whom they knew very well or family members (who did not live in the same household). The authors also created a neighbourhood social capital score using ecometric techniques (see Chap. 8) based on respondent views as to whether people in the neighbourhood knew each other, whether neighbours were nice to each other and whether there was a friendly and sociable atmosphere in the neighbourhood. (In this study, the neighbourhoods comprised on average 2500–3000 addresses and about 4000 residents. The total analytic sample of 53,260 lived in 3273 neighbourhoods, giving an average of 16.3 respondents per area.) Individual social capital is therefore a dichotomous variable (72.7% reported at least weekly contact with friends and family, subsequently referred to as high individual social capital) whilst neighbourhood social capital is a score ranging from −0.78 to 0.46 (mean = −0.10, standard deviation = 0.20). Positive scores indicate greater social capital. Self-perceived health is dichotomised with 79.0% rating their health as good or better; as such, multilevel logistic modelling is appropriate for these data.

We can investigate the importance of the compositional (individual) and contextual (area) social capital on good or better self-rated health by comparing the following series of random intercept models:

Model Description
M0 Null model
M1 Individual social capital
M2 Neighbourhood social capital
M3 Individual and neighbourhood social capital
M4 Individual and neighbourhood social capital and their interaction

All models adjust for a range of individual socio-demographic confounders: sex, age, ethnic background, education, employment, income, home ownership and length of residence. Furthermore, all models include three neighbourhood variables: the proportion of respondents with income in the lowest twenty percent, an average measure of perceived home maintenance and urban density. So the null model M0 above is not empty; it consists of the eight individual and three neighbourhood variables listed above (as do all of the other models), but coefficients of these are not relevant to our interest in the relative importance of individual and area social capital. For each model, we will examine the interpretation of the effects of interest by plotting the predicted log odds. This series of models is a good way of disentangling context and composition (more on developing a modelling strategy can be found in Chap. 9).

Table 7.1 presents the estimated coefficients for the social capital variables for each of the models. The following sections interpret these coefficients and detail the implications of the specified model for the association of individual and area social capital with self-rated health.

Table 7.1 Coefficients (log odds ratios) exploring associations between social capital and good or better self-rated health for models M0 (null), M1 (individual), M2 (area), M3 (individual and area) and M4 (individual, area and interaction) (Mohnen 2012)

Model M0: Null Model

Since we are not interested in other covariates, we omit them and describe this model algebraically as

$$ {\displaystyle \begin{array}{c}{y}_{ij}\sim \mathrm{Binomial}\left(1,{\pi}_{ij}\right)\\ {}\mathrm{logit}\left({\pi}_{ij}\right)=\log \left(\frac{\pi_{ij}}{1-{\pi}_{ij}}\right)={\beta}_0+{u}_{0j}\end{array}} $$
(7.1)

The logit of the probability of reporting good or better self-rated health for individual i in neighbourhood j is modelled using a mean or intercept and a random effect for each area. The estimate of β0 is the estimated log odds of good health for an individual living in the average area, conditional on having certain baseline characteristics of both individual and area. (The exact characteristics depend on the precise coding of variables and how age is centred, etc., but these are not of interest to our substantive research question regarding the relationship between social capital and health.) The estimates from this model are plotted in Fig. 7.1. Figure 7.1a plots the predicted log odds of good or better health separately for those with high (solid grey line) and low (solid black line) individual social capital across the observed range of values of area social capital on the horizontal axis. In this instance, the lines coincide since we have not included a term differentiating between high and low individual social capital in model M0, and the lines are flat since there is no effect of neighbourhood social capital (again this is not included in M0). Figure 7.1b plots the predicted log odds of good or better health separately for areas with high (solid grey line), average (dotted black line) and low (solid black line) social capital across individual social capital on the horizontal axis. (We have used areas with a social capital score of 0.23, −0.10 and −0.43 to indicate high, average and low social capital, respectively.) Again all three lines overlap because there is no term in M0 denoting area social capital, and the lines are flat because there is no difference in the estimated log odds of good or better health between those with high or low individual social capital.

Fig. 7.1 Predicted log odds of good or better health obtained under model M0 (null model) across (a) area and (b) individual social capital
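As a reminder of the scale on which these predictions are plotted, the following minimal Python sketch converts between probabilities and log odds. The function names are our own, and the value 0.79 is simply the crude proportion reporting good or better health quoted earlier. The resulting crude log odds should not be read as the estimate of β0, which is conditional on the covariates in the model and on living in the average area.

```python
import math


def logit(p: float) -> float:
    """Log odds corresponding to a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))


def inv_logit(eta: float) -> float:
    """Probability corresponding to a log odds value eta."""
    return 1.0 / (1.0 + math.exp(-eta))


# Crude proportion rating their health as good or better (from the text).
p_good = 0.79
eta = logit(p_good)

print(round(eta, 3))             # crude log odds of good or better health
print(round(inv_logit(eta), 2))  # back-transforms to 0.79
```

A flat line on the log odds scale, as in Fig. 7.1, therefore corresponds to a constant predicted probability.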

Model M1: Individual Social Capital

This time our model includes individual social capital, x1ij, and its associated parameter estimate β1:

$$ \mathrm{logit}\left({\pi}_{ij}\right)={\beta}_0+{\beta}_1{x}_{1 ij}+{u}_{0j} $$
(7.2)

Parameter estimates from this model are used to create Fig. 7.2. From Fig. 7.2a we can see that those with high individual social capital are now more likely to report being in good health (or better) than those with low individual social capital. Since neighbourhood social capital is not included in M1, the predicted log odds of good health are constant regardless of the extent of neighbourhood social capital. Figure 7.2b illustrates this another way; we are unable to distinguish between areas with high, average or low neighbourhood social capital (the lines lie on top of each other) but, regardless of the extent of neighbourhood social capital, respondents with high individual social capital are more likely to report good health than those with low individual social capital.

Fig. 7.2 Predicted log odds of good or better health obtained under model M1 (containing individual social capital only) across (a) area and (b) individual social capital

Model M2: Neighbourhood Social Capital

Our model this time includes neighbourhood social capital, x2j, and its associated parameter estimate β2:

$$ \mathrm{logit}\left({\pi}_{ij}\right)={\beta}_0+{\beta}_2{x}_{2j}+{u}_{0j} $$
(7.3)

Parameter estimates this time have been used to create Fig. 7.3. Figure 7.3a shows that there are no differences between those with high or low individual social capital since individual social capital is not included in Eq. (7.3). What we do see, regardless of individual social capital, is a gradient corresponding to area social capital; respondents living in areas with high social capital are more likely to report being in good health than those living in areas with average social capital who are, in turn, more likely to report being in good health than those living in areas with low social capital. Figure 7.3b shows again no difference between individuals with high or low individual social capital; there is a distinction in the likelihood of reporting being in good health that is dependent on the social capital of the area of residence but which is not affected by individual social capital as this is not included in Eq. (7.3).

Fig. 7.3 Predicted log odds of good or better health obtained under model M2 (containing area social capital only) across (a) area and (b) individual social capital

It is worth noting at this stage that in terms of Diez-Roux’s definition we could argue that in this case area social capital (x2j) is not strictly a contextual variable (Diez-Roux 2002) since there is an important individual-level confounder missing from Eq. (7.3), namely individual social capital. It is possible that the relationships discovered in model M2 and described in Fig. 7.3 reflect a relationship between individual social capital and health combined with a tendency for those with high (low) individual social capital to cluster in neighbourhoods which therefore have high (low) area social capital. We can explore this in models M3 and M4 when both individual and area social capital are included in the same model. In general, it is important to ensure that the lowest level in a model is as complete as possible when we are interested in contextual effects to ensure that we are interpreting these appropriately and not incorrectly assigning individual characteristics, for which we have not fully controlled, to the area level.

Model M3: Individual and Neighbourhood Social Capital

This time the model is expanded to include both individual and neighbourhood social capital:

$$ \mathrm{logit}\left({\pi}_{ij}\right)={\beta}_0+{\beta}_1{x}_{1 ij}+{\beta}_2{x}_{2j}+{u}_{0j} $$
(7.4)

The parameter estimates from this model are used to plot the predicted log odds of good or better health in Fig. 7.4. It is clear that the likelihood of reporting good health increases as area social capital increases, but individuals with weekly contact with friends and family were also more likely to report good health. The two effects are independent (there is no interaction included in M3); the predicted difference between people with high and low individual social capital is the same (on the log odds scale) regardless of the area social capital. This is reflected in the lines in Fig. 7.4a being parallel. Similarly, the fact that the lines in Fig. 7.4b are parallel indicates that the impact of area social capital is the same regardless of whether an individual is classified as having high or low individual social capital.

Fig. 7.4 Predicted log odds of good or better health obtained under model M3 (containing individual and area social capital) across (a) area and (b) individual social capital

Model M4: Individual and Neighbourhood Social Capital and Their Interaction

Model M4 develops M3 by including the interaction between individual and neighbourhood social capital:

$$ \mathrm{logit}\left({\pi}_{ij}\right)={\beta}_0+{\beta}_1{x}_{1 ij}+{\beta}_2{x}_{2j}+{\beta}_3{x}_{1 ij}{x}_{2j}+{u}_{0j} $$
(7.5)

The inclusion of the interaction term between individual and area social capital (x1ijx2j in Eq. (7.5)) and its associated parameter estimate β3 means that the assumption of independence of the compositional and contextual effects has been dropped. Figure 7.5 illustrates the impact of this on the predicted log odds of good health or better. Whilst it is still clear that there is a gradient across area social capital, with the probability of reporting good health rising as area social capital increases, from Fig. 7.5a we can see that the gradient is stronger (i.e. the impact of area social capital is more pronounced) for those with low individual social capital than for those with high individual social capital. Figure 7.5b suggests that individual social capital has a greater impact on self-reported health in low social capital areas than in average social capital areas, and more in average social capital areas than in high social capital areas. Despite this, people in high social capital areas tend to report better health than those in average or low social capital areas irrespective of their individual social capital. Note that the presence of the interaction means that the lines in Fig. 7.5 are no longer parallel; the magnitude of the individual effect (the distance between the lines) depends on the context, and the magnitude of the contextual effect depends on individual circumstances.

Fig. 7.5 Predicted log odds of good or better health obtained under model M4 (containing individual and area social capital and their interaction) across (a) area and (b) individual social capital
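The way in which the fixed part of each model shapes these plots can be sketched in a few lines of Python. The coefficients below are hypothetical values chosen only to mimic the qualitative patterns described for models M0 to M4; they are not the estimates reported in Table 7.1, and the class and function names are our own. The three area scores are those used in the text to represent low, average and high neighbourhood social capital.

```python
from dataclasses import dataclass


@dataclass
class Coefs:
    """Fixed-part coefficients of Eq. (7.5); terms not in a model stay at zero."""
    b0: float          # intercept
    b1: float = 0.0    # individual social capital (1 = high, 0 = low)
    b2: float = 0.0    # neighbourhood social capital score
    b3: float = 0.0    # cross-level interaction


def log_odds(c: Coefs, x1: int, x2: float, u0: float = 0.0) -> float:
    """Predicted log odds of good or better health (u0 = 0 is the average area)."""
    return c.b0 + c.b1 * x1 + c.b2 * x2 + c.b3 * x1 * x2 + u0


def individual_gap(c: Coefs, x2: float) -> float:
    """Difference in log odds between high and low individual social capital."""
    return log_odds(c, 1, x2) - log_odds(c, 0, x2)


# HYPOTHETICAL coefficients for illustration only (not those in Table 7.1).
models = {
    "M0": Coefs(b0=1.30),
    "M1": Coefs(b0=1.20, b1=0.15),
    "M2": Coefs(b0=1.40, b2=1.00),
    "M3": Coefs(b0=1.30, b1=0.15, b2=1.00),
    "M4": Coefs(b0=1.30, b1=0.15, b2=1.20, b3=-0.60),
}

# Area scores used in the text for low, average and high social capital.
areas = {"low": -0.43, "average": -0.10, "high": 0.23}

for name, c in models.items():
    gaps = {label: round(individual_gap(c, x2), 3) for label, x2 in areas.items()}
    print(name, gaps)
```

Under these assumed values the gap between high and low individual social capital is zero in M0 and M2, constant across areas in M1 and M3 (parallel lines) and largest in low social capital areas in M4 (non-parallel lines).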

Random Slopes and Cross-Level Interactions

A quick comparison of the illustrations in Fig. 7.4 (parallel lines) and Fig. 7.5 (in which the lines are no longer parallel) brings to mind the comparison between the models for random intercepts and random slopes in Fig. 5.5. The same principle applies: if the lines are not parallel, then this indicates that the relationship between an individual variable and the outcome varies between contexts. In a random slopes model, we do not know the reason for the relationship varying between contexts, just the fact that this variation exists. In the example used in the previous section relating to individual and area social capital, the authors could have tested for the existence of a random slope by expanding model M3 to enable the coefficient of individual social capital x1ij to vary between neighbourhoods (let us call this model M3A).

$$ \mathrm{logit}\left({\pi}_{ij}\right)={\beta}_0+{\beta}_1{x}_{1 ij}+{\beta}_2{x}_{2j}+{u}_{0j}+{u}_{1j}{x}_{1 ij} $$
(7.6)

The coefficient of individual social capital is now given by (β1 + u1j). This varies between contexts but not in a way that is determined by known area characteristics. For each neighbourhood j, we would estimate a slope residual u1j which would determine the nature of the relationship between individual social capital and health in that neighbourhood. With a cross-level interaction, we are able to describe the contextual circumstances associated with this relationship. From Eq. (7.5) we can see that the coefficient of x1ij is given by (β1 + β3x2j); this again varies between contexts but this time in a predictable way. (We saw from Fig. 7.5 that the impact of individual social capital was more pronounced in areas with low social capital.) In this way, it is possible to use random slope models as a means of hypothesis generation in exploratory analyses. Inspection of the values of the slope residuals u1j may reveal an apparent association with a known contextual factor. A cross-level interaction is generally to be preferred to a random slope since the former provides a means to describe how relationships differ between contexts (thus providing the potential for an explanation of the mechanism) rather than simply noting that such variation exists.
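The use of slope residuals for hypothesis generation can be sketched as follows. All numbers are hypothetical: we assume that a random slopes model such as M3A has already been fitted and has returned a slope residual u1j for each of six neighbourhoods, and we simply ask whether these residuals appear to be related to the neighbourhood social capital score x2j. The helper functions are our own.

```python
import math

# HYPOTHETICAL values for six neighbourhoods (not taken from the study).
beta1 = 0.30                                    # average effect of individual social capital
x2 = [-0.43, -0.30, -0.10, 0.00, 0.10, 0.23]    # neighbourhood social capital scores
u1 = [0.20, 0.14, 0.02, -0.03, -0.08, -0.16]    # slope residuals from a model like M3A

# Neighbourhood-specific coefficients of individual social capital, beta1 + u1j.
neighbourhood_slopes = [beta1 + u for u in u1]


def mean(values):
    return sum(values) / len(values)


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    sab = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    saa = sum((p - ma) ** 2 for p in a)
    sbb = sum((q - mb) ** 2 for q in b)
    return sab / math.sqrt(saa * sbb)


def ols_slope(x, y):
    """Least-squares slope of y on x."""
    mx, my = mean(x), mean(y)
    return sum((p - mx) * (q - my) for p, q in zip(x, y)) / sum((p - mx) ** 2 for p in x)


r = pearson(x2, u1)
beta3_guess = ols_slope(x2, u1)   # rough indication of the sign and size of beta3
print(round(r, 2), round(beta3_guess, 2))
```

A strong association of this kind would suggest replacing the random slope by the cross-level interaction of Eq. (7.5), in which the coefficient of individual social capital becomes β1 + β3x2j. The interaction should then be estimated within the multilevel model itself rather than from the residuals, which are shrunken estimates.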

Impact of Compositional and Contextual Variables on the Variances

We have emphasised the important information that can be conveyed by the variances at different levels in a multilevel model. It is also worth reflecting on changes to the variances that occur during the modelling process.

When any variable is added to a multilevel model, as with an ordinary least squares (single level) regression model, we would expect to see a reduction in the total variance, since the additional term explains some of the variability in outcomes. When compositional characteristics are added, we may see a reduction in the variance at any level of the model; patient characteristics, for example, may explain some of the differences between hospitals in patient outcomes. A hospital serving an elderly community may achieve worse patient outcomes than average solely because its patients are older than those seen in other hospitals. And of course we would expect individual characteristics to explain some of the differences in outcomes between individuals.

The situation is slightly different when we consider contextual variables. Whilst the addition of a contextual variable will still produce a reduction in the total unexplained variance (or no change in the total variance if the variable is not related to the outcome), a variable describing contexts cannot explain variation within those contexts. If we consider the impact of individual and neighbourhood income on self-reported health, then individual income could account for some of the variations between individuals within neighbourhoods as well as between the neighbourhoods themselves, whilst mean neighbourhood income could only explain some of the variation between neighbourhoods. Mean neighbourhood income does not differ between individuals in the same neighbourhood and therefore cannot explain differences in individual outcomes within neighbourhoods.

We should note here that a cross-level interaction between a level 1 and a level 2 variable will behave like a level 1 variable. In the above example, the interaction between individual and area income will vary between individuals living in the same neighbourhood and so may explain part of the variation within neighbourhoods.
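The reason for this difference in behaviour can be seen by looking at how much each type of variable varies within neighbourhoods. The short Python sketch below uses hypothetical incomes for three residents in each of two neighbourhoods and compares the within-neighbourhood variance of individual income, mean neighbourhood income and their product. The data and function names are illustrative assumptions.

```python
from statistics import fmean, pvariance

# HYPOTHETICAL incomes (in thousands) for three residents in each of two neighbourhoods.
income = {"A": [18.0, 24.0, 30.0], "B": [30.0, 42.0, 54.0]}


def pooled_within_variance(values_by_area):
    """Average of the within-area population variances (areas are equal-sized here)."""
    return fmean([pvariance(v) for v in values_by_area.values()])


# Contextual variable: every resident is assigned the mean income of their area.
contextual = {a: [fmean(v)] * len(v) for a, v in income.items()}

# Cross-level interaction: individual income multiplied by area mean income.
interaction = {a: [x * fmean(v) for x in v] for a, v in income.items()}

print(pooled_within_variance(income))       # positive: varies within areas
print(pooled_within_variance(contextual))   # zero: constant within each area
print(pooled_within_variance(interaction))  # positive: behaves like a level 1 variable
```

Only variables with some within-neighbourhood variation can account for differences in outcome between individuals living in the same neighbourhood.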

Although the addition of a variable defined at a certain level should reduce the total variance, and the variance in the outcome attributed to that level, there may be circumstances under which the addition of a variable may increase the variance at higher levels. For example, the addition of a compositional variable (such as the patient’s age) may increase the variance between hospitals whilst decreasing the total (hospital plus patient) variance. There are three possible reasons for this phenomenon which we outline below.

Firstly, we should note that we are dealing with estimates, and there is uncertainty around these estimates. This is particularly true in the random part of the model and particularly at higher levels where there are fewer observations. So when noting a small increase in a high-level variance following the addition of a compositional characteristic, it is worth considering whether such an increase is important or whether this may reflect a lack of precision in the estimated variances. Certainly if the total variance appears to increase following the addition of a variable, this can only be due to imprecision in the estimates.

Secondly, there may be a genuine increase in the variance between contexts following the addition of a compositional characteristic. In these circumstances, the omission of a compositional variable in effect masks existing variation between contexts. An unadjusted analysis of patient outcomes may show little variability between hospitals when the patient’s age is ignored. However, if outcomes deteriorate with increasing age, then the inclusion of individual age within a multilevel model may increase the variance between hospitals as those hospitals with a greater proportion of elderly patients are in fact performing better than average, given the age of their patients, and those with a smaller proportion of elderly patients are actually performing worse than would be expected. An example of this is given by Aakvik et al. (2010) who consider the contributions of patient, GP and municipality to certified sickness absence. They find that upon the addition of patient-level covariates to a null model, the total variance in the number of days of sick leave for females decreases from 5828 to 5650. However, they indicate an increase in the variance attributable to the GPs from 46.7 to 47.4.
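The masking of between-hospital variation by patient age can be illustrated with a deliberately simple numerical example. The figures below are hypothetical, and the sketch uses direct standardisation to a common age mix rather than a multilevel model; it is intended only to show the mechanism, not to reproduce the analysis of Aakvik et al.

```python
# HYPOTHETICAL data: (hospital, age group, number of patients, mean outcome score),
# where a higher score is a better outcome and outcomes deteriorate with age.
cells = [
    ("A", "older", 75, 65.0),
    ("A", "younger", 25, 85.0),
    ("B", "older", 25, 55.0),
    ("B", "younger", 75, 75.0),
]


def crude_mean(hospital):
    """Mean outcome for a hospital, ignoring patient age."""
    rows = [(n, y) for h, _, n, y in cells if h == hospital]
    return sum(n * y for n, y in rows) / sum(n for n, _ in rows)


def standardised_mean(hospital):
    """Mean outcome if the hospital treated the pooled age mix of all patients."""
    total = sum(n for _, _, n, _ in cells)
    weights = {}
    for _, age, n, _ in cells:
        weights[age] = weights.get(age, 0.0) + n / total
    return sum(weights[age] * y for h, age, _, y in cells if h == hospital)


print(crude_mean("A"), crude_mean("B"))                # identical crude means
print(standardised_mean("A"), standardised_mean("B"))  # differ once age is accounted for
```

Ignoring age, the two hospitals appear identical; once age is taken into account, hospital A, with the older patient population, is seen to perform better than hospital B. Adjustment for the compositional variable has therefore revealed, rather than explained, variation between the hospitals.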

Finally, multilevel logistic regression is a special case in which the reported variance at the higher level may appear to increase following the addition of a variable measured at a lower level. An explanation as to why this may happen is provided by Snijders and Bosker (2012), but briefly this reflects the link between the variance and the probability of an outcome described in Chap. 6, with the variance of the yij being given by πij(1 − πij) when the outcome follows a binomial distribution. As we saw from Eq. (6.5), the variance partition coefficient in a multilevel logistic regression model can be approximated by

$$ {\rho}_{\mathrm{I}}=\frac{\sigma_{u0}^2}{\sigma_{u0}^2+{\pi}^2/3} $$
(7.7)

If we add a compositional variable to a two-level multilevel logistic regression, then we might reasonably expect to see this explain a greater proportion of the variance within contexts (level 1) than between contexts (level 2), in which case the variance partition coefficient (the proportion of unexplained variance attributable to differences between contexts) should increase. Since \( {\pi}^2/3\approx 3.29 \) is fixed, the only way to increase the variance partition coefficient is to increase the level 2 variance \( {\sigma}_{u0}^2 \). This means that, in a multilevel logistic regression model, \( {\sigma}_{u0}^2 \) can increase even though the variance between level 2 units decreases. Jat et al. (2011) provide such an example in their analysis of maternal health service use in India. They show that the district-level variance associated with the receipt of postnatal care increases from 0.389 in the empty model to 0.480 when a variety of individual, community and district variables are included. As a consequence, the proportion of the unexplained variance associated with the districts increases from 8.5 to 11.1%.
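The rescaling can be sketched numerically. The code below computes the variance partition coefficient of Eq. (7.7) and applies a commonly used latent-variable approximation: if a level 1 covariate with coefficient β and variance var(x) is added, and it explains none of the variation between contexts, then the reported level 2 variance is inflated by a factor of roughly 1 + β² var(x)/(π²/3). Both the approximation and all numerical values are assumptions made for illustration and do not come from the studies cited.

```python
import math

LEVEL1_VAR = math.pi ** 2 / 3   # fixed level 1 variance on the logit scale, about 3.29


def vpc(sigma2_u0: float) -> float:
    """Variance partition coefficient of Eq. (7.7)."""
    return sigma2_u0 / (sigma2_u0 + LEVEL1_VAR)


def rescaled_level2_variance(sigma2_u0_null: float, beta: float, var_x: float) -> float:
    """Approximate level 2 variance reported after adding a level 1 covariate.

    Assumes the covariate explains no between-context variation, so that any
    change in the reported variance is due to rescaling alone.
    """
    return sigma2_u0_null * (1.0 + beta ** 2 * var_x / LEVEL1_VAR)


# HYPOTHETICAL values: null-model variance 0.40; a binary covariate split
# 50:50 (variance 0.25) with a coefficient of 1.0 on the log odds scale.
before = 0.40
after = rescaled_level2_variance(before, beta=1.0, var_x=0.25)

print(round(before, 3), round(vpc(before), 3))
print(round(after, 3), round(vpc(after), 3))
```

Under these assumptions both the level 2 variance and the variance partition coefficient increase although nothing about the differences between contexts has changed.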

Model Specification and Model Interpretation

The exact specification of the model that is fitted can impact on the estimates that are obtained and hence on the interpretation of the model. It is not surprising to find that regression coefficients can differ depending on whether certain terms are included in a regression model or not, but in a multilevel model regression coefficients can also differ depending on the terms that are included in the random part of the model. We will illustrate this with an example.

A reanalysis of 1930 US Census data considered levels of illiteracy by race/nativity (with the population classified into ‘native whites’, ‘foreign-born whites’ and ‘blacks’) and, importantly, whether the relationship between illiteracy and race varied between states (Subramanian et al. 2009). The two models of interest shown in Table 7.2, derived from Table 2 of the original paper, compare a two-level variance components model with a model in which the coefficients for the three racial groups are allowed to vary.

Table 7.2 Odds ratios (OR) and 95% credible intervals (CI) for illiteracy by race/nativity under different models
$$ \mathrm{M}3:\mathrm{logit}\left({\pi}_{ij}\right)={\beta}_0+{\beta}_2{x}_{2 ij}+{\beta}_3{x}_{3 ij}+{u}_{0j} $$
(7.7)
$$ \mathrm{M}4:\mathrm{logit}\left({\pi}_{ij}\right)={\beta}_0+{\beta}_2{x}_{2 ij}+{\beta}_3{x}_{3 ij}+{u}_{1j}{x}_{1 ij}+{u}_{2j}{x}_{2 ij}+{u}_{3j}{x}_{3 ij} $$
(7.8)

The probability of illiteracy πij for racial group i in state j is modelled in terms of three dummy variables indicating race/nativity, x1ij, x2ij and x3ij, denoting ‘native whites’, ‘foreign-born whites’ and ‘blacks’, respectively.

The odds ratio comparing average illiteracy in the ‘foreign-born white’ group with that in the ‘native white’ group decreases from 13.63 (95% CI 13.58–13.67) to 5.71 (95% CI 5.18–6.29) when this coefficient is allowed to vary between states. These are derived from the coefficients β2 in Eqs. (7.7) and (7.8), respectively, and the substantial difference between these odds ratios indicates the dependence of the fixed parameters on the specification of the random part of the model. The substantial reduction in the deviance information criterion (DIC), an indicator of the fit of a model (Spiegelhalter et al. 2002), shown in Table 7.2 suggests that model M4 provides a better fit to the data. In this case inappropriate specification of the random part of the model has a sizable impact on the estimate of illiteracy among the ‘foreign-born white’ group. The reasons for this difference relate to the relationship between the fixed part coefficients and the higher level variance detailed in the section ‘Population Average and Cluster-Specific Estimates’ in Chap. 6.

Sources of Error Affecting the Estimation of Contextual Effects

Blakely and Woodward (2000) identified six limitations in study design and sources of error that affected the estimation of contextual effects. This paper remains relevant, and these limitations should be borne in mind when fitting or interpreting a multilevel model that includes one or more variables at the macro level.

Lack of Variation in the Contextual Variable

The variation present in an individual-level variable will be reduced when aggregated to a contextual level. For example, there will be more variation in individual income than in mean neighbourhood income. Such a reduction in variability between contexts, combined with there often being few contexts (there will certainly be fewer contexts than individuals in a multilevel model), means that there will be less power to detect a contextual effect than there is to detect the effect of an individual-level variable. Given the reduction in the range of values that a contextual variable can have (because of the reduced variability), it is worth bearing in mind that fairly modest contextual effects may be important.
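The loss of variation on aggregation follows from the decomposition of the total variance into between-area and within-area parts. The following Python sketch checks this for hypothetical incomes in three equal-sized neighbourhoods; the data are invented for illustration.

```python
from statistics import fmean, pvariance

# HYPOTHETICAL incomes (in thousands) for four residents in each of three neighbourhoods.
income = {
    "N1": [12.0, 20.0, 28.0, 40.0],
    "N2": [15.0, 25.0, 35.0, 45.0],
    "N3": [20.0, 30.0, 40.0, 50.0],
}

all_incomes = [x for v in income.values() for x in v]
area_means = [fmean(v) for v in income.values()]

var_individual = pvariance(all_incomes)                      # individual-level variable
var_between = pvariance(area_means)                          # aggregated (contextual) variable
var_within = fmean([pvariance(v) for v in income.values()])  # lost on aggregation

print(round(var_individual, 2), round(var_between, 2), round(var_within, 2))
```

With equal-sized areas the variance of individual income is the sum of the other two quantities, so the variance of mean neighbourhood income can never exceed that of individual income and will typically be much smaller.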

Precision of Estimates and Study Design

Since there will always be fewer contexts than lower level units (individuals), contextual effects will be estimated with less precision. If the estimation of contextual effects is an essential part of your research, then this should be taken into account through the research design; an increase in the precision of the contextual effects will generally be achieved by increasing the number of higher level units (possibly at the expense of the number of lower level units included, as discussed in Chap. 3).
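The trade-off between the number of contexts and the number of individuals per context can be made concrete with a rough calculation. The sketch below assumes a balanced two-level linear random intercept model with a single contextual variable, for which the sampling variance of the contextual coefficient is approximately (σ²u0 + σ²e/n)/(J var(x2)), where J is the number of contexts and n the number of individuals in each. This is a simplification of the logistic models used in this chapter, and all values are hypothetical apart from the standard deviation of 0.20 borrowed from the neighbourhood social capital score.

```python
import math


def se_contextual_effect(n_areas, n_per_area, sigma2_u, sigma2_e, var_x2):
    """Approximate standard error of a level 2 coefficient (balanced linear model)."""
    residual_var_of_area_means = sigma2_u + sigma2_e / n_per_area
    return math.sqrt(residual_var_of_area_means / (n_areas * var_x2))


# HYPOTHETICAL variance components; the contextual variable has sd 0.20.
SIGMA2_U, SIGMA2_E = 0.2, 1.0
VAR_X2 = 0.20 ** 2

# Three designs with the same total sample size of 2000 individuals.
designs = [(20, 100), (100, 20), (200, 10)]
standard_errors = [
    se_contextual_effect(j, n, SIGMA2_U, SIGMA2_E, VAR_X2) for j, n in designs
]

for (j, n), se in zip(designs, standard_errors):
    print(j, n, round(se, 3))
```

For a fixed total sample size, the standard error of the contextual effect under these assumptions falls steadily as individuals are spread over more contexts.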

Selection Bias

If the individuals sampled for or otherwise included in a study are not representative of the population, as they would be expected to be under random sampling, then the study is said to suffer from selection bias. The concern is that the association between a variable of interest and the outcome in the analytical sample differs from that seen in the eligible population (Hernán et al. 2004). In a multilevel study, particularly when we are interested in estimating contextual effects, the potential for selection bias exists at all levels of the model. We therefore have to consider representativeness at all levels (not just at the individual level) and should report response levels and any consideration of bias at all levels.

Confounding

Confounding occurs when one variable is associated with a key variable (such as the exposure of interest) and also influences the outcome. Contextual factors may suffer from both within-level confounding (confounding by other contextual factors) and cross-level confounding (confounding by individual characteristics). It is also possible that a contextual variable will confound the relationship between an individual-level variable and the outcome. The solution to the presence of such confounding variables is generally to adjust adequately for such variables in the analysis (Royston et al. 2006).

Information Bias

The estimation of contextual effects may be affected both by misclassification or mismeasurement of the contextual variable and by the incorrect assignment of individuals to the contexts. If either occurs in a systematic way, then there is a potential for biased results. Whilst misclassification and mismeasurement issues are also present for individual-level variables, the incorrect assignment of individuals to contexts introduces further potential for bias, particularly in the case when contextual variables are subsequently created by aggregating individual variables to their (incorrectly assigned) contexts.

Model Specification

The exact specification of the multilevel model may influence the estimation of contextual effects for several reasons. The contexts used may impact on the magnitude of the effect detected (with smaller areas more closely approximating individual circumstances) but may also be important in terms of the mechanism through which the contextual variable operates (e.g. with areas defined by political or other administrative boundaries). Cross-level effect modification and indirect cross-level effects are often overlooked; the presence of a cross-level interaction, for example, may mean that the interpretation of a contextual effect depends on the circumstances of the individual. The nature of a contextual effect may be complex and may not be linear. It is therefore important to consider different functional forms or multiple categories for the contextual effects although the lack of variation in the contextual variable noted above, and in some cases a restricted number of contexts, may make this difficult. Finally, multicollinearity is likely to be more problematic for contextual variables than for individual variables which may in turn make it impossible to estimate independent effects for several contextual variables.

Conclusions

Both the characteristics of the individuals themselves (compositional factors) and those of the relevant contexts in which individuals operate (contextual factors) may influence individual outcomes. In order to be able to judge the importance of contextual variables, it is important that full and appropriate adjustment has been made for potential differences in composition between the higher level units. Multilevel analysis provides a useful tool to explore the impact of compositional and contextual factors, and the interpretation of potentially complex models can be aided by relatively simple figures. The analysis of contextual effects can introduce a further dimension of complexity into regression modelling.