Correction to: Canadian Studies in Population

https://doi.org/10.1007/s42650-022-00062-6

The original version of this article unfortunately contained mistakes.

The Appendices were missing in the published paper. The missing Appendices are shown below.

1 APPENDIX A – DESCRIPTION OF MULTILEVEL MODELLING

A regression pooling all clusters would omit cluster-level variations in the estimation of individual effects. As an alternative, one could estimate the effects of individual variables separately for each cluster, but splitting the sample this way would be highly inefficient if individual-level effects tend to have a similar influence on the outcome in all clusters. It would also be problematic when the number of clusters is high, as in the case of reserves, or when some clusters have small sample sizes. By contrast, multilevel models offer a compromise between these two options by borrowing information across clusters for more robust estimation of individual-level effects. Adding higher-level effects also helps to correct for heterogeneity shrinkage, the underestimation of coefficients caused by unobserved heterogeneity (Allison 1999).

A logistic regression model with a single explanatory variable that ignores the higher-level variable (cluster) can be written as follows:

$$ln\left(\frac{P\left({Y}_{ij}=1\right)}{1-P\left({Y}_{ij}=1\right)}\right)=logit \left(P\left({Y}_{ij}=1\right)\right)={\beta }_{0}+{\beta }_{1}{x}_{ij}$$

where \(P\left({Y}_{ij}=1\right)\) is the conditional probability that the outcome of interest \({Y}_{ij}\) occurs for individual \(i\) in cluster \(j\), given a specific value of the predictor variable \({x}_{ij}\), and \({\beta }_{0}\) and \({\beta }_{1}\) are estimated coefficients.
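To make the link between coefficients and probabilities concrete, here is a minimal Python sketch of the pooled model. The function names and the coefficient values are illustrative assumptions, not estimates from the article.

```python
import math

def logit(p):
    """Log-odds of a probability p, with 0 < p < 1."""
    return math.log(p / (1.0 - p))

def inv_logit(eta):
    """Inverse of the logit: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))

def pooled_probability(beta_0, beta_1, x_ij):
    """P(Y_ij = 1) under the pooled model logit(P) = beta_0 + beta_1 * x_ij."""
    return inv_logit(beta_0 + beta_1 * x_ij)

# Hypothetical coefficients, chosen only for illustration.
beta_0, beta_1 = -1.0, 0.5
p_at_0 = pooled_probability(beta_0, beta_1, 0.0)
p_at_1 = pooled_probability(beta_0, beta_1, 1.0)

# A one-unit increase in x multiplies the odds by exp(beta_1).
odds_ratio = math.exp(beta_1)
```

In this model the odds ratio for a one-unit change in \({x}_{ij}\) is the same for every individual, whatever their cluster.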

By contrast, taking the cluster into account in a varying-intercept model yields the following equation:

$$logit \left(P\left({Y}_{ij}=1\right)\right)={\beta }_{0j}+{\beta }_{1}{x}_{ij}$$

where \({\beta }_{0j}\) equals

$${\beta }_{0j}={\upgamma }_{00}+{u}_{0j}$$

with \({\upgamma }_{00}\) being the average (constant) intercept and \({u}_{0j}\) the cluster-specific deviation from \({\upgamma }_{00}\) (residual error terms at the cluster level). Note that the \({u}_{0j}\) are random variables assumed to follow a Gaussian distribution with an expected value of 0: \({u}_{0j} \sim \mathcal{N}\left(0,{\tau }_{0}^{2}\right)\). What is of interest here is not the specific values of \({u}_{0j}\), but their variance (the varying-intercept variance), \({\tau }_{0}^{2}=Var\left({u}_{0j}\right)\). Finally, \({\beta }_{1}\) is the constant regression coefficient (and is often denoted as \({\upgamma }_{10}\)).
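The varying-intercept structure can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names, the seed and all parameter values are hypothetical, and the code simulates intercepts rather than estimating a model.

```python
import math
import random

def inv_logit(eta):
    """Maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))

def draw_cluster_intercepts(gamma_00, tau_0, n_clusters, seed=0):
    """Draw beta_0j = gamma_00 + u_0j, with u_0j ~ N(0, tau_0^2)."""
    rng = random.Random(seed)
    return [gamma_00 + rng.gauss(0.0, tau_0) for _ in range(n_clusters)]

def cluster_probability(beta_0j, beta_1, x_ij):
    """P(Y_ij = 1) under logit(P) = beta_0j + beta_1 * x_ij."""
    return inv_logit(beta_0j + beta_1 * x_ij)

# Hypothetical values: average intercept -1, cluster-level SD 0.8, slope 0.5.
intercepts = draw_cluster_intercepts(gamma_00=-1.0, tau_0=0.8, n_clusters=5)

# Same individual characteristics (x = 1), different clusters.
probabilities = [cluster_probability(b, 0.5, 1.0) for b in intercepts]
```

Individuals with the same value of \({x}_{ij}\) obtain different predicted probabilities depending on their cluster, while the slope is shared by all clusters.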

A specific model of interest is the intercept-only model, containing no explanatory variables:

$$logit \left(P\left({Y}_{ij}=1\right)\right)={\upgamma }_{00}+{u}_{0j}$$

  

From this model, the intraclass correlation coefficient (ICC) can be computed:

$$ICC=\frac{{\tau }_{0}^{2}}{{\tau }_{0}^{2}+\left({\pi }^{2}/3\right)}$$

The ICC measures the proportion of the total variance that is attributable to variation at the cluster level (Hox 1995). Its value ranges from 0 to 1. The term \({\pi }^{2}/3\) is the level-1 variance implied by the standard logistic distribution. It is constant, since logistic regression models do not include level-1 residuals (Sommet and Morselli 2017).
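As a minimal sketch, the ICC formula above translates directly into Python. The function name and the example variance are assumptions made for illustration; the variance would in practice come from a fitted intercept-only model.

```python
import math

def icc_logistic(tau_0_squared):
    """ICC for a varying-intercept logistic model.

    tau_0_squared is the varying-intercept variance; the level-1
    variance is fixed at pi^2 / 3 (standard logistic distribution).
    """
    if tau_0_squared < 0:
        raise ValueError("a variance cannot be negative")
    level_1_variance = math.pi ** 2 / 3.0
    return tau_0_squared / (tau_0_squared + level_1_variance)

# Hypothetical cluster-level variance of 1.0.
example_icc = icc_logistic(1.0)
```

With a cluster-level variance of 1.0, a little under a quarter of the total variance lies between clusters. With no between-cluster variance the ICC is 0.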

Another useful indicator of between-cluster variation, proposed by Larsen et al. (2000), is the median odds ratio (MOR):

$$MOR=exp\left(\sqrt{2{\widehat{\tau }}^{2}}\times {\Phi }^{-1}\left(0.75\right)\right)$$

where \({\widehat{\tau }}^{2}\) is the estimated variance of the varying effects and \({\Phi }^{-1}\left(0.75\right)\) is the 75th percentile of the standard normal distribution. Because it is expressed on the odds ratio scale, the MOR can be compared directly with the odds ratios of the constant effects in the model. The MOR can be understood as follows: in repeated comparisons of two individuals with identical characteristics but picked randomly from different clusters, the odds ratio comparing the individual with the higher risk of the outcome with the one with the lower risk would be higher than the MOR half of the time, and lower than the MOR half of the time (Austin and Merlo 2017).
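The MOR formula can be sketched in Python using only the standard library. The function name and the example variance are illustrative assumptions; `statistics.NormalDist().inv_cdf` supplies the standard normal quantile.

```python
import math
from statistics import NormalDist

def median_odds_ratio(tau_squared):
    """MOR = exp(sqrt(2 * tau^2) * Phi^{-1}(0.75))."""
    if tau_squared < 0:
        raise ValueError("a variance cannot be negative")
    z_75 = NormalDist().inv_cdf(0.75)  # 75th percentile, about 0.6745
    return math.exp(math.sqrt(2.0 * tau_squared) * z_75)

# Hypothetical cluster-level variance of 1.0.
example_mor = median_odds_ratio(1.0)
```

A variance of 0 gives a MOR of 1, meaning that the cluster makes no difference to the odds of the outcome. Larger variances push the MOR above 1.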

2 APPENDIX B – DESCRIPTION OF THE COMMUNITY WELL-BEING INDEX AND THE REMOTENESS INDEX

Two indexes are included as “contextual” variables in the regression models: the Community Well-Being Index and the remoteness index. Here is a brief description of both indexes:

2.1 Community Well-Being Index

The Community Well-Being (CWB) Index measures socioeconomic well-being for individual communities (census subdivisions) across Canada. It takes into account four components: education, labour force activity, income and housing. These four components are measured with the help of seven variables:

  1. proportion of the population aged 20 years and older with a high school diploma
  2. proportion of the population aged 25 years and older with a university degree
  3. labour force participation among the population aged 20 to 64 years
  4. employment rate among the population aged 20 to 64 years
  5. income per capita
  6. proportion of the population living in an uncrowded dwelling
  7. proportion of the population living in a dwelling not in need of major repairs.

The CWB score can vary between 0 and 100. A value of 0 means a very low level of community well-being, while a score of 100 means a very high level of well-being. More information about the CWB index can be found in Indigenous Services Canada (2019).

2.2 Remoteness index

The remoteness index, or index of remoteness of community, is an indicator of the geographic proximity of a community (at the census subdivision level) to service centres and population centres. The score of the index can vary between 0 and 1. A value of 0 means that the community is very close to large agglomerations, while a score of 1 means that the community is very isolated. Please refer to Alasia et al. (2017) for more details.

The year for the reference Lee, E. S. should be 1966 instead of 2016. The corrected reference is shown below.

Lee, E. S. (1966). A theory of migration. Demography, 3, 47–57. https://doi.org/10.2307/2060063

The original article has been corrected.