Youth substance use surveillance and prevention efforts in North America (Government of Canada, (n.d.); Johnston et al., 2016) and globally (Kraus et al., 2015; Ahumada et al., 2019) typically focus on single substance use, potentially overlooking important differences among adolescent users (Zapert, Snow & Tebes, 2002). Recent evidence from a large sample of Canadian high school students revealed that in 2017/18, 18% reported using two or more substances, 16% reported using only one substance, and 61% reported no substance use in the past month (Zuckermann et al., 2020). Notably, the number of individuals using multiple substances has increased, possibly influenced by the rising popularity of e-cigarette use (Zuckermann et al., 2019) and the high prevalence of cannabis use among youth. Alongside the increasing availability of non-medical cannabis products and their legalization in Canada in October 2018 (Zuckermann et al., 2021a; Zuckermann et al., 2021b; Haines-Saah & Fischer, 2021), studies have reported that ever use of cannabis among youth rose from 30.5% in 2016/17 to 32.4% in 2018/19 (Zuckermann et al., 2021a).

It is increasingly evident that e-cigarettes must be considered in any discussion of general substance use patterns among youth (Zuckermann et al., 2020; Rothrock et al., 2020; Hughes et al., 2015; Thepthien et al., 2021; Mehra et al., 2019). Despite being a relatively new product, e-cigarettes have rapidly gained popularity among youth, potentially contributing to increased substance use in this population (Zuckermann et al., 2020; Zuckermann et al., 2019; Rothrock et al., 2020; Hughes et al., 2015; Thepthien et al., 2021; Mehra et al., 2019).

While previous research has identified different patterns of youth substance use, many studies have focused mainly on alcohol, cigarette, and cannabis use due to their widespread occurrence. For instance, a study involving Canadian youth between the ages of 12 and 18 in Victoria, British Columbia, identified three patterns: no or low use (63%), concurrent use of alcohol and cannabis (23%), and poly-use including cigarettes, alcohol, cannabis, and other illicit drugs (11%) (Merrin & Leadbeater, 2018). Recent research has highlighted polysubstance use (PSU) patterns involving concurrent and multiple uses of e-cigarettes along with other substances, underscoring the need to consider e-cigarettes in examining multiple substance use (Zuckermann et al., 2020; Rothrock et al., 2020; Hughes et al., 2015; Thepthien et al., 2021; Mehra et al., 2019).

Understanding the factors contributing to PSU patterns among youth is crucial for assessing health risks, identifying intervention opportunities, and evaluating existing policies and practices. Various approaches, including family and peer support, policies, and school interventions, can address the increasing trend of youth PSU. However, limited research exists on the factors influencing the initiation of PSU patterns among youth. Therefore, this study aims to investigate PSU patterns in a large longitudinal sample of Canadian youth and explore associated factors using a latent variable modeling approach. PSU, in this study, refers to the use of cigarettes, e-cigarettes, alcohol, and cannabis. By examining these factors, we aim to gain insights that can inform effective prevention and intervention strategies targeting youth PSU. For the purpose of this study, we specifically focus on a more restricted age range, targeting mid-adolescents in grades 9 and 10. Mid-adolescence is a critical period marked by increased risk-taking behaviors, making it particularly crucial to examine risk factors associated with substance use during this developmental stage (Government of Canada (n.d.); Hale & Viner, 2016).

Methods

Participants and Setting

The COMPASS study is a longitudinal health survey that collects self-reported student- and school-level data on youth health behaviors from a large, convenience sample of Canadian secondary school students every year starting in 2012/13 (Leatherdale et al., 2014; Reel, Bredin & Leatherdale, 2018). The COMPASS questionnaire (Cq) includes demographic and personal information and students’ responses to multiple choice questions about their physical characteristics, behaviors, attitudes towards health and wellness, and academic performance gathered annually (Leatherdale et al., 2014). Participating schools use a separate questionnaire to provide information about school policies and practices related to their students’ health behaviors. Additionally, school-level socio-economic status, urbanity, and built environment are collected as supplementary community-level data. Students must actively assent to participate in the COMPASS study, and parental/guardian consent is required using active-information passive-consent protocols (Thompson-Haile & Leatherdale, 2013). The COMPASS study has been granted ethics clearance by the University of Waterloo Office of Research Ethics (ORE 30118).

Data Collection and Preprocessing

For this retrospective cohort study, we used 3 years of linked longitudinal data from the Cq (collected in 2016/17, 2017/18, and 2018/19) before the outbreak of the COVID-19 pandemic. This study included 9307 Canadian students in grades 9 and 10 (including students at secondary I and II in Quebec) in 2016/17 from 76 secondary schools in Alberta, British Columbia, Ontario, and Quebec and followed up through 2018/19. The guidelines of Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) were adhered to in this study.

To prepare the data for analysis, we performed several preprocessing steps, such as cleaning, linking, merging, and analyzing missing data. The flowchart of data preprocessing steps with detailed descriptions was published in another paper (Yang et al., 2022).

Of the 9307 linked samples, the present analyses were limited to 8824 students with regular patterns of advancement from one grade to the next in each school year. In 2016/17, about half (49.4%) of the students were in grade 9, 33.4% were in grade 10, and 15.3% were in grades 7–8 (secondary I–II) in Quebec. Approximately three-quarters were White, 53.6% of the students were female, and 67.2% were from Ontario. Fifty-four percent lived in large urban areas, 31.3% lived in small urban areas, and 46.8% reported household incomes between $50K and $75K. The descriptive statistics of the linked samples at three annual waves were published in another paper (Yang et al., 2022).

Measures

The Cq was used to assess substance use, including cigarette, e-cigarette, alcohol, and cannabis use. The Cq includes two questions about cigarette and e-cigarette smoking to determine the prevalence and frequency of use of these substances. The first question asks if the students have tried smoking cigarettes (or e-cigarettes) to distinguish between never use and ever use. A follow-up question asks about the last 30-day use of cigarettes and e-cigarettes, with the responses ranging from “none” to “30 days (every day).” The Cq includes specific measures to assess the consumption of alcohol and cannabis in the last 12 months, with the responses ranging from “never use” to “every day.” The ordinal responses were collapsed into three-category indicators to avoid the sparseness of the observed frequency table. Following the most common categorization for determining the patterns of youth substance use, the initial responses were categorized into “never use,” “occasional use,” and “current use.”
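The collapsing step described above can be sketched in a few lines of Python. This is only an illustration: the source does not state the exact cut-points, so the rule used here (never tried = never use; tried but below a threshold of past-30-day use = occasional use; at or above the threshold = current use) and the parameter `current_min_days` are hypothetical assumptions, not the COMPASS coding scheme.

```python
NEVER, OCCASIONAL, CURRENT = 0, 1, 2
LABELS = {NEVER: "never use", OCCASIONAL: "occasional use", CURRENT: "current use"}


def collapse_use(ever_tried: bool, days_used_last_30: int, current_min_days: int = 1) -> int:
    """Collapse an ever-use flag and a past-30-day count into three categories.

    `current_min_days` is a hypothetical cut-point chosen for illustration;
    it is not taken from the COMPASS documentation.
    """
    if not 0 <= days_used_last_30 <= 30:
        raise ValueError("days_used_last_30 must be between 0 and 30")
    if not ever_tried:
        # Respondents who never tried the substance are coded as never use,
        # regardless of the (inconsistent) day count they may have entered.
        return NEVER
    if days_used_last_30 >= current_min_days:
        return CURRENT
    return OCCASIONAL


# Three illustrative respondents (made-up answers).
examples = [(False, 0), (True, 0), (True, 5)]
coded = [LABELS[collapse_use(tried, days)] for tried, days in examples]
print(coded)
```

Keeping the cut-point as a parameter makes it easy to check how sensitive the resulting categories are to the chosen definition of current use.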

In addition to the substance use measures, this study examined a set of risk and protective factors related to substance use among youth, encompassing a range of individual, social, and environmental factors that have been previously linked to youth substance use. The specific risk factors analyzed in the model included, but were not limited to, the number of smoking friends, the number of classes skipped, and the amount of weekly allowance. Protective factors, such as school connectedness and mental health (as measured by the Centre for Epidemiologic Studies Depression Scale (CES-D)), were also examined in the model. Please refer to the supplemental materials for a comprehensive list and detailed descriptions of all factors included in the model.

Analysis

Before constructing the multivariate latent Markov models (LMM) to examine different patterns of substance use among youth, we applied the least absolute shrinkage and selection operator (LASSO) regression (Tibshirani, 1996), a linear regression method that performs both variable selection and regularization to prevent overfitting. In this study, LASSO regression was applied to select a subset of covariates from the dataset. The goal was to identify the most relevant risk and protective factors associated with youth substance use. The LASSO regression model was implemented using an adaptive version of the coordinate descent fitting algorithm with an ordinal response (Archer, 2015). This approach is suitable for dealing with an ordinal response variable, which was used to categorize substance use patterns.
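To make the selection mechanism concrete, the following is a minimal Python sketch of LASSO fitted by plain coordinate descent for a continuous response. It is a simplification: the study used an adaptive, ordinal-response variant in R, which is not reproduced here. The toy data, the penalty value, and all function names are illustrative assumptions.

```python
def soft_threshold(value, penalty):
    """Shrink `value` toward zero by `penalty`; values inside the band become 0."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_coordinate_descent(X, y, penalty, n_iter=200):
    """Minimize (1/2n) * ||y - X b||^2 + penalty * ||b||_1.

    Assumes the columns of X and the response y are already centered.
    """
    n, p = len(y), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # residual = y - X @ beta, with beta = 0 initially
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the partial residual (column j added back).
            rho = sum(X[i][j] * residual[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new_beta = soft_threshold(rho, penalty) / col_sq[j]
            delta = new_beta - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_beta
    return beta


# Toy data: the response depends only on the first covariate.
X = [[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]]
y = [2.0, -2.0, 2.0, -2.0]
beta = lasso_coordinate_descent(X, y, penalty=0.5)
selected = [j for j, b in enumerate(beta) if b != 0.0]
print(beta, selected)
```

The L1 penalty sets the coefficient of the irrelevant covariate exactly to zero, which is the property that lets LASSO act as a variable selector: only covariates with non-zero coefficients are carried forward.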

After applying LASSO regression, we constructed the LMM to identify subgroups of youth exhibiting distinct patterns of substance use. LMM is a latent variable modeling technique for sequential data, such as repeated measurements over time. In this study, LMM was used to test the hypothesis that different subgroups of youth tend to exhibit different patterns of substance use over the 3-year longitudinal data. The LMM comprises a structural model and a measurement model. The structural model describes the latent (unobserved) substance use states through their initial state and transition probabilities, whereas the measurement model describes the manifest (observed) substance use indicators through their conditional response probabilities given the latent state. The LMM was built on the variables selected through the LASSO regression process. The optimal number of classes was selected using the Bayesian information criterion (BIC), which balances model fit against complexity. Additionally, goodness-of-fit measures were used to evaluate the accuracy of the fitted models.
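As a hedged illustration of BIC-based selection of the number of classes, the sketch below computes BIC = -2 log L + k log n for several candidate models and picks the smallest value. The sample size of 8824 comes from the text; the log-likelihoods and parameter counts are invented placeholders, not results from the study.

```python
import math


def bic(log_likelihood, n_parameters, n_observations):
    """Bayesian information criterion: lower values indicate a better trade-off."""
    return -2.0 * log_likelihood + n_parameters * math.log(n_observations)


def best_by_bic(candidates, n_observations):
    """candidates maps number of classes -> (log_likelihood, n_parameters)."""
    scores = {k: bic(ll, p, n_observations) for k, (ll, p) in candidates.items()}
    return min(scores, key=scores.get), scores


# Hypothetical fits: these log-likelihoods and parameter counts are made up.
hypothetical_fits = {
    2: (-30000.0, 20),
    3: (-29500.0, 35),
    4: (-29300.0, 52),
    5: (-29280.0, 71),
}
best_k, scores = best_by_bic(hypothetical_fits, n_observations=8824)
print(best_k, {k: round(v, 1) for k, v in scores.items()})
```

With these placeholder numbers the likelihood gain from a fifth class is too small to offset its extra parameters, which is the kind of trade-off the criterion is designed to capture.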

Only variables with non-zero coefficients after LASSO regression were retained for further analysis and model construction. Multiple imputation (MI) was performed simultaneously with the LMM fitting to address missing data, and coefficients were averaged over five imputed runs to obtain overall coefficients.
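The averaging over imputed runs can be sketched as follows. The pooled point estimate is the simple mean described in the text. The variance calculation uses Rubin's rules, a standard choice for multiply imputed data, but the source does not say how standard errors were pooled, so that part is an assumption. All numbers are made up.

```python
from statistics import mean, variance


def pool_estimate(estimates):
    """Pooled coefficient: the mean of the per-imputation estimates."""
    return mean(estimates)


def rubin_total_variance(estimates, within_variances):
    """Total variance under Rubin's rules (assumed here, not confirmed by the source).

    total = mean within-imputation variance + (1 + 1/m) * between-imputation variance
    """
    m = len(estimates)
    within = mean(within_variances)
    between = variance(estimates)  # sample variance with m - 1 in the denominator
    return within + (1.0 + 1.0 / m) * between


# Hypothetical coefficient of one covariate across five imputed runs.
estimates = [0.50, 0.52, 0.48, 0.51, 0.49]
within_variances = [0.01] * 5

pooled = pool_estimate(estimates)
total_var = rubin_total_variance(estimates, within_variances)
print(round(pooled, 3), round(total_var, 5))
```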

To assess the significance of covariates on the effect of class membership for the initial state probabilities, we employed Wald test statistics based on the parameter estimates and standard errors. All p-values were calculated using two-sided tests, with results having p < .05 considered statistically significant.
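A Wald test of a single coefficient can be sketched with the Python standard library alone. The function below assumes the usual large-sample normal approximation, z = estimate / standard error, with a two-sided p-value; the example estimate and standard error are hypothetical, and `odds_ratio` simply exponentiates a logit-scale coefficient.

```python
import math


def wald_test(estimate, std_error):
    """Return the Wald z statistic and its two-sided normal p-value."""
    if std_error <= 0:
        raise ValueError("std_error must be positive")
    z = estimate / std_error
    # 2 * (1 - Phi(|z|)) expressed through the complementary error function.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


def odds_ratio(estimate):
    """Convert a logit-scale coefficient to an odds ratio."""
    return math.exp(estimate)


# Hypothetical coefficient and standard error for one covariate.
z, p = wald_test(estimate=0.30, std_error=0.10)
print(round(z, 2), p < 0.05, round(odds_ratio(0.30), 2))
```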

Statistical analyses were conducted using the R language, open-source software for statistical computing and graphics (R Core Team, 2013). We used the LMest package (Bartolucci et al., 2015) for generalized LMMs. Detailed descriptions of the analysis and LMM construction were published in another paper (Yang et al., 2022).

Results

Prevalence of Substance Use

Overall, the prevalence of “never use” has decreased over time, particularly for alcohol drinking (from 60.6% in 2016/17 to 29.2% in 2018/19). Over 3 years, the prevalence of “occasional use” and “current use” has risen for all substances. In particular, the current use of e-cigarettes (from 6.8% in 2016/17 to 32.3% in 2018/19) and cannabis (from 4.4% in 2016/17 to 16.1% in 2018/19) has increased 4.8 and 3.7 times, respectively.
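The fold increases quoted above follow directly from the reported prevalences. A small Python check, using only the percentages given in the text, yields ratios of about 4.75 for e-cigarettes and 3.66 for cannabis, consistent with the rounded values of 4.8 and 3.7.

```python
def fold_change(start, end):
    """Ratio of the final prevalence to the initial prevalence."""
    if start <= 0:
        raise ValueError("start must be positive")
    return end / start


# Current-use prevalence (%) in 2016/17 and 2018/19, as reported in the text.
ecig_fold = fold_change(6.8, 32.3)
cannabis_fold = fold_change(4.4, 16.1)
print(f"e-cigarettes: {ecig_fold:.2f}x, cannabis: {cannabis_fold:.2f}x")
```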

PSU Patterns

Four patterns/classes of substance use were identified and summarized as follows: no-use (C1), alcohol-only (C2), concurrent use of e-cigarettes and alcohol (C3), and poly-use (C4). The final model classified each class based on the predominant conditional response probabilities of substance use. The conditional response probabilities were generally distinct, indicating a clear heterogeneity between the classes (see Fig. 1). At baseline (2016/17), the initial state probabilities of these use patterns were PC1 = 0.589, PC2 = 0.216, PC3 = 0.149, and PC4 = 0.047.
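To illustrate how initial state probabilities and conditional response probabilities combine, the sketch below applies Bayes' rule to a single wave of data under local independence. It is a simplification of the full LMM, which also uses transition probabilities across waves. The initial probabilities are those reported above; the conditional response probabilities are hypothetical placeholders, not the fitted values shown in Fig. 1.

```python
# Initial state probabilities reported in the text (2016/17).
INITIAL_PROBS = {"C1": 0.589, "C2": 0.216, "C3": 0.149, "C4": 0.047}

# Hypothetical P(response category | class) for two items; categories are
# 0 = never use, 1 = occasional use, 2 = current use. Values are invented.
CONDITIONAL_PROBS = {
    "C1": {"e-cigarette": [0.97, 0.02, 0.01], "alcohol": [0.95, 0.04, 0.01]},
    "C2": {"e-cigarette": [0.90, 0.07, 0.03], "alcohol": [0.10, 0.60, 0.30]},
    "C3": {"e-cigarette": [0.30, 0.45, 0.25], "alcohol": [0.14, 0.50, 0.36]},
    "C4": {"e-cigarette": [0.10, 0.15, 0.75], "alcohol": [0.05, 0.15, 0.80]},
}


def posterior_membership(responses):
    """Posterior class probabilities for one respondent at a single wave.

    posterior(c) is proportional to prior(c) * product over items of P(response | c).
    """
    joint = {}
    for cls, prior in INITIAL_PROBS.items():
        likelihood = 1.0
        for item, category in responses.items():
            likelihood *= CONDITIONAL_PROBS[cls][item][category]
        joint[cls] = prior * likelihood
    total = sum(joint.values())
    return {cls: value / total for cls, value in joint.items()}


# A respondent who never vaped but drinks occasionally.
posterior = posterior_membership({"e-cigarette": 0, "alcohol": 1})
most_likely = max(posterior, key=posterior.get)
print(most_likely, {c: round(v, 3) for c, v in posterior.items()})
```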

Fig. 1 Polysubstance use patterns at baseline (2016/17). The patterns are determined by the prominent conditional response probabilities. X-axis: four substance use measures (left to right: cigarette, e-cigarette, alcohol, cannabis); Y-axis: conditional response probability between 0 and 1 (as shown in the caption below the image in a table format, prominent conditional response probabilities are highlighted in bold font); and Z-axis: three categories of never use (0), occasional use (1), and current use (2) for any substance.

Class 1 (C1, No-Use). C1 corresponds to students who had not used substances, where the probability of never using any substance was > 94.7%. In 2016/17, C1 was the largest class, containing 60.5% of the students. Although C1 was predominant in 2016/17, its prevalence decreased substantially over time (60.5% ➔ 39.0% ➔ 25.0%; 2016/17 ➔ 2017/18: ΔC1 = −21.5%; 2017/18 ➔ 2018/19: ΔC1 = −14.0%).

Class 2 (C2, Alcohol-Only). Individuals in C2 typically used alcohol (probability of 89.6%), whereas the probabilities of never using each of the other three substances exceeded 85.5%. As the second largest class, comprising 20.9% of youth in 2016/17, its prevalence was relatively stable over time, increasing to 24.4% (ΔC2 = +3.5%) in 2017/18 and decreasing slightly to 22.3% (ΔC2 = −2.1%) by 2018/19.

Class 3 (C3, Concurrent Use of E-cigarettes and Alcohol). Individuals in C3 had high probabilities of using e-cigarettes and alcohol, 70.6% and 85.9%, respectively. The probabilities of using other substances in this class were not prominent. C3 was the second smallest class in 2016/17, containing 14.0% of students. Its prevalence increased substantially over time (14.0% ➔ 24.9% ➔ 32.5%; 2016/17 ➔ 2017/18: ΔC3 = +10.9%; 2017/18 ➔ 2018/19: ΔC3 = +7.6%), becoming the predominant use pattern by 2018/19.

Class 4 (C4, Poly-Use). Individuals in C4 differed from those in C3 by having a greater probability of using multiple substances concurrently. For instance, the conditional response probabilities of current e-cigarette, alcohol, and cannabis use were 74.5%, 79.5%, and 60.1%, respectively. Thus, C4 corresponds to the heavy/current multi-user group. Although C4 remained the smallest class throughout the 3 years, its prevalence increased 4.4-fold (4.6% ➔ 11.7% ➔ 20.2%; 2016/17 ➔ 2017/18: ΔC4 = +7.1%; 2017/18 ➔ 2018/19: ΔC4 = +8.5%), coming very close to C2 and C1 by 2018/19.

To provide clarity and facilitate labeling of the latent classes, we plotted the z-scores of each of the four substances across the four patterns/classes, C1 to C4 (Fig. 2).
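The class prevalences reported for C1 to C4 can be checked with a short Python sketch that recomputes the year-over-year changes and confirms that the four classes account for all students in each wave. Only the percentages quoted in the text are used; the variable names are illustrative.

```python
# Class prevalence (%) in 2016/17, 2017/18, and 2018/19, as reported in the text.
PREVALENCE = {
    "C1": [60.5, 39.0, 25.0],
    "C2": [20.9, 24.4, 22.3],
    "C3": [14.0, 24.9, 32.5],
    "C4": [4.6, 11.7, 20.2],
}


def year_over_year_changes(series):
    """Percentage-point change between consecutive waves, rounded to 1 decimal."""
    return [round(later - earlier, 1) for earlier, later in zip(series, series[1:])]


changes = {cls: year_over_year_changes(values) for cls, values in PREVALENCE.items()}

# Total prevalence across the four classes in each wave (should be 100%).
wave_totals = [round(sum(values[w] for values in PREVALENCE.values()), 1) for w in range(3)]

# Relative growth of the poly-use class over the whole period.
c4_fold = round(PREVALENCE["C4"][-1] / PREVALENCE["C4"][0], 1)

print(changes, wave_totals, c4_fold)
```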

Fig. 2 Z-scores for each substance use across the four classes (2016/17). X-axis: four patterns/classes, from left to right, C1 (no-use), C2 (alcohol-only), C3 (concurrent use of e-cigarettes and alcohol), and C4 (poly-use); Y-axis: Z-scores for each substance use

The Z-scores provide insights into the average substance use for each class compared to the overall average. For cigarette use, C4 shows above-average use, while C1, C2, and C3 have slightly below-average use. Similarly, C4 exhibits above-average e-cigarette use, C3 is slightly above the overall average, and C1 and C2 have slightly below-average use. For alcohol use, C1 has significantly below-average use, C2 and C3 are slightly above the overall average, and C4 has significantly above-average use. Regarding cannabis use, C4 demonstrates significantly above-average use, while C1, C2, and C3 have below-average use. This underscores the diversity in PSU patterns among youth, with distinct classes demonstrating different levels of engagement in cigarette, e-cigarette, alcohol, and cannabis use.

Associated Factors

We estimated the covariates’ effects on the initial state probabilities to investigate the factors associated with the diverse patterns of PSU among youth. Table 1 lists the coefficients of all factors with corresponding odds ratios (OR) affecting the initial state probabilities for each class membership under the final model.

Table 1 Factors associated with polysubstance use patterns at baseline (2016/17, REF: C1)

Overall, not gambling online for money (ORC2 = 0.22, 95 % CI −0.16 − 0.58; ORC3 = 0.14, 95 % CI −0.24 − 0.52; ORC4 = 0.08, 95 % CI −0.47 − 0.63), eating breakfast (ORC2 = 0.80, 95 % CI 0.66 − 0.94; ORC3 = 0.64, 95 % CI 0.47 − 0.80; ORC4 = 0.56, 95 % CI 0.25 − 0.87), residing in urban areas (ORC2 = 0.82, 95 % CI 0.75 − 0.90; ORC3 = 0.77, 95 % CI 0.68 − 0.85; ORC4 = 0.66, 95 % CI 0.51 − 0.81), having higher school connectedness (ORC2 = 0.95, 95 % CI 0.93 − 0.98; ORC3 = 0.91, 95 % CI 0.89 − 0.94; ORC4 = 0.82, 95 % CI 0.77 − 0.86), and being black (vs. white, ORC2 = 0.92, 95 % CI 0.88 − 0.97; ORC3 = 0.94, 95 % CI 0.89 − 0.99; ORC4 = 0.96, 95 % CI 0.88 − 1.03) were consistently protective: each was associated with lower odds (OR < 1) of initial membership in C2 through C4 relative to C1.

In contrast, risk factors associated with PSU patterns included truancy, as measured by the number of classes skipped (ORC2 = 1.67, 95 % CI 1.55 − 1.79; ORC3 = 1.92, 95 % CI 1.80 − 2.04; ORC4 = 2.79, 95 % CI 2.64 − 2.94), having more smoking friends (ORC2 = 1.35, 95 % CI 1.25 − 1.46; ORC3 = 1.81, 95 % CI 1.71 − 1.91; ORC4 = 2.75, 95 % CI 2.64 − 2.86), being older (ORC2 = 1.32, 95 % CI 1.24 − 1.40; ORC3 = 1.26, 95 % CI 1.17 − 1.34; ORC4 = 1.51, 95 % CI 1.37 − 1.65), attending a school unsupportive of quitting drugs/alcohol (ORC2 = 1.25, 95 % CI 1.17 − 1.34; ORC3 = 1.30, 95 % CI 1.21 − 1.39; ORC4 = 1.43, 95 % CI 1.27 − 1.60), receiving a larger weekly allowance (ORC2 = 1.14, 95 % CI 1.08 − 1.20; ORC3 = 1.29, 95 % CI 1.22 − 1.36; ORC4 = 1.34, 95 % CI 1.21 − 1.46), having elevated BMI (ORC2 = 1.22, 95 % CI 1.16 − 1.28; ORC3 = 1.20, 95 % CI 1.13 − 1.27; ORC4 = 1.28, 95 % CI 1.16 − 1.40), having more physically active friends (ORC2 = 1.21, 95 % CI 1.17 − 1.26; ORC3 = 1.18, 95 % CI 1.13 − 1.23; ORC4 = 1.10, 95 % CI 1.01 − 1.19), and having longer sedentary time (ORC2 = 1.02, 95 % CI 1.01 − 1.04; ORC3 = 1.07, 95 % CI 1.06 − 1.08; ORC4 = 1.07, 95 % CI 1.04 − 1.09). These factors were consistently associated with higher odds (OR > 1) of initial membership in C2 through C4 relative to C1.

Sex (male vs. female as the reference group, ORC2 = 0.74, 95 % CI 0.60 − 0.89; ORC3 = 1.34, 95 % CI 1.18 − 1.50; ORC4 = 0.80, 95 % CI 0.51 − 1.10) had mixed effects on the initial membership in C2 through C4 relative to C1: compared with females, males had lower odds of starting in C2 and C4 but higher odds of starting in C3. Except for race/ethnicity, all the factors had significant effects (p < .05) for each class.

Discussion

PSU Patterns

Our study, utilizing multivariate LMM modeling on longitudinal health surveys (COMPASS data), reveals four distinct patterns of PSU among Canadian adolescents: no-use (C1), alcohol-only (C2), concurrent use of e-cigarettes and alcohol (C3), and poly-use (C4). Each pattern represents an overarching theme, increasing severity by class. Generally, these patterns are well-separated, with considerably different sizes and good heterogeneity between classes. The longitudinal evidence on PSU patterns suggests an increasing tendency and frequency of youth substance use over time, demonstrating a successful application of LMM. In this study, we focus on identifying youth PSU patterns and associated factors using LMM modeling technique for unobserved heterogeneity analysis.

Specifically, we observed that C4 consistently demonstrated above-average use of cigarettes, e-cigarettes, alcohol, and cannabis, while C1, C2, and C3 were generally at or below average, apart from slightly above-average alcohol use in C2 and C3 and e-cigarette use in C3. The higher levels of substance use observed in C4 are in line with previous research on multiple substance use among adolescents. Studies have shown that individuals with use profiles comparable to C4, often referred to as “multi-substance users,” are at higher risk of experiencing adverse health outcomes and psychosocial problems compared to those in other classes. This finding is consistent with the notion that poly-use may indicate more severe or problematic substance use behaviors (Brown et al., 2015; Stockings et al., 2016).

The generally below-average use of most substances in C1, C2, and C3 suggests that these groups may be engaging in less frequent or lower-risk substance use behaviors. C1, which exhibits the lowest levels of substance use, may represent a group of individuals who abstain from substance use altogether or have minimal exposure to such behaviors. C2 and C3 may include individuals who engage in substance use, but not at levels considered to be problematic or high risk. These findings align with previous research that identifies distinct subgroups of youth with varying levels of substance involvement, ranging from non-use to moderate or occasional use (Tomczyk et al., 2015; Lee et al., 2020).

According to a systematic review of 70 studies of youth substance use patterns (Halladay et al., 2020), the average number of use patterns was four, including low use, single or dual use, moderate multi-use, and high multi-use. Our research identifies four patterns (C1 to C4) using novel analytical methods, aligning with their findings. Concerning e-cigarettes and other substance use among youth and young adults (Rothrock et al., 2020; Hughes et al., 2015; Thepthien et al., 2021; Mehra et al., 2019), studies identify the concurrent use of e-cigarettes with alcohol, cannabis, and tobacco products. Notably, alcohol and e-cigarette co-use among adolescents has been reported in North America and other countries (Rothrock et al., 2020; Hughes et al., 2015; Thepthien et al., 2021). Our results support the evidence of dual and multi-use of e-cigarettes with other substances, highlighting the importance of considering e-cigarettes in the analysis of substance use patterns. Moreover, it is alarming that the current use of e-cigarettes increased 4.8-fold over time, from 6.8% in 2016/17 to 32.3% in 2018/19.

Associated Factors

Evidence suggests that factors such as sex, race, early onset of alcohol drinking, and academic achievement in secondary school can influence youth substance use patterns (Lanza, Patrick & Maggs, 2010). Lanza et al. (2010) found that, when examining alcohol, cigarette, and cannabis use, males are 4.5 times more likely to be in the highest use group than females relative to non-users. In contrast, females are more likely to smoke cigarettes or binge drink than their male peers. Several studies among US youth have found that female students are at a higher risk of using multiple substances (Silveira et al., 2019; Cranford, McCabe & Boyd, 2013). Similarly, our study reveals sex-related differences in substance use patterns among youth. We found that in 2016/17, relative to C1, females were 1.35 times more likely than males to start in the alcohol-only class (C2), while males were 1.34 times more likely than females to start in concurrent e-cigarette and alcohol use (C3). Additionally, females were 1.25 times more likely than males to start in poly-use (C4). Our findings highlight the complex interplay between sex and substance use behaviors, emphasizing the need for tailored interventions and prevention strategies to address the specific needs of male and female adolescents.
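The female-versus-male figures in this paragraph are the reciprocals of the male-versus-female odds ratios reported in the Results (0.74, 1.34, and 0.80 for C2, C3, and C4). A short Python sketch of that conversion, using only those reported values:

```python
def invert_odds_ratio(odds_ratio):
    """Swap the reference group: OR(A vs. B) = 1 / OR(B vs. A)."""
    if odds_ratio <= 0:
        raise ValueError("odds_ratio must be positive")
    return 1.0 / odds_ratio


# Male vs. female odds ratios for initial class membership, relative to C1.
MALE_VS_FEMALE = {"C2": 0.74, "C3": 1.34, "C4": 0.80}

female_vs_male = {
    cls: round(invert_odds_ratio(value), 2) for cls, value in MALE_VS_FEMALE.items()
}
print(female_vs_male)
```

An odds ratio below 1 for males therefore corresponds to an odds ratio above 1 for females, which is why females appear more likely to start in C2 and C4 while males appear more likely to start in C3.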

Our study also examined the association between race/ethnicity and substance use patterns among youth. In 2016/17, students who reported being black had 0.92 times the odds of starting in the alcohol-only (C2) subgroup relative to the no-use (C1) subgroup, compared to their white counterparts. Similarly, black students had 0.94 times the odds of starting in the concurrent use of e-cigarettes and alcohol (C3) subgroup and 0.96 times the odds of starting in the multi-use (C4) subgroup, relative to the no-use (C1) subgroup. These findings are consistent with prior research, which has also shown that black students are less likely to engage in multiple substance use compared to white students (Connell, Gilreath & Hansen, 2009; Gilreath et al., 2014). Additionally, studies focusing on Asian, Hispanic, and other ethnic students have consistently found lower risk of using more than three substances (Zuckermann et al., 2020; Lanza, Patrick & Maggs, 2010; Banks et al., 2017; Keyes et al., 2022). Our results demonstrate similar protective effects, indicating that Latin American/Hispanic students are less likely to engage in higher-use patterns compared to Asians, who, in turn, have a lower likelihood of engaging in higher-use patterns than black students. These findings highlight the importance of considering race/ethnicity as a potential factor in understanding substance use disparities among youth and may help inform targeted prevention efforts and interventions to address substance use behaviors in different racial and ethnic groups.

In addition to sex and race/ethnicity differences, we contribute to the addiction literature by examining a multifaceted set of factors and their estimated effects. To our knowledge, no other research includes all the variables investigated in our study to evaluate the association with the initial membership of PSU patterns, summarizing the positive, negative, or mixed effects. For example, peer effect has been identified to correlate with multi-use (Tomczyk, Isensee & Hanewinkel, 2016; Tomczyk, Hanewinkel & Isensee, 2015). Our results show that students with more smoking friends are 2.75 times more likely to start in C4 than their peers who reported fewer smoking friends relative to C1. Likewise, students with more physically active friends were 1.21 times more likely to start in C2 relative to C1, demonstrating the significant impact of peer influence on youth substance use. Older students are more likely to use multiple substances (Zuckermann et al., 2020; Zuckermann et al., 2019). Some studies have found that students who are in higher-use classes are more likely to have access to more spending money (Zuckermann et al., 2020; Valente, Cogo-Moreira & Sanchez, 2017), which is consistent with our results that having more weekly allowance is associated with engaging in a higher-use class (ORC2 = 1.14, 95 % CI 1.08 − 1.20; ORC3 = 1.29, 95 % CI 1.22 − 1.36; ORC4 = 1.34, 95 % CI 1.21 − 1.46). Our results suggest that students with elevated BMI are 1.28 times more likely to start in C4 than C1. In addition to these individual-level factors, environmental factors such as urbanity and school policies supporting anti-smoking/drugs are important. Our study reveals that students attending a school unsupportive of quitting drugs/alcohol are 1.43 times more likely to start in C4 than those attending a supportive school relative to C1. Our results agree with Silveira et al. (2019), who reported that youth living in non-urban areas are more likely to engage in multi-use of substances.

Underage gambling, a problematic behavior, is positively associated with substance use (Barnes et al., 2009). Our study reveals that students who do not gamble online are less likely (ORC4 = 0.08) to start in C4, relative to C1, than those who do. Concerning the correlation between youth substance use and health behaviors, e.g., attitudes towards nutrition, Isralowitz and Trostler (1996) found that substance users are more likely to have unhealthy eating habits, such as skipping breakfast or not eating three meals a day, which puts them at a higher risk for negative health outcomes. In our results, eating breakfast, a proxy for healthy behaviors, showed a protective effect against membership in a higher-use class. Low social connectedness (White, Walton & Walker, 2015), e.g., school connectedness or engagement, is a risk factor for multi-use among youth. Some studies have shown that having a lower level of school connection is associated with an increased likelihood of multi-use of substances, consistent with our findings (Zuckermann et al., 2020; Jongenelis et al., 2019).

This study investigates cross-sectional and longitudinal evidence on PSU patterns from multi-dimensional COMPASS data. The factors contributing to youth substance use are complex, involving individual characteristics, peer influence, and environmental components. The diversity of student characteristics, modifiable individual-level risk factors, and the influence of family, friends, school environment, and community-level factors can all contribute to the heterogeneity of substance use patterns. Factors such as eating habits (Isralowitz & Trostler, 1996), gambling behaviors (Barnes et al., 2009), BMI (Hammami et al., 2019), weekly allowances (Zuckermann et al., 2020; Valente, Cogo-Moreira & Sanchez, 2017), peer influence (Tomczyk, Isensee & Hanewinkel, 2016; Tomczyk, Hanewinkel & Isensee, 2015), truancy (Henry & Thornberry, 2010), school programs/policies supporting anti-drugs/alcohol, and school connectedness (White, Walton & Walker, 2015; Su, Supple & Kuo, 2018) are modifiable, so multi-dimensional interventions can be tailored to address these risk factors collectively. In comparison, grade/age and sex are non-modifiable, so interventions can target at-risk groups, such as students in higher grades and females, who tend to be more likely to engage in alcohol and multi-use. Differences in substance use patterns at the population level may significantly affect the mental and behavioral health of youth. This can be further conceptualized as the ability of different school settings to prevent addictive behaviors.

Implications

Our study findings have significant implications for public health and health policy, providing insights that can guide interventions and prevention strategies to reduce the prevalence of substance use behaviors among youth. Understanding the use patterns and associated risk factors can help identify at-risk groups and tailor interventions to address the specific needs of different subgroups. This targeted approach is crucial for reducing the negative consequences of substance abuse and addiction, benefiting both individual youth and society as a whole. Decision makers should consider the diverse associations between substance use patterns and multiple health-related behaviors to design school environments, policies, and procedures that improve the health behaviors of youth.

The variation in PSU patterns observed in our study highlights the importance of considering heterogeneity among youth when developing prevention and intervention strategies. Tailored approaches that address the unique risk profiles and needs of each class can be more effective in reducing substance use and associated harms (Johnston et al., 2016; Ramo et al., 2012). Our findings emphasize the importance of targeted and evidence-based prevention and intervention strategies to address the unique needs of each subgroup. For instance, preventive efforts targeting C4 individuals may focus on harm reduction strategies, early intervention, and providing support for those at higher risk of experiencing substance-related problems.

Moreover, this study underscores the significance of early detection and screening for substance use behaviors among youth. Identifying and addressing substance use patterns at an early stage can prevent escalation to more problematic use and related negative consequences (Volkow & Baler, 2014). Schools, healthcare providers, and community organizations play a crucial role in offering preventive measures and support to adolescents, especially those identified as being at higher risk of engaging in multiple substance use.

Demographic differences affected the initial state probabilities of class membership at baseline (2016/17). Students who were in higher grades, lived in Alberta (Zuckermann et al., 2020), came from rural areas in Canada, or were obese had a higher probability of starting in a higher-use class (C2 to C4) rather than in C1 than students with other demographic characteristics. Therefore, provincial and federal governments should collaborate to develop specialized preventive programs tailored to the unique needs of different regions and demographic groups, particularly in areas with higher prevalence rates.

It is important to highlight that the risk and protective factors in this study were not handpicked from the relevant literature. Instead, they were chosen through a variable selection process using LASSO regression, so that the factors included in the model are supported by empirical evidence. This data-driven approach strengthens the validity and robustness of our findings and makes the basis for each factor's inclusion transparent. Combining machine learning techniques such as LASSO with multivariate latent variable modeling advances our understanding of the complexity of PSU patterns and their associated factors among youth, and illustrates how data science can be applied to real-world public health challenges. Together with a comprehensive examination of PSU patterns, these techniques allow a more nuanced analysis of the diverse pathways and interactions between different substances during adolescence.
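To illustrate how LASSO performs variable selection, the following is a minimal sketch on synthetic data, not the procedure or software used in this study. It fits a linear LASSO by cyclic coordinate descent in plain Python; the feature columns, the true coefficients, and the penalty value `lam = 0.3` are all assumptions made for illustration. In practice the penalty would usually be chosen by cross-validation.

```python
import random


def soft_threshold(rho, lam):
    """Shrink rho toward zero by lam; values in [-lam, lam] become exactly 0."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0


def center(values):
    """Subtract the mean so that no intercept term is needed."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimise (1/2n)*||y - X b||^2 + lam*||b||_1 for centred X and y."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X b, starting from b = 0
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Covariance of feature j with the residual that excludes feature j.
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new_beta = soft_threshold(rho, lam) / col_sq[j]
            delta = new_beta - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new_beta
    return beta


# Synthetic data (hypothetical): only features 0 and 2 truly affect the outcome.
random.seed(0)
n, p = 200, 5
columns = [center([random.gauss(0.0, 1.0) for _ in range(n)]) for _ in range(p)]
X = [[columns[j][i] for j in range(p)] for i in range(n)]
y = center([2.0 * X[i][0] - 1.5 * X[i][2] + random.gauss(0.0, 0.5) for i in range(n)])

beta = lasso_coordinate_descent(X, y, lam=0.3)
selected = [j for j, b in enumerate(beta) if b != 0.0]
print("coefficients:", [round(b, 3) for b in beta])
print("selected feature indices:", selected)
```

The L1 penalty sets the coefficients of weakly related features to exactly zero, so the features that remain form the selected set. The retained coefficients are shrunk toward zero, which is why LASSO is often used for selection and followed by a separate model for estimation.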

Study Limitations

This study has certain limitations. First, external validation data would be preferable for evaluating model performance, but data collection from the 2019/20 cycle onward has been challenging because of the COVID-19 pandemic. Second, caution should be exercised in interpreting and generalizing the results of the COMPASS study because many participating schools were recruited through non-random convenience sampling. As a result, the school samples may not fully represent the entire population and may exclude certain Canadian provinces (Yang et al., 2022). Third, our analyses are confined to the variables available in the Cq and are subject to self-report bias. Fourth, as with any large-scale health survey, non-response introduces missing values, which we imputed using MI techniques. Despite this, stakeholders agree that the COMPASS methodologies are reliable in handling the delicate balance between data accuracy and participant anonymity in longitudinal studies focused on youth health behaviors (Battista et al., 2019). Moreover, our findings are in line with previous research examining patterns of substance use among youth, which further supports the validity and generalizability of our results (Esser et al., 2021; Steinhoff et al., 2022). Fifth, factors not included in our analysis, such as peer influence, family dynamics, and environmental factors, could also influence substance use patterns among youth (Marschall-Lévesque et al., 2017; Russell et al., 2019). Future research may consider incorporating these additional factors to provide a more comprehensive understanding of youth substance use behaviors. Finally, one review indicates a significant decline in the prevalence of youth substance use during the pandemic (Layman et al., 2022). Nevertheless, it remains crucial to continue monitoring youth substance use in the post-pandemic years to understand trends and potential changes in behavior.

Conclusion

Our findings shed light on the prevalence and complexity of PSU among youth, emphasizing the significance of understanding the heterogeneity of substance use patterns and associated factors during this critical developmental period. By recognizing the unique risk factors associated with specific subgroups, targeted interventions can be developed to address the multifaceted nature of substance use behaviors among youth. Our data-driven analysis contributes to the existing literature by deepening our understanding of this pressing public health challenge, and it can inform the design of tailored, evidence-based prevention strategies that improve public health outcomes for adolescents. Understanding the complex relationship between substance use patterns and the many factors that influence them empowers us to protect the well-being of the younger generation and foster a healthier, more resilient society.