In early 2020, the global outbreak of coronavirus disease 2019 (COVID-19) had a severe impact on countries worldwide. Following the declaration of COVID-19 as a global pandemic by the World Health Organization on March 11, 2020 (Cucinotta & Vanelli, 2020), a majority of higher education institutions transitioned from on-site to online course delivery, requiring students to attend classes remotely (Luchetti et al., 2020). The practice of rapidly shifting to online instruction in response to a crisis or emergency is referred to as emergency remote teaching, and the resulting learning context is referred to as an emergency remote teaching environment (ERTE) (Hodges et al., 2020). As a result, a thorough exploration of the environmental and individual factors that might be related to students’ learning outcomes in the changed learning context became necessary. Previous studies had presented a variety of factors associated with learning outcomes in ERTEs during the COVID-19 pandemic, such as teacher support, course design, system quality, students’ interaction and communication with teachers and peers, and self-discipline (Su & Guo, 2021; Tzafilkou et al., 2021; Wang et al., 2021a). However, these studies mainly focused on factors that had already been examined in online learning environments. Their aim was to confirm the significance of associations between these factors and learning outcomes through a theory-driven and confirmatory approach, leaving open the question of which other, as yet unexamined, factors are associated with learning outcomes. Therefore, we aimed to conduct an exploratory data-driven study using a large-scale dataset (Aristovnik et al., 2020a) to explore environmental and individual factors that were significantly associated with learning outcomes. 
In addition, in order to obtain individualized insights, we aimed to group students into distinct profiles or classes based on their responses to the associated factors and examine the relations of these profiles to varying learning outcomes and student characteristics. The findings of this study have implications not only for ERTEs during the COVID-19 pandemic but also for those that might arise in the future due to natural disasters or other health crises (Dhawan, 2020; Whittle et al., 2020). The findings can provide insights for future research, practice, and policymaking aimed at improving the quality of education during crisis situations.

1 Literature review

The importance of environmental and individual factors in learning has long been evidenced in the field of educational psychology from several perspectives. For instance, the bioecological perspective (Bronfenbrenner, 1989) emphasizes the critical roles of multiple environmental and individual systems and their interactions in shaping individuals’ learning outcomes. The sociocultural perspective (Vygotsky, 1978) stresses the importance of social and cultural contexts in shaping learning. The sociocognitive perspective (Bandura, 1997) highlights a reciprocal relationship among environmental factors, personal factors, and learning outcomes. In emergency remote teaching environments (ERTEs) brought about by the COVID-19 pandemic, many researchers examined the roles of environmental and individual factors in learning outcomes. For example, based on the constructivist model of learning, Su and Guo (2021) conducted a study among 457 Chinese higher education students studying in ERTEs in early 2020 and found that system quality, course design, and student-student and student-content interactions positively predicted learning outcomes. Wang et al. (2021a) conducted another study among 7210 Chinese undergraduate students studying in synchronous ERTEs and found that teacher support and teacher innovation were associated with students’ learning outcomes directly and indirectly via improved academic self-efficacy. In addition, based on the theory of technology acceptance, Tzafilkou et al. (2021) conducted a study among 116 higher education students in ERTEs at a Greek university and found that acceptance of online lectures, collaborating with peers, and communicating with teachers were positively related to learning outcomes. Moreover, Su and Guo (2021) found that students’ self-discipline, a critical individual factor in online learning environments, promoted learning outcomes among 457 Chinese higher education students studying in ERTEs.

The environmental or individual factors analyzed in these studies were selected based on existing theories and findings of the relationships between these factors and learning outcomes in online learning environments (Su & Guo, 2021; Tzafilkou et al., 2021; Wang et al., 2021a). The purpose of these studies was to confirm the significance of the relationships through a theory-driven and confirmatory approach. However, ERTEs differ from typical online learning environments; therefore, there might exist potential factors unique to this special learning environment yet to be explored (Usher et al., 2021). For example, researchers suggested that access to learning technologies and stable internet is essential for learning in ERTEs (Dhawan, 2020; Erlam et al., 2021; Shin & Hickey, 2021). However, due to quarantine, some students may lack access to these resources, which may hinder their abilities to achieve satisfactory learning outcomes (Fong, 2022). Moreover, researchers pointed out that higher education students experienced negative emotions related to the pandemic, such as worry about job loss (Dhawan, 2020) and stress from the unpredictability of their circumstances and prospects (Fong, 2022; Usher et al., 2021). These negative emotions may also be adversely related to learning outcomes. Despite the potential significant associations of these factors with student learning outcomes in ERTEs, they had not been verified empirically. To the best of our knowledge, there had been limited data-driven research using appropriate data to explore important factors associated with learning outcomes in ERTEs. Therefore, a data-driven exploratory study was necessary to identify important environmental and individual factors in ERTEs that had not been found in theory-driven confirmatory research (Maass et al., 2018).

Furthermore, it is important to note that individual students might respond differently to various environmental and individual factors. These differences might relate to individuals’ characteristics such as gender, level of study, and field of study. Therefore, an individual-level analysis is necessary following the population-level analysis to fully understand these differences. To achieve this, we employed a person-centered approach that differs from the variable-centered approach. The variable-centered approach focuses on examining the general patterns of associations between factors, whereas the person-centered approach aims to identify different subgroups of students who shared similar patterns of responses to these factors (Laursen & Hoff, 2006; Rebetez et al., 2015).

1.1 The purposes of this study

In this study, we aimed to achieve two goals. Firstly, we aimed to gain a comprehensive understanding of the factors associated with learning outcomes in ERTEs during the COVID-19 pandemic at the population level. Secondly, we aimed to bridge the gap in understanding the connection between the population-level and individual-level analyses by identifying distinct student profiles based on these factors.

2 Methods

2.1 Dataset and participants

We utilized a large-scale dataset collected through a web-based global survey between May 5 and June 15, 2020, to evaluate the self-perceived impacts of the first wave of the COVID-19 pandemic on various aspects of higher education students’ lives and circumstances (Aristovnik et al., 2020a, 2021). The survey, adapted from the European Students’ Union survey, involved a total of 30,383 students from 62 countries (Aristovnik et al., 2020b). It consisted of 14 newly developed five-point Likert scales covering various aspects of higher education students’ learning outcomes, lives, and circumstances in ERTEs. The aspects of lives and circumstances included academic lives, emotional lives, social lives, infrastructure and skills, financial circumstances, support for institutional measures, hygiene habits, and changes in lifestyle. In addition, characteristic information, including age, gender, level of study, field of study, student status, student type, and country of study, was collected. In this dataset, we first eliminated participants and scale items with excessive missing values (> 18% and > 40%, respectively) and those with missing values considered “missing not at random.” Then, we imputed the missing data using the multiple imputation approach (Pickles, 2005) and removed unengaged respondents (i.e., those who provided identical responses across all scale items). Finally, we retained only countries with at least 50 participants to limit sampling error. As a result, 9418 participants from 41 countries were analyzed in our study (Fig. 1; Table 1).
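The screening steps described above can be sketched in a few lines. This is a minimal illustration in Python rather than the R code used in the study: the row layout (`country`, `answers`), the function names, and the order in which items and participants are screened are assumptions, and the multiple-imputation and "missing not at random" steps are omitted. Only the three thresholds come from the text.

```python
from collections import Counter

# Thresholds quoted in the text; every name below is hypothetical.
MAX_MISSING_PARTICIPANT = 0.18   # participants with > 18% missing are dropped
MAX_MISSING_ITEM = 0.40          # scale items with > 40% missing are dropped
MIN_PER_COUNTRY = 50             # countries need at least 50 participants


def missing_share(values):
    """Fraction of entries that are missing (None)."""
    values = list(values)
    return sum(v is None for v in values) / len(values)


def drop_sparse_items(rows, items, limit=MAX_MISSING_ITEM):
    """Keep scale items whose share of missing answers does not exceed the limit."""
    return [i for i in items
            if missing_share(r["answers"][i] for r in rows) <= limit]


def drop_sparse_participants(rows, items, limit=MAX_MISSING_PARTICIPANT):
    """Keep participants whose share of missing answers does not exceed the limit."""
    return [r for r in rows
            if missing_share(r["answers"][i] for i in items) <= limit]


def drop_unengaged(rows, items):
    """Remove respondents whose non-missing answers are all identical."""
    kept = []
    for r in rows:
        answered = {r["answers"][i] for i in items if r["answers"][i] is not None}
        if len(answered) > 1:
            kept.append(r)
    return kept


def keep_large_countries(rows, minimum=MIN_PER_COUNTRY):
    """Keep only rows from countries with at least `minimum` remaining participants."""
    counts = Counter(r["country"] for r in rows)
    return [r for r in rows if counts[r["country"]] >= minimum]
```

Applying the four functions in sequence reproduces the logic of the pipeline, although the resulting counts would match those reported here only if the same imputation and exclusion decisions were made.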

Fig. 1 Participant number for each country

Table 1 Characteristic Information of Participants

2.2 Data analysis

We utilized the R programming language (version 4.2.1) for data analysis (R Core Team, 2019). To perform exploratory factor analysis (EFA) and confirmatory factor analysis (CFA) on separate samples, we randomly divided the data into two subsamples using a 50/50 split. EFA was performed on one subsample to identify latent factors underlying the observed sets of scale items. The optimal number of factors to retain was determined by the scree plot of the eigenvalues of the correlation matrix. CFA was conducted on the other subsample to further examine the validity and reliability of the construct derived from EFA. A comprehensive set of indexes was examined in CFA for construct validation, including indicators of validity (construct, convergent, and discriminant validity), indicators of reliability, degree of fit to the observed data, and model invariance. Specifically, adequate construct, convergent, and discriminant validity of the factors were indicated by factor loadings above 0.60, average variance extracted (AVE) values above 0.50, and heterotrait-monotrait (HTMT) correlation ratios below 0.85, respectively (Awang et al., 2015). Adequate reliability of factors was indicated by composite reliability above 0.60 and alpha coefficients above 0.70, and that of items was indicated by item-total correlations above 0.40 (Awang et al., 2015). The degree of model fit was evaluated by four widely used fit indexes: the comparative fit index (CFI), the Tucker-Lewis index (TLI), the standardized root-mean-square residual (SRMR), and the root mean square error of approximation (RMSEA). CFI and TLI compare the model under evaluation with a baseline model; hence, they are considered relative indexes, with larger values indicating greater improvement over the baseline model. We considered 0.90 for CFI and TLI as the threshold for an acceptable fit (Hu & Bentler, 1999). 
SRMR and RMSEA evaluate how well the model reproduces the observed data; hence, they are considered absolute indexes, with smaller values indicating better fits to the observed data. We considered 0.10 and 0.08 for SRMR and RMSEA, respectively, as the thresholds for an acceptable fit (Hu & Bentler, 1999). We further investigated the model invariance across groups of students with different characteristics, such as gender, level of study, and field of study. The difference in CFI values (i.e., ΔCFI) between two nested models was used as an indicator (Meade et al., 2008). Invariance is deemed to hold when CFI decreases by no more than 0.01 between the models, that is, when |ΔCFI| ≤ 0.01 (Cheung & Rensvold, 2002).
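For readers who want to see how the convergent validity and reliability indicators relate to the factor loadings, the sketch below computes AVE and composite reliability with the commonly used formulas for standardized loadings and uncorrelated errors. It is an illustration in Python, not the study's R code; the function names, the example loadings, and the choice to treat the fit thresholds as inclusive are assumptions.

```python
def average_variance_extracted(loadings):
    """AVE: mean of the squared standardized loadings of one factor."""
    return sum(l * l for l in loadings) / len(loadings)


def composite_reliability(loadings):
    """Composite reliability from standardized loadings.

    Assumes uncorrelated errors, so each item's error variance is 1 - loading**2.
    """
    total = sum(loadings)
    error = sum(1 - l * l for l in loadings)
    return total * total / (total * total + error)


def acceptable_fit(cfi, tli, srmr, rmsea):
    """Apply the thresholds stated in the text (treated here as inclusive)."""
    return cfi >= 0.90 and tli >= 0.90 and srmr <= 0.10 and rmsea <= 0.08


# Hypothetical loadings for a three-item factor.
example_loadings = [0.7, 0.8, 0.9]
ave = average_variance_extracted(example_loadings)   # above the 0.50 cutoff
cr = composite_reliability(example_loadings)         # above the 0.60 cutoff
```

With these hypothetical loadings, both indicators clear the cutoffs given above.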

Based on the confirmed construct from CFA, the associations between the identified environmental and individual factors and learning outcomes were investigated by a structural equation modeling (SEM) analysis as our first goal. Characteristic variables of gender, level of study, and field of study were converted to the numeric type and included as covariates in the analysis. A mediation analysis was later performed using the bootstrapping method to test the significance of the indirect relationships between environmental and individual factors and learning outcomes mediated by two academic emotion factors (Baron & Kenny, 1986; Preacher & Hayes, 2004). The bootstrapping method was utilized as it allows for inference without distributional assumptions on indirect effect estimates (Zhang et al., 2014). A mediating effect is considered significant when the 95% confidence interval does not contain zero, and the magnitude of the effect is evaluated by the ratio of the indirect effect to the total effect (Preacher & Kelley, 2011; Zhao et al., 2021).
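The bootstrap test of an indirect effect can be illustrated with a deliberately simplified sketch. It uses a single observed mediator and ordinary least squares, whereas the study estimated two latent mediators within an SEM in R; all function names, the number of resamples, and the percentile interval are assumptions made for illustration.

```python
import random


def _mean(values):
    return sum(values) / len(values)


def slope(x, y):
    """OLS slope of y regressed on x (path a: factor -> mediator)."""
    mx, my = _mean(x), _mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx


def partial_slope(m, x, y):
    """Coefficient of m in the regression y ~ m + x (path b: mediator -> outcome)."""
    mm, mx, my = _mean(m), _mean(x), _mean(y)
    smm = sum((a - mm) ** 2 for a in m)
    sxx = sum((b - mx) ** 2 for b in x)
    smx = sum((a - mm) * (b - mx) for a, b in zip(m, x))
    smy = sum((a - mm) * (c - my) for a, c in zip(m, y))
    sxy = sum((b - mx) * (c - my) for b, c in zip(x, y))
    return (smy * sxx - smx * sxy) / (smm * sxx - smx ** 2)


def indirect_effect(x, m, y):
    """Product-of-coefficients estimate a * b."""
    return slope(x, m) * partial_slope(m, x, y)


def bootstrap_indirect_ci(x, m, y, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the indirect effect."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample cases with replacement
        estimates.append(indirect_effect([x[i] for i in idx],
                                         [m[i] for i in idx],
                                         [y[i] for i in idx]))
    estimates.sort()
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper
```

Under this scheme, the mediating effect would be judged significant when the returned interval excludes zero, mirroring the decision rule stated above.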

To achieve our second goal, identifying profiles or classes of students who exhibited similar response patterns to the environmental and individual factors, we performed a latent profile analysis (LPA). LPA is a model-based technique that can identify internally homogeneous and externally heterogeneous latent profiles within the population (Laursen & Hoff, 2006; Wang et al., 2021b). We identified the optimal profile model by comparing model fits using information criteria, likelihood ratio tests, and entropy (Lee & Chei, 2020; Tein et al., 2013). Lower values of the information criteria, namely the Akaike information criterion (AIC), the Bayesian information criterion (BIC), and the sample-size adjusted Bayesian information criterion (aBIC), indicate better model fit (Tein et al., 2013). The bootstrap likelihood ratio test (BLRT) examines the relative fit of two models with different numbers of profiles, with a significant BLRT p (< 0.05) indicating that the K-profile model fits better than the (K−1)-profile model (Lee & Chei, 2020; Tein et al., 2013). Higher entropy values (> 0.70) indicate a greater degree of discrimination among profiles (Tein et al., 2013). We also ensured that the smallest profile frequency was no less than 5.00% of the total population, as it is crucial to have a sufficient number of students in each profile to ensure meaningful interpretation (Lee & Chei, 2020). The stability of the optimal profile solution was assessed by randomly dividing the sample into two distinct subsamples using a 50/50 split (Biwer et al., 2021; Vansteenkiste et al., 2009). LPA was conducted on one subsample and validated on the other. 
Kruskal-Wallis tests and Dunn’s post-hoc tests for continuous variables and Pearson’s Chi-squared tests and corresponding post-hoc tests for discrete variables (Kruskal & Wallis, 1953; Pearson, 1900) were employed to compare the differences in learning outcomes, environmental and individual factors, and student characteristics among profiles. The SEM model scaled score for each environmental or individual factor in each class was depicted, which was computed by multiplying the factor’s latent value estimate by the absolute value of the standardized estimate of its total association with learning outcomes.
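The model-selection quantities used for LPA can be written out explicitly. The sketch below gives the standard definitions of AIC, BIC, the sample-size adjusted BIC, and the relative entropy statistic; it is a Python illustration with hypothetical function names, and it assumes that the log-likelihood, the number of free parameters, and the posterior class probabilities are supplied by whatever software fits the profile models.

```python
import math


def aic(log_lik, n_params):
    """Akaike information criterion."""
    return 2 * n_params - 2 * log_lik


def bic(log_lik, n_params, n_obs):
    """Bayesian information criterion."""
    return n_params * math.log(n_obs) - 2 * log_lik


def abic(log_lik, n_params, n_obs):
    """Sample-size adjusted BIC: n is replaced by (n + 2) / 24."""
    return n_params * math.log((n_obs + 2) / 24) - 2 * log_lik


def relative_entropy(posteriors):
    """Relative entropy in [0, 1]; 1 means perfectly separated profiles.

    `posteriors` holds one row per student with the posterior probability
    of membership in each of the K profiles.
    """
    n = len(posteriors)
    k = len(posteriors[0])
    uncertainty = -sum(p * math.log(p) for row in posteriors for p in row if p > 0)
    return 1 - uncertainty / (n * math.log(k))


def smallest_profile_share(assignments):
    """Share of students in the least populated profile."""
    counts = {}
    for label in assignments:
        counts[label] = counts.get(label, 0) + 1
    return min(counts.values()) / len(assignments)
```

A candidate solution would then be retained when its information criteria are lowest, its entropy exceeds 0.70, and its smallest profile share is at least 0.05, in line with the criteria described above.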

In the end, utilizing the classes identified through LPA, we developed a classifier based on the random forest algorithm to identify the membership of students’ classes, with the aim to establish a connection between new, unseen students and the characteristics of identified classes. The data was divided into training and testing sets using an 80/20 split. During the training process, 10-fold cross-validation was performed. Feature importance for classification was derived from the random forest method and compared with the importance calculated by Boruta (Kursa & Rudnicki, 2010), taking into account the correlations between factors. The study design framework is depicted in Fig. 2.
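The data partitioning behind the classifier can be sketched as follows. The example covers only the 80/20 split and the 10-fold cross-validation partition, not the random forest or Boruta steps; it is written in Python with hypothetical function names, and stratifying the split by class is an assumption, since the text does not state how the split was drawn.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_share=0.2, seed=0):
    """Split row indices into train/test so each class keeps the same share."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        cut = round(len(indices) * test_share)
        test.extend(indices[:cut])
        train.extend(indices[cut:])
    return sorted(train), sorted(test)


def k_folds(indices, k=10, seed=0):
    """Yield (training, validation) index lists for k-fold cross-validation."""
    rng = random.Random(seed)
    shuffled = list(indices)
    rng.shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        training = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield training, folds[i]
```

In such a setup, a classifier would be tuned on the ten training/validation pairs drawn from the training indices and evaluated once on the held-out test indices.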

Fig. 2 Framework of the design of this study

3 Results

3.1 Exploratory and confirmatory factor analyses

The scree plot of the eigenvalues in exploratory factor analysis (EFA) determined that ten latent factors underlay the observed scale items. These factors included five environmental factors, two individual factors, two academic emotion factors, and learning outcomes (Fig. 3). The five environmental factors included two social ones, perceived teacher support (TS) and satisfaction with administration support (AS), and three physical ones, satisfaction with synchronous course organization (SCO), satisfaction with asynchronous course organization (ACO), and accessibility to learning technologies (LT). Two individual factors were computer skills (CS) and worry about life, with CS being a cognitive quality factor and worry being a well-being factor. Two academic emotion factors were positive academic emotions (PAEs) and negative academic emotions (NAEs). Among these factors, PAEs and NAEs were considered as emotional mediators in the relationships between environmental and individual factors and learning outcomes in the structural equation modeling (SEM) analysis due to their relevance to learning tasks or outcomes (Pekrun, 2006; Pekrun et al., 2002). The scale items are listed in Table S1 in the supplementary information document.

Fig. 3 Structural equation modeling analysis results

Note. Values represent standardized coefficients; solid lines represent statistically significant associations, whereas dashed lines represent statistically insignificant associations. * = p < 0.10; ** = p < 0.05; † = p < 0.01; ‡ = p < 0.001

The validity and reliability of these items and factors were further examined by confirmatory factor analysis (CFA). All items exhibited adequate factor loadings between 0.615 and 0.897 and item-total correlations between 0.70 and 0.94 (Table S1). In addition, all factors displayed AVE values above 0.50, HTMT correlation ratios below 0.85, and both composite reliability and alpha coefficients ranging from 0.78 to 0.89. According to relative and absolute fit indexes (Hu & Bentler, 1999) (Table 2), the ten-factor CFA model demonstrated an acceptable to good model fit. Furthermore, the measurement invariance analysis revealed ΔCFI values within the 0.01 threshold, indicating that the model was invariant across various characteristic variables, including age, gender, level of study, field of study, student status, and student type.

Table 2 Assessment of Model Fits According to Criteria (Hu and Bentler, 1999)

3.2 Structural equation modeling and mediation analyses

The SEM model yielded an acceptable to good model fit (Table 2) and explained 20.79% of variance in PAEs, 22.67% of variance in NAEs, and 49.19% of variance in learning outcomes (Fig. 3). Results indicated that five factors had a significant positive and direct association with learning outcomes (Figs. 4 and S1). These factors, ranked by their standardized path coefficient magnitudes, were SCO (β = 0.234, p < 0.001), CS (β = 0.158, p < 0.001), ACO (β = 0.138, p < 0.001), AS (β = 0.095, p < 0.001), and TS (β = 0.051, p < 0.01). In contrast, worry had a significant negative and direct association with learning outcomes (β = -0.037, p < 0.01). The direct association between LT and learning outcomes was insignificant (β = 0.000, p > 0.10), and thus was not considered in further analyses.

Fig. 4 Data visualization of the direct associations of environmental and individual factors with learning outcomes

Note. Values on the y- and x-axes are latent value estimates; sloping lines represent regression equations in which the independent variable is an environmental or individual factor and the dependent variable is learning outcomes; the circumflexes (or “hats”) and intercepts are omitted from the regression equations for the sake of simplicity (Hallgren et al., 2019); the arrows represent the change in the dependent variable for a one-unit increase in the independent variable; values near arrows represent standardized coefficients. † = p < 0.01; ‡ = p < 0.001

The mediation analysis results are presented in Table 3. The mediating effects of PAEs and NAEs were both significant in the associations between TS (Standardized Estimate = 0.019; 0.018), AS (0.018; 0.010), SCO (0.019; 0.012), ACO (0.020; 0.015), CS (0.017; 0.009), and worry (-0.014; -0.048) and learning outcomes. The direct associations of TS, AS, SCO, ACO, and CS with learning outcomes were larger than their indirect associations via the mediators, whereas the indirect association of worry with learning outcomes was larger than its direct association. AS, SCO, ACO, and CS primarily related to learning outcomes indirectly via PAEs rather than NAEs; TS exerted an equal relation to learning outcomes via both PAEs and NAEs; and worry primarily related to learning outcomes indirectly via NAEs. Fig. 5 displays the data visualization of the mediating effects of PAEs and NAEs in the associations between environmental and individual factors and learning outcomes (Fritz & MacKinnon, 2008).
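The comparison of direct and indirect associations can be reproduced arithmetically from the standardized estimates quoted in the text. The sketch below sums the two indirect paths and the direct coefficient reported in Section 3.2 to obtain a total association and the ratio of indirect to total. It is an illustration in Python with hypothetical names; the values in Table 3 remain authoritative and may differ slightly because the inputs here are rounded.

```python
# (direct, indirect via PAEs, indirect via NAEs), as quoted in the text.
EFFECTS = {
    "TS": (0.051, 0.019, 0.018),
    "AS": (0.095, 0.018, 0.010),
    "SCO": (0.234, 0.019, 0.012),
    "ACO": (0.138, 0.020, 0.015),
    "CS": (0.158, 0.017, 0.009),
    "worry": (-0.037, -0.014, -0.048),
}


def decompose(direct, via_pae, via_nae):
    """Return (indirect, total, indirect / total) for one factor."""
    indirect = via_pae + via_nae
    total = direct + indirect
    return indirect, total, indirect / total


summary = {name: decompose(*values) for name, values in EFFECTS.items()}

for name, (indirect, total, ratio) in summary.items():
    print(f"{name:>5}: indirect={indirect:+.3f} total={total:+.3f} ratio={ratio:.2f}")
```

On these rounded inputs, worry is the only factor whose summed indirect association exceeds its direct association in magnitude, which agrees with the pattern described above.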

Fig. 5 Data visualization of the indirect associations of environmental and individual factors with learning outcomes via positive and negative academic emotions

Note. Values on the y- and x-axes are latent value estimates; sloping lines represent regression equations in which the independent variables are an environmental or individual factor and positive or negative academic emotions and the dependent variable is learning outcomes; the red and blue sloping lines represent the regression equations when the value of the environmental or individual factor equals 0 and 1, respectively; the circumflexes (or “hats”) and intercepts are omitted from the regression equations for the sake of simplicity (Hallgren et al., 2019); the highest vertical arrow in each figure represents the change in the dependent variable for a one-unit increase in the environmental or individual factor; values near arrows represent standardized coefficients; all associations are statistically significant

Table 3 Mediation Analysis Results

3.3 Latent profile analysis and classifier

An eight-profile model was determined to be optimal based on the lowest values of the information criteria (AIC, BIC, and aBIC), a significant BLRT p of 0.01, an acceptable entropy of 0.76, and a smallest profile frequency of 7.00% (Table 4). This model was further supported by its performance when tested on two subsamples. The eight profiles were further arranged by average scores for learning outcomes, with Class 1 having the lowest score and Class 8 having the highest score. Fig. 6 illustrates the scores for six environmental and individual factors, learning outcomes, and academic emotions in each profile. In addition, Tables 5 and 6 provide means, standard deviations, percentages, and results from Kruskal-Wallis tests and Pearson’s Chi-squared tests and their corresponding post-hoc tests. Table S2 lists the country distribution in each profile.

Table 4 Fit Indexes for Models With Different Numbers of Profiles
Table 5 Descriptive Information of Environmental and Individual Factors, Academic Emotions, and Learning Outcomes in Each Class
Table 6 Descriptive Information of Characteristics in Each Class

The heatmap in Fig. 6 reveals noticeable general patterns. As the class number increases, TS, AS, SCO, ACO, CS, and PAEs exhibit higher values, whereas worry and NAEs demonstrate lower values. In addition, the heatmap and line plot indicate distinct characteristics across different classes. For example, students in Class 5 tended to report the highest level of CS, while those in Class 2 tended to report the highest level of worry. Students in Classes 8 and 1 tended to report the highest and lowest SCO levels, respectively. The results of the Kruskal-Wallis tests and Dunn’s post-hoc tests revealed a significant variation in learning outcomes and academic emotions among profiles. Specifically, learning outcomes were significantly different between each pair of profiles, except between Classes 5 and 6 and between Classes 6 and 7. In addition, the level of PAEs increased as the class number increased, and PAEs were significantly different between each pair of profiles, except among Classes 5, 6, and 7. On the other hand, the level of NAEs was highest in Class 2 and decreased as the class number increased. NAEs were significantly different between each pair of profiles, except between Classes 4 and 5 and between Classes 6 and 7. Furthermore, the eight profiles were characterized by different levels of environmental and individual factors. In Class 1 (n = 698; % = 7.41), students demonstrated the lowest levels of TS, AS, SCO, ACO, and CS. As a result, this class was labeled as “Most Low-Achieving.” In Class 2 (n = 712; % = 7.56), students were characterized by the highest score for worry and below-average scores for other factors. Thus, Class 2 was designated as “Below Average With Highest Worry.” In Class 3 (n = 2055; % = 21.82), students displayed below-average scores for TS, AS, SCO, ACO, and CS. 
AS and CS in Class 3 were significantly higher than those in Class 2 and worry in Class 3 was significantly lower than that in Classes 2 and 1. This class was identified as “Below Average With Lower Worry.” In comparison to Class 3, students in Class 4 (n = 1152; % = 12.23) exhibited significantly higher scores for TS, AS, and ACO. However, TS, AS, and ACO in Class 4 were still below average. On the other hand, Class 4 showed a significantly lower score for CS than Class 3. Thus, this class was labeled as “Below Average With Lower CS.” Class 5 (n = 659; % = 7.00) students had the highest CS score, which was significantly higher than that in other classes, and scores for TS, AS, and SCO were all slightly above average. This class was labeled as “Above Average With Highest CS.” In both Class 6 (n = 1033; % = 10.97) and Class 7 (n = 2241; % = 23.79), students had scores for TS, AS, SCO, ACO, and CS above average. Compared with students in Class 6, those in Class 7 had significantly higher scores for SCO and CS but significantly lower scores for TS, AS, and ACO. Therefore, Class 6 was labeled as “Above Average With High TS, AS, ACO,” and Class 7 was labeled as “Above Average With High SCO, CS.” Lastly, in Class 8 (n = 868; % = 9.22), students had the highest levels of TS, AS, SCO, and ACO, which were all significantly higher than those in other classes. Students in Class 8 also had the lowest level of worry. This class was labeled as “Most High-Achieving.”

Fig. 6 Scores for six factors, learning outcomes, and academic emotions in each class. (a): heatmap; (b): line plot

Note. 100 random samples from each class are shown in the heatmap

The results of the Pearson’s Chi-squared tests and post-hoc tests revealed discrepancies between the proportions of different student characteristics (including gender, level of study, and field of study) in each profile and those in the population. Specifically, there was a significant imbalance in the representation of male and female students in Classes 5 and 1, with more male students in comparison to female students. On the other hand, Class 7 had a significantly disproportionate representation of female students in comparison to male students. In addition, there was a significant imbalance in the representation of students across different levels of study in Classes 4, 2, and 1, with a higher number of bachelor’s students and fewer master’s students. Conversely, Classes 5, 8, and 7 had a significantly higher proportion of master’s students, and Classes 8 and 7 had a significantly lower representation of bachelor’s students. Furthermore, there existed a significant imbalance in the distribution of students across various fields of study in different classes. Class 1 had a significantly higher representation of students studying applied sciences and a significantly lower representation of those studying social sciences. On the other hand, Class 8 had a significantly larger number of students studying social sciences and a significantly smaller number of those studying applied sciences. Moreover, there was a significantly lower representation of students studying natural and life sciences in Class 8. Class 2 had a significantly higher representation of students studying arts and humanities.

Fig. 7 illustrates the SEM model scaled scores for all six factors in each class. The results of the SEM analysis indicated that the association of SCO with learning outcomes was the strongest, and it was apparent that students in Class 1 were negatively impacted by the lowest level of this factor, whereas those in Class 8 benefited from the highest level of it. In addition, the association of CS with learning outcomes was the second strongest after that of SCO, which helps explain why students in Class 7, despite having lower scores for TS, AS, and ACO, had learning outcome scores similar to those in Class 6: their scores for SCO and CS were higher. The mediation analysis results indicated that worry about life was mainly related to learning outcomes via NAEs, which explained why Class 2, with the highest worry, reported the highest NAEs. The random forest-based classifier reached a testing accuracy of 0.904, with all factors significantly contributing to student classification (Fig. S2). Ranked by feature importance determined by random forest, the factors in descending order were SCO, worry, CS, ACO, TS, and AS. The Boruta algorithm ranked them as SCO, CS, worry, ACO, TS, and AS. CS and worry exhibited very similar feature importance.
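The SEM model scaled score referred to above is a simple product, as the following Python sketch shows. The function name and the numeric inputs are placeholders chosen for illustration and are not values taken from the study's tables.

```python
def scaled_score(latent_value, total_standardized_estimate):
    """Latent value estimate weighted by the magnitude of the total association."""
    return latent_value * abs(total_standardized_estimate)


# Placeholder inputs: a class scoring 1.2 units below the mean on a factor
# whose total standardized association with learning outcomes is 0.265.
low_class_score = scaled_score(-1.2, 0.265)

# Because the absolute value is used, a negatively associated factor keeps
# the sign of its latent value rather than being flipped.
negative_factor_score = scaled_score(1.0, -0.099)
```

The weighting makes differences on strongly associated factors stand out in the charts, while the sign still shows whether a class sits above or below the mean on that factor.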

Fig. 7 SEM model scaled scores for six factors in each class. (a): radar charts; (b): dot charts

Note. Values represent latent value estimates of factors multiplied by the absolute values of the standardized estimates of total associations with learning outcomes

4 Discussions

The sudden shift to emergency remote teaching environments (ERTEs) in higher education institutions worldwide due to the COVID-19 pandemic in early 2020 prompted a thorough investigation into the various factors that might be associated with students’ learning outcomes. Although previous studies had presented various environmental and individual factors associated with students’ learning outcomes, they were often theory-driven and focused on factors that had been commonly examined in online learning environments, thereby leaving room for unexplored factors. To address this issue, we employed a data-driven approach using a comprehensive dataset (Aristovnik et al., 2020a) containing responses from 9418 students in 41 countries attending courses in ERTEs. We identified six environmental and individual factors that were significantly related to learning outcomes and the mechanism by which they related to learning outcomes through positive and negative academic emotions. Moreover, we conducted a cluster analysis to identify eight profiles among students, each with unique characteristics related to the six factors, and found that different profiles were associated with varying learning outcomes and student characteristics. Furthermore, we built a classifier to identify each student’s profile, which enables the association of new, unseen students with the characteristics of identified profiles. Given the possibility of future global-scale ERTEs due to health crises or natural disasters (Dhawan, 2020; Whittle et al., 2020), the findings of this study provide valuable information for key stakeholders in higher education, including researchers, practitioners, and policymakers, on important factors associated with learning outcomes in ERTEs and implications for individualized interventions and support strategies to improve learning outcomes and mitigate educational disparities. 
The findings also contribute to the literature in the field of educational psychology by demonstrating how various environmental and individual factors are associated with learning outcomes via academic emotions in ERTEs. Finally, this study sheds light on integrating population-level and individual-level analyses using variable-centered and person-centered approaches in educational research to bridge the gap in understandings of general patterns and individual differences in students’ learning (Fig. 2), which aligns with the paradigm of precision education, an emerging trend in the field of education.

4.1 General patterns: What factors were associated with learning outcomes and how were they associated?

Our findings showed that six factors were associated with learning outcomes with varying magnitudes in ERTEs during the pandemic (Figs. 3 and 4). Efforts to improve learning outcomes in ERTEs should attend to all these factors and their respective magnitudes rather than focus on a single factor. Through factor analysis, we identified five environmental latent factors, that is, perceived teacher support (TS), satisfaction with administration support (AS), satisfaction with synchronous course organization (SCO), satisfaction with asynchronous course organization (ACO), and accessibility to learning technologies (LT). Among these factors, TS and AS were social factors, while SCO, ACO, and LT were physical factors. By conducting a structural equation modeling (SEM) analysis, we found that except for LT, all the identified environmental factors had a positive relationship with learning outcomes, indicating that the higher the levels of TS, AS, SCO, and ACO, the greater the students’ learning outcomes. Specifically, TS refers to the level of support perceived by students from their teachers. TS had been widely studied in the context of online learning environments in higher education, with research conducted in various countries such as the United States (Eom & Ashill, 2016), South Korea (Kang & Im, 2013), and China (Wang et al., 2021a) all finding that TS enhances learning outcomes. Another social factor, AS, refers to the quality of support provided by administrative staff and trained personnel that complements students’ learning resources or course materials. While providing a variety of AS had been emphasized as an integral part of quality online education in the literature (Blount, 2002; Carnwell & Harrington, 2001; Carroll-Barefield, 2006; LaPadula, 2003; Tait, 2000), its significance had yet to be empirically verified. 
We found that AS was significantly associated with learning outcomes in ERTEs, and this association was even stronger than that between TS and learning outcomes.

In addition to the two social factors that exhibit shared meanings, two physical factors, namely SCO and ACO, also exhibit shared meanings. Specifically, both SCO and ACO pertain to the perceived quality of online course organization, encompassing various elements such as the scheduling of course activities, organization of peer work, and assessment of assignments and papers (Schell & Janicki, 2013). The difference is that SCO is related to the perceived quality of the organization of courses that require real-time learning processes and feature real-time social interactions and instant feedback, whereas ACO is related to the perceived quality of the organization of courses that do not require real-time learning processes and lack real-time social interactions or instant feedback (Dhawan, 2020). SCO and ACO being identified as two distinct factors suggested that perceived quality of SCO in ERTEs did not necessarily translate to perceived quality of ACO. This distinction had not been made in previous studies. Both SCO and ACO were found to be significantly related to learning outcomes, and the relationships of both physical factors were stronger than those of the two social factors. Furthermore, SCO was found to have the strongest relationship with learning outcomes among all environmental and individual factors. This finding may be related to the quality of interactions and collaborations between students, as such interactions and collaborations were shown to be significantly associated with learning outcomes within the context of ERTEs (Su & Guo, 2021; Tzafilkou et al., 2021). Synchronous courses offer more opportunities for student-student interactions and collaborations than asynchronous courses (Dhawan, 2020). 
These findings align with those of the present study, which showed that satisfaction with the quality of synchronous course organization, including the organization of peer work and activities, had a stronger relationship with learning outcomes than satisfaction with asynchronous course organization.

Besides SCO and ACO, another physical factor, LT, refers to students’ accessibility to technologies that aid in learning while attending courses from home in ERTEs. Previous studies and position articles had suggested that accessibility to LT is essential for learning in ERTEs (Dhawan, 2020; Shin & Hickey, 2021). However, we found no significant association between LT and learning outcomes, indicating that more factors may be involved in this association or that this association may be contextual. For instance, LT may contribute to the increase or decrease in other factors that enhance learning outcomes, as suggested by a participant in a qualitative study conducted among U.S. higher education students attending courses in ERTEs. Specifically, the participant noted that inaccessibility to Wi-Fi led to difficulties in completing coursework on time and hindered communication with classmates and teachers, ultimately decreasing their academic performance (Shin & Hickey, 2021).

Besides the five environmental factors, two individual factors were identified, namely computer skills (CS) and worry about life. CS is a cognitive quality factor that refers to the student’s computer-related abilities and skills. The results showed that higher levels of CS were associated with higher levels of learning outcomes, which is consistent with the findings from previous studies, for instance, among higher education students studying in online environments at universities in China (Wan et al., 2008) and the U.S. (Wang et al., 2013). Apart from SCO, CS had the strongest relationship with learning outcomes among all environmental and individual factors, which emphasizes the importance of providing students with resources to improve their computer-related abilities and skills to improve learning outcomes in ERTEs. Worry is a well-being factor that refers to students’ concerns regarding their life circumstances during the pandemic. Results indicated that higher levels of worry were associated with lower levels of learning outcomes, suggesting that negative emotions related to the pandemic could worsen learning outcomes in ERTEs. This association had not been examined previously. In summary, multiple factors were associated with learning outcomes in ERTEs to varying degrees. When considering how to enhance higher education students’ learning outcomes in ERTEs, it is important to take into account all the associated factors and their respective magnitudes, rather than focusing on a single factor.

Finally, by testing the mechanism through which these factors related to learning outcomes via academic emotions, we found that the associations between all six factors and learning outcomes were partially mediated by positive academic emotions (PAEs) and negative academic emotions (NAEs) (Table 3; Fig. 5). Factors of TS, AS, SCO, ACO, and CS were positively related to students’ PAEs, including joy, hope, and pride, and were negatively related to their NAEs, including frustration, anger, anxiety, and hopelessness, linking to higher learning outcomes. All these factors were primarily related to learning outcomes directly without being mediated. In contrast, an increase in students’ worry was associated with a decrease in PAEs and an increase in NAEs, resulting in lower learning outcomes. Worry was primarily related to learning outcomes indirectly via NAEs, where the heightened worry about life during the pandemic was associated with increased negative emotions related to learning, subsequently relating to decreased learning outcomes. Although the mediating roles of PAEs and NAEs had been established between certain environmental and individual factors and learning outcomes, these roles in the context of ERTEs during crisis situations had been understudied. In this study, we examined their mediating roles between various environmental and individual factors and learning outcomes in ERTEs, which provides practical implications for developing interventions and support strategies to improve students’ learning-related emotional experiences and, ultimately, their learning outcomes. Additionally, this study contributes to the literature on academic emotions by providing evidence for a mediation model that examines the relationships between environmental and individual factors and learning outcomes mediated by PAEs and NAEs in ERTEs.
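
The logic of partial mediation described above can be illustrated with a minimal sketch. It is not the SEM fitted in this study: it uses ordinary least squares on observed (not latent) variables, and the data, path coefficients, and variable names (`worry`, `pae`, `nae`, `outcome`) are synthetic assumptions chosen only for illustration. The sketch shows how a total association decomposes into a direct path and two indirect paths, one through PAEs and one through NAEs.

```python
import numpy as np

def ols_slopes(y, *predictors):
    """Least-squares slopes of y on the predictors (intercept fitted, then dropped)."""
    design = np.column_stack([np.ones(len(y)), *predictors])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1:]

# Synthetic, illustrative data only (NOT the study's dataset or estimates).
rng = np.random.default_rng(0)
n = 2000
worry = rng.normal(size=n)
pae = -0.3 * worry + rng.normal(size=n)   # hypothetical path: worry -> PAEs
nae = 0.5 * worry + rng.normal(size=n)    # hypothetical path: worry -> NAEs
outcome = -0.1 * worry + 0.4 * pae - 0.4 * nae + rng.normal(size=n)

# a-paths: predictor -> each mediator
a_pae = ols_slopes(pae, worry)[0]
a_nae = ols_slopes(nae, worry)[0]

# direct path and b-paths: outcome regressed on predictor and both mediators
direct, b_pae, b_nae = ols_slopes(outcome, worry, pae, nae)

# indirect effects are products of the a- and b-paths
indirect_pae = a_pae * b_pae
indirect_nae = a_nae * b_nae

# total effect: outcome regressed on the predictor alone
total = ols_slopes(outcome, worry)[0]

print(f"direct={direct:.3f}  via PAEs={indirect_pae:.3f}  "
      f"via NAEs={indirect_nae:.3f}  total={total:.3f}")
```

In this linear setting the total effect equals the direct effect plus the two indirect effects, so a nonzero direct path alongside nonzero indirect paths corresponds to partial mediation. A full analysis would estimate these paths within a latent-variable SEM and test the indirect effects formally, for example with bootstrapped confidence intervals.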

4.2 Individual differences: How were students grouped into different classes based on the associated factors, and how were different classes associated with varying learning outcomes and student characteristics?

By conducting SEM using a variable-centered approach, we were able to identify the general patterns of six environmental and individual factors associated with learning outcomes in ERTEs at the population level. A more critical question is how we can apply this knowledge to improve students’ learning outcomes at the individual level. Usher et al. (2021) conducted a study among 358 undergraduate psychology majors studying in ERTEs and found that some students reported that their learning outcomes were primarily related to external factors such as the method of instructional delivery, while others reported that their learning outcomes were primarily related to internal factors such as difficulties with self-regulation and motivation. These findings highlighted the importance of considering individual differences and tailoring interventions and support based on these differences to improve individuals’ learning outcomes in ERTEs. To achieve this, we combined variable-centered and person-centered approaches (Fig. 2) using SEM and latent profile analysis (LPA) to bridge the gap in understandings of general patterns and individual differences (Laursen & Hoff, 2006).

By conducting LPA, we identified eight distinct profiles or classes among students with different characteristics related to the six environmental and individual factors (Table 5; Fig. 6). The profiles identified by LPA were arranged according to their scores for learning outcomes, with the lowest scoring class being Class 1 and the highest scoring class being Class 8. From Class 1 to Class 8, these profiles were named “Most Low-Achieving,” “Below Average With Highest Worry,” “Below Average With Lower Worry,” “Below Average With Lower CS,” “Above Average With Highest CS,” “Above Average With High TS, AS, ACO,” “Above Average With High SCO, CS,” and “Most High-Achieving.” The different classes demonstrated not only that students were heterogeneous in terms of learning outcomes and their six associated factors, but also that the heterogeneity was associated with student characteristics of gender, level of study, and field of study (Table 6). For example, gender could be an important attribute for understanding the heterogeneity in learning outcomes. Class 1 had a disproportionate representation of males, whereas Class 7 had a disproportionate representation of females. Gender could also be relevant to computer skills because Class 5 with the highest CS level had a relatively large proportion of males and a relatively small proportion of females. This result aligns with previous research showing that male undergraduate students had higher levels of computer knowledge than their female counterparts (He & Freeman, 2010). Level of study was another important attribute for understanding the heterogeneity. Classes 1, 2, and 4 had a disproportionate representation of bachelor’s students, whereas Classes 5, 7, and 8 had a disproportionate representation of master’s students. 
The findings suggested that graduate students studying in ERTEs were more likely to be grouped into classes with greater learning outcomes, TS, AS, SCO, ACO, and CS and less worry compared to undergraduate students. This is reasonable because, according to previous studies, in ERTEs during the COVID-19 pandemic, graduate students reported greater academic performance, more support from teachers, greater access to academic advising and learning support services, fewer difficulties in attending synchronous online classes, more familiarity with learning technologies, and less financial stress compared to their undergraduate counterparts (Dial et al., 2021; Meletiou-Mavrotheris et al., 2022; Soria et al., 2020). One possible explanation for the superior learning outcomes of graduate students in ERTEs is their higher levels of critical thinking skills and lower tendencies towards academic procrastination in comparison to their undergraduate counterparts in online learning environments (Artino & Stephens, 2009). Moreover, field of study could be an important attribute for understanding the heterogeneity across classes. Class 1 had a relatively large proportion of students studying applied sciences and a relatively small proportion of those studying social sciences. Conversely, Class 8 had a relatively large proportion of students studying social sciences and a relatively small proportion of those studying applied sciences and natural and life sciences. Prior research found that students studying law and education reported the greatest academic performance in ERTEs, while those studying engineering reported the least (Meletiou-Mavrotheris et al., 2022). This is not surprising because applied sciences need more lab and hands-on classes than social sciences (Meletiou-Mavrotheris et al., 2022). These types of classes are challenging to conduct effectively and require more support in virtual and remote settings (Meletiou-Mavrotheris et al., 2022). 
As a result, the learning outcomes and satisfaction with course organization of students who study applied sciences may be adversely influenced in ERTEs. Field of study could also be relevant to the level of worry about life. Class 2 with the highest worry had a higher representation of students studying arts and humanities. Prior research revealed that in ERTEs during the pandemic, arts and humanities students at U.S. universities had higher levels of anxiety than students in other majors (Chirikov et al., 2020).

While the SEM analysis revealed the factors associated with learning outcomes, it offered limited insights into effectively improving these factors for different classes of students. Although SEM suggested that a one-unit increase in SCO was associated with the greatest increase in learning outcomes, the support required to improve SCO may differ between students with high and low levels of SCO. Providing the same level of support to improve SCO for the entire population may not be the optimal approach, as it may not address the specific needs of students in individual classes. Similarly, the support needed to improve other factors may also vary based on the different levels of these factors. For example, students in Class 5 with the highest CS level may benefit more from support aimed at improving other factors. Meanwhile, for students in Class 2 with the highest worry level, reducing their worry during the pandemic may be a priority. Students in Class 3 may benefit from support to improve their TS, AS, and ACO, while those in Class 4 may require support to enhance their CS. In addition, students in Class 7 may benefit from support to improve their TS, AS, and ACO, while those in Class 6 may benefit from support to develop their SCO and CS. Finally, students in Class 1 may require comprehensive support due to their lowest levels of TS, AS, SCO, ACO, and CS and high worry.

Using LPA, we grouped students into different classes and analyzed their distinct learning patterns. We also studied how the heterogeneity was related to students’ characteristics, such as gender, level of study, and field of study. The findings have implications for the development of adaptive and personalized interventions and support strategies to cater to the unique needs of individual students, with the ultimate goal of promoting their learning outcomes (Luan et al., 2020) and reducing disparities and promoting equity in education in ERTEs (Rose & Meyer, 2002; Tomlinson, 2014). The use of clustering techniques, such as LPA, can provide opportunities for timely interventions and support to optimize learning outcomes based on individual student learning patterns (Hart, 2016; Lu et al., 2018; Luan et al., 2020). Following clustering, we employed a classifier to check class characteristics, aiming to better identify the classes of future individuals and optimize support for them (Fig. S2 in the supplementary information document). Classification techniques can predict the membership of predefined groups for new, unseen data instances based on their characteristics (Soofi & Awan, 2017). Incorporating the two machine learning techniques alongside various technologies and tools, such as statistical analysis, data visualization, data mining, text mining, and other artificial intelligence techniques, can aid in achieving the goal of precision education or personalized learning, which is an emerging trend in the field of education that deviates from the traditional one-size-fits-all approach (Luan et al., 2020; Madhavan & Richey, 2016). 
The precision education approach aims to personalize the learning experience by taking into account the unique characteristics and needs of each student, including their learning environments and strategies (Hart, 2016; Luan et al., 2020), to improve the diagnosis, prediction, and treatment of learning outcomes (Lu et al., 2018). Previous research in the field of education has encountered difficulties in providing real-time, actionable feedback based on the learning data collected from and about students, as well as other stakeholders such as teachers and administrators (Drigas & Leliopoulos, 2014; Madhavan & Richey, 2016). Precision education techniques have the potential to alleviate this issue.
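
The two-step workflow of clustering followed by classification can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure used in this study: LPA is approximated by a diagonal-covariance Gaussian mixture fitted with a basic EM loop, the classifier is a nearest-centroid rule standing in for whichever classifier is preferred, and the data, the number of profiles (`k = 2`), and all function names are hypothetical.

```python
import numpy as np

def posterior(X, weights, means, variances):
    """Posterior probability of each profile for every row of X."""
    log_p = -0.5 * (((X[:, None, :] - means) ** 2) / variances
                    + np.log(2 * np.pi * variances)).sum(axis=2)
    log_p = log_p + np.log(weights)
    log_p -= log_p.max(axis=1, keepdims=True)   # numerical stability
    p = np.exp(log_p)
    return p / p.sum(axis=1, keepdims=True)

def fit_profiles(X, k, n_iter=50):
    """Fit a diagonal Gaussian mixture by EM (a basic stand-in for LPA)."""
    n = len(X)
    # Deterministic start: rows at evenly spaced quantiles of the total score.
    order = np.argsort(X.sum(axis=1))
    picks = np.linspace(0, n - 1, k + 2)[1:-1].astype(int)
    means = X[order[picks]].copy()
    variances = np.tile(X.var(axis=0), (k, 1))
    weights = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        resp = posterior(X, weights, means, variances)   # E-step
        nk = resp.sum(axis=0)                            # M-step below
        weights = nk / n
        means = (resp.T @ X) / nk[:, None]
        variances = (resp.T @ X**2) / nk[:, None] - means**2 + 1e-6
    return weights, means, variances

# Synthetic scores on three indicators for two hypothetical profiles.
rng = np.random.default_rng(0)
low = rng.normal(loc=2.0, scale=0.5, size=(200, 3))
high = rng.normal(loc=4.0, scale=0.5, size=(200, 3))
X = np.vstack([low, high])
k = 2

# Step 1 (clustering): assign each student to the most probable profile.
weights, means, variances = fit_profiles(X, k)
labels = posterior(X, weights, means, variances).argmax(axis=1)

# Step 2 (classification): learn a rule from the labeled students.
centroids = np.vstack([X[labels == c].mean(axis=0) for c in range(k)])

def classify(X_new):
    """Assign new students to the profile with the nearest centroid."""
    dists = ((X_new[:, None, :] - centroids) ** 2).sum(axis=2)
    return dists.argmin(axis=1)

new_students = np.array([[2.1, 1.9, 2.0], [4.2, 3.8, 4.1]])
predicted = classify(new_students)
print(predicted)
```

Separating the two steps means the profiles are estimated once from the analyzed sample, after which new, unseen students can be assigned to a profile from their indicator scores alone, without refitting the clustering model.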

In summary, exploring various factors that were associated with learning outcomes can help identify general areas for improving overall learning outcomes. In addition, identifying specific factors and student characteristics that are crucial for certain profiles is important not only for optimizing individualized interventions and support but also for reducing educational disparities. Moreover, classification based on knowledge derived from the analyzed data may benefit a wider range of students. In educational research, it is important to embrace a variety of approaches to gain a comprehensive understanding of and optimal implications for students’ learning. The choice of prioritizing one approach over the other depends on the specific learning contexts and goals.

4.3 Limitations and future directions

This study offers valuable insights; however, it is important to acknowledge certain limitations. The data was collected using the convenience sampling method; thus, the dataset represents a sample of the global population of higher education students (Aristovnik et al., 2020a). In addition, the data was collected online and, as such, the dataset represents only students with access to electronic devices and the internet (Raccanello et al., 2022). Moreover, the measurement of learning outcomes relied solely on students’ self-reported scores. Furthermore, the data was collected in ERTEs during the COVID-19 pandemic, and no comparable data was collected in traditional learning contexts. As a result, it is not possible to compare the results of this study with those in standard learning environments or to investigate the impacts of transitioning learning contexts on the relationships among factors and the mechanism by which environmental and individual factors relate to learning outcomes through academic emotions. Finally, not all potential associated factors were incorporated in our study.

In order to improve the generalizability of the study results, researchers could consider alternative methods of data collection in future studies (Raccanello et al., 2022). Course grades or grade point average (GPA) in ERTEs retrieved from the institutions’ course systems can be indicators of students’ learning outcomes in addition to self-reported scores. Additional characteristic information such as individual cognitive variables, socioeconomic status, and race can be collected and added as covariates to gain a more accurate understanding of general SEM patterns and a more detailed understanding of individual profiles. The items, which were grouped together as factors in exploratory factor analysis (EFA) and validated in confirmatory factor analysis (CFA) using separate subsamples, show promise for measuring specific factors. Experts in test development can further adapt these items, develop scales, and conduct more detailed psychometric testing among specific populations in specific contexts for construct measurement in future empirical studies. Researchers can employ qualitative techniques in future studies to understand why some factors are or are not associated with learning outcomes and how students in different profiles perceive certain factors differently. Longitudinal studies can also be conducted to examine how environmental and individual factors cause better or worse learning outcomes. Moreover, data in traditional learning contexts during non-pandemic situations can be collected in the future to investigate how these important factors might be associated with learning outcomes differently. As learning management systems (LMSs) become increasingly prevalent, more big data from LMSs will become available (Fischer et al., 2020), providing an opportunity to explore factors associated with learning outcomes more comprehensively. 
Finally, researchers can bridge data-driven and theory-driven techniques in future studies by accounting for potential factors frequently examined in online learning environments, such as system quality, course design, students’ interaction and communications with teachers and peers, and self-regulation (Eom & Ashill, 2016; Su & Guo, 2021; Tzafilkou et al., 2021; Wang et al., 2021a).

5 Conclusion

In emergency remote teaching environments (ERTEs) during the COVID-19 pandemic, multiple factors were associated with the learning outcomes of higher education students with varying magnitudes. Based on these factors, various profiles were identified among students, each corresponding to varying learning outcomes and student characteristics. Incorporating population-level and individual-level analyses utilizing structural equation modeling (SEM) and latent profile analysis (LPA) techniques has bridged the gap in understandings of general patterns and individual differences regarding factors associated with learning outcomes. By integrating the classification technique with LPA, the insights obtained from studying specific student profiles can be extended to benefit a wider range of students. It is crucial to have a comprehensive understanding when designing individualized interventions and support strategies to enhance the learning outcomes of higher education students and mitigate educational disparities in ERTEs during crisis situations.