Introduction

The year 2020 was very challenging, especially because the COVID-19 pandemic spread worldwide, impacting both individuals and society and causing major changes in various fields, including education. This challenging situation was also faced by the 270 million Indonesians living in a large archipelagic nation consisting of 34 provinces (The World Bank 2020). For teachers, the COVID-19 pandemic is a quintessential adaptive and transformative challenge, one for which there is no pre-configured playbook that can guide appropriate responses (Reimers et al. 2020).

In the field of education, one of the challenges is the continuation of teaching and learning processes during a pandemic (Reimers et al. 2020). To ensure the continuation of these processes, higher education in Indonesia changed from the traditional classroom system to online learning (Habibie et al. 2020). Even though lecturers had been trained to conduct online learning, online lectures were almost never carried out in Indonesia before the pandemic. However, online learning has since become a valuable solution that enables teaching to continue during a pandemic (Harto 2020). Compared with other levels of education (e.g., senior high school), higher education has special characteristics because university students come from all regions of Indonesia. Moreover, long before the pandemic, the implementation of online learning at the tertiary level faced various obstacles, such as funding and the availability of experts, as well as the difficulty of providing equitable internet access across 14,000 islands (Diki 2013).

In contrast to Indonesia, various countries have been investing in the development of educational technology since the late 1990s in the hope that its use in schools would improve the quality of learning outcomes (Huang et al. 2010), although critics doubted the effectiveness of using technology in the classroom environment (Yang et al. 2018). However, over the past 20 years, proponents of online learning have argued that, compared with traditional teaching, online lectures facilitate learning in the context of higher education (Li et al. 2014; McPhee and Söderström 2012).

Amid this debate, one important point that should be addressed is the online learning environment, given that online and traditional classroom contexts are very different (Trinidad et al. 2005; Yang et al. 2018). These differences affect academic performance (Demian and Morrice 2012), the quality of learning (Haro et al. 2019), and teacher–student interactions (Marks et al. 2005; Wright 2015). In addition, the implementation of online lectures in Indonesia during the pandemic was rather sudden (Habibie et al. 2020), whereas institutional competency development should be carried out carefully to prepare facilities for online lectures (Gregory and Lodge 2015). Therefore, the online learning environment needs to be analysed from students' perspectives, especially because there is no standard for online lectures in Indonesia, which means that the online learning contexts experienced by students vary considerably.

To evaluate the introduction of online learning from the perspective of students, suitable instruments are needed. Unfortunately, such instruments are very difficult to find in the existing literature in Indonesia, even though learning environment questionnaires are known to be an important source of information for curriculum preparation and program evaluation (Spinner and Fraser 2005), including the preparation of standards for implementing online learning. In addition, an understanding of the online learning environment provides various benefits, such as the ability to adapt instruction for students with different levels of academic knowledge (van Seters et al. 2012).

An analysis of existing studies revealed several instruments developed specifically to measure online learning environments from student perspectives, such as the Online Learning Environments Survey (OLES; Trinidad et al. 2005). Additionally, there are instruments that focus on readiness and student perceptions, such as the Online Learning Readiness Scale (OLRS; Hung et al. 2010) and, more recently, the Online Learning Perception Scale (OLPS; Wei and Chou 2020); Wei and Chou also tested the interrelations between the OLPS and OLRS. The literature also records instruments developed for traditional classroom contexts, such as the What Is Happening In this Class? (WIHIC; Fraser et al. 1996) questionnaire. The WIHIC can also be adapted to online learning contexts, as was done for the OLES. Adapting the WIHIC offers various advantages from a theoretical point of view because the aspects it covers are very broad and comprehensive (Rahayu et al. in press).

Although the OLES, which contains three aspects of the WIHIC (Fraser et al. 1996), covers a broad spectrum of online learning environments, developing new instruments based on observations of web-based learning contexts in Indonesia can provide alternatives. A similar approach was followed in previous studies in which the WIHIC was adapted to new contexts (Chionh and Fraser 2009). With an online learning environment instrument based on the perspective of students, lecture quality can be improved by attending to important points such as student–lecturer interaction (Wright 2015), access (Li and Tsai 2017), and resources, and the instrument can serve as a reference for comparing traditional and online classrooms (Yang et al. 2018).

However, online lectures are affected by various issues, such as lecturers' inadequate technology skills and the risk of being unable to resolve technology-related problems during online classes, which can limit student access to learning materials (Roddy et al. 2017). In addition, online teaching requires good interaction and communication between lecturers and students (Easton 2003), whether via e-mail, chat, live questions in class, or feedback (Roddy et al. 2017). Another challenge in conducting online lectures is the absence of face-to-face interaction, which means that the feedback given by lecturers must be immediately understandable to students (Darabi et al. 2006). In contrast to lower levels of education (e.g., junior and senior high school), online lectures must also include activities involving consultation between lecturers and individual students (e.g., thesis guidance). Online teacher support should involve effective monitoring of student progress, anticipation and resolution of key learning queries, and establishment and maintenance of student–teacher relationships. Collectively, these skills shape the effectiveness of online instructors and, in turn, the entire student experience (Roddy et al. 2017).

Considering that all of these aspects need to be verified and evaluated by students, the aims of this study were to (1) develop an instrument, the Online Classroom Learning Environment Inventory (OCLEI), to assess students' perceptions of the online learning environment, (2) build its constructs, (3) validate the instrument via factor analysis, (4) perform Rasch analysis to obtain detailed psychometric properties for the OCLEI, and (5) test its interrelations with instruments that assess other variables related to the construct.

Methods

Instrument development and data-analysis procedures

The development of an instrument for assessing students' perceptions of the online learning environment in Indonesia was carried out with reference to previous studies conducted by Trinidad et al. (2005), but with several different statistical methods. The four development stages are described below.

Phase 1: Observation in the online classroom

In the initial stage, observations were carried out during eight online lecture meetings, generating information about the online learning environment. This approach, which is a hallmark of the field today (Tobin and Fraser 1998), was used in previous studies (e.g., Chionh and Fraser 2009). The observations showed that the environment was, in fact, covered well by aspects of the WIHIC questionnaire, although several elements needed to be added (e.g., Trinidad et al. 2005). Therefore, a literature review was carried out to develop the instrument's framework while retaining several aspects of the WIHIC.

Phase 2: Framework and literature review

The theoretical framework was prepared with an approach in line with the study by Trinidad et al. (2005). The first aspect, access, was extrapolated from the observations, which showed that access to learning is a common concern in lectures. In our study, access builds on the concept of the smart classroom, in which it is measured under the name resources (Yang et al. 2018); it extends the computer usage aspect available in the OLES (Trinidad et al. 2005) as well as in the Technology-Rich Outcomes-Focused Learning Environment Instrument (Aldridge et al. 2004). In addition, this aspect is closely related to accessibility elements of traditional classroom environments (Yang and Huang 2015).

The next aspect, interaction, is drawn from the student communication and collaboration features of the OLES (Trinidad et al. 2005), which have also been developed as an independent construct (Wright 2015). Based on the observations, three aspects of the WIHIC are in line with the conditions of online learning in Indonesia: teacher support, equity, and investigation (Fraser et al. 1996). These three aspects work very well in the Indonesian context (Margianti 2002; Rahayu et al. in press), but not all of them have been adapted to the context of online learning. Equity and investigation are aspects of the Smart Classroom Inventory (Li et al. 2015), teacher support and investigation are aspects of the Classroom Environment Evaluation Scale (CEES; Yang and Huang 2015), and teacher support and equity have been adapted to the context of online learning (Trinidad et al. 2005). Because this instrument was developed specifically for higher education, the term teacher support was changed to lecturer support.

Phase 3: Item writing and final version of instruments

After item writing and a focus-group discussion stage, the final version was administered to respondents consisting of active students who had attended online lectures; it was then tested for construct validity using factor analysis and the Rasch model. The Online Classroom Learning Environment Inventory (OCLEI) assesses university students' evaluations of the online learning environment in Indonesia with 25 items that measure five aspects (see Table 1). Each item has a 4-point Likert response scale ranging from 1 to 4 for the clearly ordered responses Strongly Disagree, Disagree, Agree, and Strongly Agree. All items are positively worded, so the highest response value reveals a positive evaluation of the online learning environment on each subscale. The OCLEI was administered to respondents using an online survey system.
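To illustrate scoring concretely, the minimal sketch below computes subscale scores from raw responses. The item-to-subscale mapping is hypothetical (five items per aspect, in item order); the actual assignment of the 25 items follows Table 4.

```python
import numpy as np

# Hypothetical item-to-subscale mapping: five items per aspect, in order.
SUBSCALES = {
    "access": range(0, 5),
    "interaction": range(5, 10),
    "lecturer_support": range(10, 15),
    "equity": range(15, 20),
    "investigation": range(20, 25),
}

def score_oclei(responses: np.ndarray) -> dict:
    """responses: (n_persons, 25) array of 1-4 Likert codes.

    Returns the mean score per subscale; because all items are positively
    worded, higher values indicate a more positive evaluation."""
    return {name: responses[:, list(idx)].mean(axis=1)
            for name, idx in SUBSCALES.items()}

# Example with two simulated respondents
rng = np.random.default_rng(0)
print(score_oclei(rng.integers(1, 5, size=(2, 25))))
```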

Table 1 Framework for evaluating online learning environments

Phase 4: Statistical analysis for validation

Data collection was carried out using an online survey system: of 750 invited respondents, 692 participated in this study (response rate = 92.2%), and 23 of these were excluded from the data analysis through listwise deletion because of missing data. The final sample consisted of 669 Indonesian students (454 females, 215 males) aged 18–31 years (mean age = 21.11, SD = 2.55); 596 respondents were undergraduate students (89.1%) and 73 were graduate students (10.9%), while 217 respondents came from metropolitan areas (32.4%) and 452 from rural areas (67.6%). Participants also completed informed consent forms that made clear that this research would be published while maintaining the confidentiality of respondents' identities.
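For transparency about the listwise-deletion step, a minimal sketch is shown below, assuming the survey export is a CSV with one row per respondent and the 25 items in columns named item_1 ... item_25 (both the file name and the column names are hypothetical).

```python
import pandas as pd

items = [f"item_{i}" for i in range(1, 26)]
df = pd.read_csv("oclei_survey.csv")        # hypothetical survey export

complete = df.dropna(subset=items)          # listwise deletion
print(f"kept {len(complete)} of {len(df)} respondents; "
      f"{len(df) - len(complete)} dropped for missing data")
```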

Item factor analysis

Confirmatory Factor Analysis (CFA) is a method commonly used to evaluate the factor structure of the theory behind an instrument; both the number of factors and the factor loadings are assessed through CFA (Harrell-Williams and Wolfe 2013). The standard CFA model, which assumes that data are continuous, is often applied to ordinal-scale (e.g., Likert) data. In contrast, if the data are treated as categorical, Item Factor Analysis (IFA) is used, which incorporates thresholds into the estimation process (Cai 2010a; Forero et al. 2009).

IFA can be used as an exploratory or confirmatory tool and has a long history of application to the evaluation of instruments in education and psychology (Cai 2010a, 2010b). When a study is carried out to evaluate the psychometric characteristics of an instrument and the number of response options for each item is small (e.g., five or fewer), IFA should be applied (Rhemtulla et al. 2012; Wirth and Edwards 2007). With IFA, we utilised several fit statistics and indices, namely, the root mean square error of approximation (RMSEA), comparative fit index (CFI), Tucker–Lewis Index (TLI), standardised root mean square residual (SRMR), and chi-square (\(\chi^2\)). The following standards for good fit were used: RMSEA < 0.050, CFI and TLI > 0.950, SRMR < 0.080, and a nonsignificant \(\chi^2\) (Wang and Wang 2020). In this study, the IFA was conducted with the Mplus 8.4 program using the weighted least squares means and variance adjusted (WLSMV) estimator.
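As a simple illustration of how these cutoffs operate together, the sketch below screens a set of fit values against the Wang and Wang (2020) criteria; the values shown are the 5-factor results reported later in Table 2.

```python
def acceptable_fit(rmsea: float, cfi: float, tli: float, srmr: float) -> bool:
    """Screen model fit against RMSEA < .050, CFI/TLI > .950, SRMR < .080."""
    return rmsea < 0.050 and cfi > 0.950 and tli > 0.950 and srmr < 0.080

# 5-factor OCLEI model (Table 2)
print(acceptable_fit(rmsea=0.032, cfi=0.975, tli=0.972, srmr=0.041))  # True
```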

Multidimensional Rasch analysis

Based on the IFA results, the analysis was continued with the multidimensional random coefficients multinomial logit model (Adams et al. 1997), also known as the multidimensional Rasch model (Shih et al. 2013; Wang et al. 2006). This model is a generalisation of the simple Rasch model (Rasch 1960) and of polytomous Rasch models (e.g., Andrich 1978).

If a test consists of several unidimensional subtests (e.g., the five subscales of the OCLEI), it can be calibrated using standard Rasch procedures: the unidimensional Rasch model can be applied either to the test as a whole or to each subscale separately, one at a time (e.g., Pichardo et al. 2018). However, the unidimensional approach ignores the claimed subscale structure of the test (Wang et al. 2004, 2006) because no correlations between latent traits are estimated; one can only compute correlations between the person trait estimates for the different aspects.

To take the correlations between latent traits into account, one needs a multidimensional model that calibrates all the subtests simultaneously and exploits the correlations to increase measurement precision. Because, at least in theory, there are always non-zero correlations between latent traits, the multidimensional approach is more appropriate than the unidimensional one; moreover, the higher the correlations, the greater the gain in measurement precision from the multidimensional approach (Wang et al. 2004, 2006).
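Concretely, in a standard textbook formulation of the multidimensional rating scale model (a sketch of the general form, not the exact ConQuest parameterisation), the probability that person n responds in category k of item i, where item i loads on dimension d(i), is

\[ P\left(X_{ni}=k \mid \boldsymbol{\theta}_{n}\right) = \frac{\exp \sum_{j=1}^{k}\left(\theta_{n,d(i)}-\delta_{i}-\tau_{j}\right)}{\sum_{l=0}^{m} \exp \sum_{j=1}^{l}\left(\theta_{n,d(i)}-\delta_{i}-\tau_{j}\right)}, \qquad \sum_{j=1}^{0}(\cdot) \equiv 0, \]

where \(\delta_{i}\) is the item location, the thresholds \(\tau_{j}\) are shared by all items within a scale (the rating scale constraint), and m = 3 for the OCLEI's four response categories.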

In this study, the multidimensional version of the rating scale model (Andrich 1978) was fitted with the ACER ConQuest 5.13 program, using marginal maximum likelihood estimation for the item parameters and a Monte Carlo-based approach with 2000 nodes for the person parameters. The analysis provided information about item fit: Infit and Outfit MNSQ values between 0.6 and 1.4 indicate that an item fits the model for Likert-type scales (Wright and Linacre 1994). The functioning of the 4-point Likert scale was also investigated, with an Outfit MNSQ < 2.0 indicating that a response category functions well (Linacre 1999). The Rasch analysis also yielded person separation reliabilities for each aspect of the OCLEI in the form of plausible values (PV) reliability indices, which estimate how reliably the items distinguish students' underlying trait levels (Fulmer et al. 2015); values above 0.70 are regarded as acceptable (Fauth et al. 2019).
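As an illustration of the fit statistics named above, the sketch below computes Infit and Outfit MNSQ for one item from observed scores, model-expected scores, and model variances (all values hypothetical; in this study these quantities come from ACER ConQuest).

```python
import numpy as np

def mnsq(observed, expected, variance):
    """Rasch fit statistics: Outfit is the unweighted mean of squared
    standardised residuals; Infit is the information-weighted version."""
    z2 = (observed - expected) ** 2 / variance
    outfit = z2.mean()
    infit = ((observed - expected) ** 2).sum() / variance.sum()
    return infit, outfit

obs = np.array([3.0, 2.0, 3.0, 1.0, 2.0])       # observed category scores
exp = np.array([2.6, 1.9, 2.8, 1.4, 2.1])       # model-expected scores
var = np.array([0.55, 0.70, 0.40, 0.80, 0.65])  # model variances
infit, outfit = mnsq(obs, exp, var)
print(f"Infit MNSQ = {infit:.2f}, Outfit MNSQ = {outfit:.2f}")
# Values between 0.6 and 1.4 are acceptable (Wright and Linacre 1994).
```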

Validation measures

Online learning readiness

To measure online learning readiness, we used the OLRS (Hung et al. 2010), which contains 18 items: Items 1–3 measure computer/internet self-efficacy, Items 4–8 measure self-directed learning, Items 9–11 measure learner control (in an online context), Items 12–15 measure motivation for learning (in an online context), and Items 16–18 measure online communication self-efficacy. The OLRS is answered on a 5-point Likert scale ranging from 1 (strongly disagree) to 5 (strongly agree), and each dimension is scored separately. In the present study, alpha reliabilities were 0.771 (computer/internet self-efficacy), 0.803 (self-directed learning), 0.734 (learner control), 0.730 (motivation for learning), and 0.715 (online communication self-efficacy). Using the criterion that an alpha of at least 0.70 represents good internal consistency (Cortina 1993), all aspects of the OLRS have good internal consistency.
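All alpha reliabilities reported in this paper follow the usual formula; a minimal implementation is sketched below for reference.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for one subscale.

    items: (n_persons, k_items) score matrix.
    alpha = k/(k-1) * (1 - sum(item variances) / variance(total score))."""
    k = items.shape[1]
    item_var = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_var / total_var)

# Example on simulated 5-point responses for a three-item subscale
rng = np.random.default_rng(1)
base = rng.integers(1, 6, size=(300, 1))
sub = np.clip(base + rng.integers(-1, 2, size=(300, 3)), 1, 5)
print(round(cronbach_alpha(sub), 3))
```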

Results

Item factor analysis

Model fit

Although we hypothesised a model in which the five aspects of the OCLEI questionnaire are correlated, we compared three models: a 1-factor model, a 5-factor model, and a 5-uncorrelated-factor model (with correlations between factors fixed at 0). The results of the comparison are presented in Table 2.

Table 2 Goodness-of-fit indices for three hypothetical models

As seen in Table 2, the 5-factor model fits very well, with RMSEA = 0.032 (< 0.050), SRMR = 0.041 (< 0.080), and CFI and TLI of 0.975 and 0.972, respectively (> 0.950). Although the chi-square (\(\chi^2\)) is significant, which taken alone would indicate misfit, it is reported only as descriptive information because this statistic is inflated by large samples such as ours (Wang and Wang 2020). The data therefore support the hypothesised model based on the five aspects developed during the item-writing process.

Item parameters and reliabilities

Further information about the item parameters for each subscale is presented in Table 3. All items had significant factor loadings (p < 0.001). The loadings range from 0.463 to 0.889 in a positive direction, which supports construct validity, with all items valid for measuring their intended dimension (Knekta et al. 2019). In addition, reliability is indicated by Cronbach's α values of 0.744 (access), 0.723 (interaction), 0.718 (lecturer support), 0.758 (equity), and 0.723 (investigation). Using the criterion that an alpha of at least 0.70 indicates good internal consistency (Cortina 1993), all aspects of the OCLEI have satisfactory internal consistency.

Table 3 Factor loadings from IFA and Cronbach’s alpha for five OCLEI factors

Multidimensional Rasch analysis

Item parameters, thresholds, and separation reliabilities

Table 4 provides an overview of the psychometric characteristics of the OCLEI, including item fit statistics and location parameters for all items. All item fit statistics fall within the acceptable range of Infit and Outfit MNSQ (0.6–1.4; Wright and Linacre 1994); therefore, all items fit the multidimensional Rasch rating scale model. Item locations lie in the range of –0.294 to 0.397 logits. Item 9, Saya berdiskusi secara daring dengan mahasiswa lain untuk menyelesaikan tugas ('I discuss online with other students to complete assignments'), located at –0.294 logits, is the easiest item on which to obtain a score of 4 (the easiest to endorse). In contrast, Item 17, Dosen tidak lupa menyampaikan salam kepada seluruh mahasiswa yang mengikuti perkuliahan secara daring ('The lecturer does not forget to greet all students attending the online lecture'), located at 0.397 logits, is the most difficult item on which to obtain a score of 4 (the most difficult to endorse).

Table 4 Rasch analysis results, reliabilities, and item wording

The analysis also showed that the item thresholds (\(\tau\)) rise from low to high (\(\tau_1 = -0.641\); \(\tau_2 = -0.446\); \(\tau_3 = 1.087\)). In addition, the Infit and Outfit MNSQ values for all response categories are in the acceptable range, meaning that the OCLEI's response categories function well. Person separation reliabilities, obtained in the form of PV reliabilities for each aspect, were 0.732 (access), 0.715 (interaction), 0.741 (lecturer support), 0.733 (equity), and 0.711 (investigation), all of which meet the acceptable criterion.
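To show how these thresholds govern response behaviour, the sketch below computes the four category probabilities under the rating scale model using the estimated thresholds, for a hypothetical item at location 0 logits.

```python
import numpy as np

TAU = np.array([-0.641, -0.446, 1.087])   # estimated thresholds (this study)

def category_probs(theta: float, delta: float = 0.0) -> np.ndarray:
    """P(X = 0..3) for one person-item pair under the rating scale model."""
    steps = theta - delta - TAU                        # theta - delta - tau_j
    num = np.exp(np.concatenate(([0.0], np.cumsum(steps))))
    return num / num.sum()

for theta in (-1.0, 0.0, 1.0):
    print(f"theta = {theta:+.1f}:", np.round(category_probs(theta), 3))
```

At higher trait levels, the probability mass shifts toward the top category, consistent with the ordered thresholds reported above.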

Wright map

Having presented the estimated item parameters, the levels of students' perceptions on each aspect and the locations of the items can now be compared simultaneously using the Wright map (Fig. 1).

Fig. 1 Wright map of the OCLEI (Dimension 1 = Access; Dimension 2 = Interaction; Dimension 3 = Lecturer Support; Dimension 4 = Equity; Dimension 5 = Investigation)

To complete the information in Fig. 1, the average person measure for each aspect is 0.380 logits (access), 0.185 logits (interaction), 0.403 logits (lecturer support), 0.580 logits (equity), and 0.466 logits (investigation). The mean of respondents' trait levels was thus consistently higher than the mean item difficulty (fixed at zero for each aspect). The equity aspect has the highest mean person trait level, while the interaction aspect has the lowest. In addition, for all aspects, the persons are spread more widely along the continuum than the item locations (which range from –0.294 to 0.397 logits). However, no items are aligned with the sample at the top of the trait range, indicating that the OCLEI might need additional, more difficult items for assessing individuals with high trait levels on each aspect.

Correlation between subscales: IFA and multidimensional Rasch

Given that the 5-factor model fit the data, the correlations between factors were estimated; correlations between the five aspects of the OCLEI are presented in Table 5. The IFA showed that all correlations between OCLEI factors are significant and positive. The highest correlation was between lecturer support and interaction (r = 0.348, p < 0.001); overall, the correlations ranged from 0.156 to 0.348 and thus tended to be low, although statistically significant. This also explains why the 5-uncorrelated-factor model did not fit the data: the factors correlate significantly with each other.

Table 5 Correlations among OCLEI scales

Based on the factor correlations in the lower triangle of Table 5, the 5-factor IFA model proved appropriate for describing the OCLEI measurement model: because the correlations between factors are only low to moderate, a higher-order IFA model, which would be indicated by high inter-factor correlations, was not needed. The upper triangle of Table 5 shows the factor correlations obtained from the multidimensional Rasch analysis. The two methods yield correlation solutions that are in line with each other. These findings confirm the theoretical framework used in developing the OCLEI.

Correlation with other scales: Convergent validity

In this study, the convergent validity of the OCLEI was tested against the OLRS. Table 6 contains the correlations between the aspects of the OCLEI and those of the OLRS, which measures related variables. The analysis supports the convergent validity of the OCLEI because all correlations were significant and positive. This correlation pattern supports the theoretical link between online learning environments and online learning readiness. The correlations ranged from 0.227 to 0.429, which can be considered moderate.

Table 6 Correlations between OCLEI scores and OLRS

Discussion and conclusion

Measurement instruments for online learning environments have been developed over the last two decades, since being pioneered by Trinidad et al. (2005). However, the online learning carried out during the COVID-19 pandemic in Indonesia is unique, especially in the implementation of online learning systems. Our research focused on the development of a short, reliable, and valid measure of online learning environments based on students' perspectives. The data supported the scale's latent structure and convergent validity, including empirical support for five dimensions of online learning environments (access, interaction, lecturer support, equity, and investigation). Because of the low-to-moderate correlations among the subscales, we advise researchers to consider using separate scores for each subscale. Regarding relations with other measures, the 25-item OCLEI showed the expected correlations with the five aspects of the OLRS, indicating good convergent validity.

Considering that a previous study revealed that students' positive perceptions contributed to their higher readiness for online learning (Wei and Chou 2020), we note that the OCLEI's theoretical framework is conceptually very close to the newly-developed OLPS used in that study. The positive correlations between the five aspects of the OCLEI and the five aspects of the OLRS are in line with that study's finding of a positive relation between the OLPS and OLRS (Wei and Chou 2020).

Data from the OCLEI were consistent with results from another survey about online learning conducted in Indonesia, in which access factors, including network or internet connection, were among the greatest challenges. Moreover, developing countries face many challenges in implementing online learning: poor ICT network infrastructure, a lack of high-quality content, and a lack of skills among those who interact with e-learning systems (Harto 2020). This study, however, did not include the internet network as part of the access factor, because it lies outside the control of the online higher-education provider (e.g., the university). In addition, findings about this internet access problem in online learning during the COVID-19 pandemic have not yet been formally published in journals or other scientific publications.

Instead of classical test theory methods, more-advanced techniques were applied to analyse and describe the OCLEI's psychometric properties. One of these techniques was IFA, used to evaluate the latent structure and provide evidence for the proposed 5-factor model of online classroom learning environments. The results suggest that the OCLEI is a valid assessment tool for researchers interested in exploring the effects of online learning environments within a latent variable modelling framework. These findings also support the multidimensional models of both online learning environments (e.g., Trinidad et al. 2005) and traditional classroom learning environments (e.g., Fraser et al. 1996) from previous studies.

Previous studies revealed that a combined Rasch–CFA approach provides better insight into scale performance problems than either method alone (Kelly et al. 2007). In our results, the multidimensional Rasch analysis also yielded several encouraging findings for the scale. First, even though the item measures fell within a narrow range, item fit was good. Second, the difficulty thresholds were ordered and fit the multidimensional Rasch rating scale model. Third, all correlations between factors were significantly positive, in line with the findings from the IFA.

The highest correlation between factors was between lecturer support and interaction, indicating that the interaction between lecturers and students in online lectures is related to perceived lecturer support. Previous studies found that lecturer–student communication is the most important factor in online lectures: lecturers actively involve students in class discussions, provide advice and input, and share experiences. This interaction process seems to have been understood by our respondents as lecturer support for students (Marks et al. 2005). In addition, we found that the equity aspect had the highest average person measure, which suggests that, although lectures are conducted online, students feel that lecturers can still treat all their students equally.

An important implication of our findings is that all five OCLEI aspects show fairly high means based on the Rasch person measures (trait levels). This suggests that the 'forced' implementation of online learning during the COVID-19 pandemic did not lead students to perceive the move from traditional to online classrooms as a problem, given that the mean of each aspect was greater than the mean of the Rasch item measures. Based on our findings, we recommend that online teaching continue even after the pandemic.

However, this research has several limitations. While other studies that developed online learning environment instruments also considered teachers (Trinidad et al. 2005), only students were sampled in this study. In fact, just as students had to adapt suddenly to online lectures, lecturers underwent the same experience; including teachers or lecturers as participants in future studies is likely to produce more complete information about the implementation of online learning in Indonesia. Another limitation is that certain aspects that differ between traditional and online classrooms (such as access) were investigated in previous studies (Yang et al. 2018) but were not addressed in this research; future researchers should investigate them further.