Introduction

Laboratory equipment in high schools varies in adequacy and is often insufficient for setting up physical lab (PL) experiments in which students can conduct inquiry investigations. In addition, the worldwide spread of e-learning courses and the distance-learning requirements of the COVID-19 pandemic raised new challenges for teaching science successfully. As a result, virtual labs (VL) have gained popularity, not only as an alternative to PLs but also as a way of widening the pedagogical features of laboratory work.

Modern pedagogical approaches are based on the idea that students should acquire knowledge through interactive processes that build on their previous experience, knowledge and skills. Moreover, to keep students interested and actively involved, the learning process should be meaningful to them. Using examples from everyday life and connecting them with school science gives students the opportunity to perceive knowledge as a useful tool for strengthening their personal development skills (Přinosilová et al., 2013). Inquiry learning in science teaching has traditionally been implemented in a physical lab hosted in schools. The primary asset of such a lab is that it gives students the tools to examine scientific phenomena through the active manipulation of physical materials and by doing scientific work in real environment settings (Jaakkola & Nurmi, 2008). However, some scientists argue that physical lab equipment can be replaced by virtual equipment (Klahr et al., 2005) and that tangible information is not necessary for conceptual understanding (De Jong et al., 2013) or even for developing experimentation skills (Lefkos et al., 2011).

Several benefits have been highlighted for the use of virtual labs in the educational process. VLs provide a secure and customisable environment where students can conduct experiments by changing variables and observing the outcome (Jaakkola & Nurmi, 2008). They can also present the virtual world at different scales, simplifying challenging real-world phenomena and thus making them easier to comprehend (Hsu & Thomas, 2002). For example, students can explore invisible processes that are important for a deeper understanding of phenomena, e.g. the flow of electrons (Hennessy et al., 2006), and they can focus on understanding the phenomenon rather than on manually constructing the experimental setup (Taramopoulos et al., 2012). Additionally, Hamed and Aljanazrah (2020) stated that students using VLs acquired a deeper understanding of physics concepts and were also well prepared for carrying out real experiments.

Nevertheless, some researchers have criticised the use of simulations and virtual labs, claiming that the simulation environment sometimes differs from the ‘authentic’ settings of the real world (Potkonjak et al., 2016) or that students lack the physical world experience which is considered to be essential for learning (Zacharia & Olympiou, 2011).

Related Work

There is a great deal of ongoing research on the most appropriate laboratory setting for teaching physics, focusing either on students’ learning outcomes or on their attitudes towards different laboratory environments. However, the research findings are contradictory.

Learning Outcomes

Numerous studies support the contribution of virtual labs. For example, according to Olympiou and Zacharia (2012), students who experimented in virtual optics labs achieved better conceptual improvement than those who experimented in physical optics labs. Similarly, undergraduate physics students at Bhopal University who studied the photoelectric effect in a virtual lab condition achieved higher learning outcomes than students in the physical lab condition (Bajpai, 2013). In another study, by Tekbıyık and Ercan (2015), high-school students attended three different courses on ‘Simple Electric Circuits’, experimenting in both virtual and physical labs. In two out of three courses, ‘controlling and changing variables that affect the brightness of the bulb’ and ‘recognising circuit elements’, students who experimented in the virtual lab gained more knowledge than students in the physical lab group, but no significant difference was found between the two groups in the third course, ‘forming closed circuits’. A similar study, conducted with undergraduate physics students at the University of Cyprus who experimented with electrical circuits in virtual or physical lab workshops, showed that the two groups were equivalent in their understanding of simple circuits, yet those experimenting in virtual laboratories had an advantage in the course on complex circuits (Zacharia & de Jong, 2014).

However, a study carried out with secondary school students in Taiwan on Boyle’s law indicated that students experimenting in physical labs had greater learning acquisition (Chen et al., 2014). Similarly, undergraduate students in Australia who were taught about electric energy storage had slightly better learning scores in the hands-on experiments than in the simulated ones. Moreover, Zacharia et al. (2012), while teaching a group of children (5 to 6 years of age) about the Beam Balance, found that children needed previous physical experience in order to be taught efficiently in a virtual lab.

Several researchers report no difference between virtual and physical labs regarding participants’ conceptual understanding. For example, Wiesner and Lan (2004) compared virtual and physical equipment on heat exchange, mass transfer and liquefaction with Chemical Engineering students. No difference was revealed in tests concerning the basic principles of the phenomena. In addition, Klahr et al. (2007) compared high-school students’ ability to design car models with either real materials or a simulation, and no statistically significant difference was identified between the two groups. Moreover, Triona and Klahr (2003) conducted a similar study in primary education regarding the behaviour of springs, and no difference between the two methods was detected. Zacharia and Constantinou (2008) reached the same conclusion in a student survey about heat and temperature, as did Taramopoulos et al. (2012) in a study on electrical circuits conducted with junior high-school students. In addition, a study carried out by Hawkins and Phelps (2013) in electrochemistry also indicated no significant difference between the two groups working in virtual and physical labs, respectively.

Finally, another study, conducted with middle school students, showed that virtual laboratory environments are as effective as hands-on laboratory environments (Kapici et al., 2019). However, when hands-on and virtual laboratories were used sequentially in combination, students showed higher knowledge acquisition and inquiry skills.

In general, there are various conclusions emerging from the above-mentioned studies, and thus, further research should be undertaken in the field of comparison. According to Ma and Nickerson (2006) the varied results are justified given the fact that comparisons are not often based on the same parameters.

Attitudes Towards Physical and Virtual Labs

Limited research has been conducted on students’ attitudes towards virtual and physical labs, especially for physics at high-school level.

A number of studies revealed no difference between the lab modes. For example, Tekbıyık and Ercan (2015) found no statistical difference in students’ attitudes towards electric circuit labs. The same was reported by Ratamun and Osman (2018) in their study of chemistry teaching with secondary school students, who evaluated the two methods similarly. Likewise, Hannel and Cuevas (2018) found no statistical difference in motivation for earth science in a study with high school students in Atlanta.

Nevertheless, there are certain studies that are in favour of VLs. University students with no prior experience in any relevant method, having been taught about vibration, evaluated the simulated lab as easy to use and helpful to improve their conceptualisation and interpretation of the phenomena (Minda et al., 2018). Similarly, in a study that lasted more than 2 years, in the 1st grade of a secondary school, students showed preference towards the virtual mode of experimentation, claiming that VLs had higher equipment usability and a higher degree of open-endedness (Pyatt & Sims, 2012). Moreover, according to Lalley et al. (2010), secondary school students having enrolled in year-long life science classes in a suburban high school showed higher levels of motivation in a VL.

However, in a study by Steger et al. (2017), based on feedback solicited on each laboratory session, students reported that they had acquired more new insights/comprehensions in the hands-on mode. Similarly, the results reported by Rochelle et al. (2000) indicated more active engagement and participation of students in the PL. Furthermore, in the study of Corter et al. (2011) with polytechnic students using three kinds of laboratory (hands-on, remotely-operated and simulation-based), students who worked in a PL rated it higher in terms of effectiveness (according to their personal experience), sense of commitment and overall satisfaction compared to the other methods.

Finally, some studies reveal mixed results. For example, Chen et al. (2014) conducted a study with students from a secondary school in Taiwan concerning Boyle’s law. In this study, students who worked in a VL were less satisfied than those who worked in a PL, indicating that physical manipulation may have increased the enjoyment of the laboratory. However, in the same study, students’ detachment from the experimental process was much higher in the PL. Another study with mixed results was conducted by Taher and Khan (2015) amongst undergraduate engineering students at Illinois University in the field of Electric Circuits. Most students reported that the VL was faster, simpler and easier, and that it promoted the articulation and design of complex circuits. However, they also claimed that the PL was more interesting and had a stronger connection to the real world.

It is evident that students’ attitudes towards these lab modes remain open to further research, especially since the existing literature is not extensive.

Present Study

The present study aims to contribute to the evaluation of the use of VL and PL in actual classroom settings for teaching basic physics subjects to high-school students. Based on the evidence mentioned in the introduction, the present study examines the attitudes and learning outcomes of students, comparing two modes of laboratory, physical and virtual. For this purpose, we have developed and implemented a series of learning scenarios in two different subject areas: Mechanics and Electric Circuits. These topics were purposely selected: the first is very closely related to physical manipulations, and setting up the experimental apparatus can be tiresome. In contrast, the second is more related to intangible concepts or invisible entities, and the experimental setup is quite straightforward.

Our motivation for conducting this study is twofold. Firstly, we intend to investigate whether the learning outcomes or attitudes are related to the lab modality of each selected topic. Additionally, we aim to examine the connections between the learning outcomes and students’ attitudes concerning the VL or PL, respectively. This way, we hope to shed more light on this varied field of reports concerning these lab modalities.

We have conducted two sets of experiments. First, we followed the experimental set-up of previous studies and examined whether there are any differences between VLs and PLs. Hence, we compared (a) the effect of the lab mode on the learning outcome, as measured through a pre- and post-learning quiz, and (b) the attitudes towards the two labs, as measured through a relevant questionnaire administered to students after they had experimented in both lab modes. Secondly, we compared all the above-mentioned subjects regarding Mechanics (Beam Balance and Pulleys) and Electricity (Simple Circuit and Voltage and Current Divider). Finally, we tested whether students who had high learning outcome scores showed equally positive attitudes regarding the corresponding lesson and lab modality, placing emphasis on the students’ (self-reported) favourite lesson.

Hypothesis

The aim of this study is to compare the two laboratory modes (virtual and physical) in order to answer the following questions:

  • Is there a difference in students’ learning acquisition of physics concepts when they are practising experimenting in virtual and physical labs? (H1)

    As discussed earlier, the findings about students’ learning acquisition in physical vs. virtual labs are contradictory. Therefore, it is necessary to examine whether VL or PL enhances students’ learning acquisition after experimenting in it. Our alternatives are (a) students having experimented in a VL will outperform those having experimented in a PL, (b) students having experimented in a PL will outperform those having experimented in a VL, (c) there will be no statistically significant differences between the modes and (d) there will be differences in the students’ learning acquisition based on the subject’s nature (Mechanics or Electricity).

  • What are students’ attitudes towards the two methods of experimentation? (H2)

    Students’ attitudes regarding the comparison of physical and virtual labs are also inconsistent in relevant research studies. In this study, the alternatives are (a) students will have more positive attitudes towards VLs, (b) students will have more positive attitudes towards PLs and (c) students’ attitudes will be formed based on the subjects and lab mode.

    Additionally, we aim to examine the following sub-questions in more detail:

  • Will students’ attitudes towards the lab environment affect their learning outcome? (H2.1)

    Although high-school students are considered a homogeneous sample, being at the same developmental stage and having a similar educational background, they have already formed their teaching style preferences. Studies have shown that when teachers present information using students’ preferred teaching style, students are able to connect with them sufficiently and acquire more knowledge from the lesson (Keller, 1987; Tanner & Allen, 2004). Thus, we expect that students will acquire more knowledge in their preferred lab condition.

  • Which lesson becomes their favourite? (H2.2)

    High-school students acquire more knowledge from their favourite science courses and, based on self-assessment, understand them better and are more likely to apply them in the future (Rachmatullah et al., 2017). Students’ intention to pursue a course as a future career is strongly correlated with their enjoyment of the learning procedure (Wang et al., 2021). Based on this, we expect students to prefer the lessons taught in their preferred lab mode, those given higher scores in the attitude questionnaire and those in which they achieved higher scores in the learning questionnaire.

Methodology

Participants

A total of 38 boys and girls in the 3rd grade of a high school (14–15 years old), following a standard Greek curriculum course, participated in the experiments. Students were divided into two groups, Group A and Group B: twenty students were randomly assigned to Group A and 18 to Group B. When Group A attended the course, Group B attended another course and vice versa. Because the research was conducted during standard classes, and in order to avoid the Hawthorne effect (participants changing their behaviour because they know they are being observed), students were fully informed about the purpose of the study only after finishing the lessons. They then had a 7-day time frame to provide their parents’ consent or to ask their teacher to delete their data completely and exclude them from the research results. Additionally, all students’ answers were anonymised and encoded.

Design

The topics were chosen based on students’ prior knowledge level (evaluated by their former physics teacher), their connection to the physical world (to attract the students’ interest), the ease of conducting the experiments given the available school lab equipment, and the availability of an easy-to-use VL (Fig. 2) with features and devices similar to the PL equipment. Moreover, in view of the contradictory findings of previous studies, we chose to replicate two different sets of courses: one about Electricity (Tekbıyık & Ercan, 2015; Zacharia & de Jong, 2014) and one about Mechanics (Steger et al., 2017). Each set of courses included two different sub-courses (lessons) with different content. Concerning Electricity, one lesson was about the ‘Simple Circuit’ (lamps in series or in parallel) and one about the ‘Voltage and Current Divider’. Similarly, concerning Mechanics, one lesson was about the ‘Beam Balance’ and the other about ‘Pulleys’. Each lesson lasted 2 h, 1 h per week, as proposed in the standard weekly schedule regulated by the Ministry of Education. The engagement of the student groups is shown in Fig. 1.

Fig. 1 Student groups switching from virtual to physical labs and vice versa, through topics

The educational scenarios used for the implementation of the courses were designed on the online platform Graasp (www.graasp.eu). This platform provides teachers with the ability to create original scenarios from scratch and also offers clear instructions to students on how to use the application. Additionally, this platform gives access to micro-apps to be used as ‘scaffolding apps’, supporting students during their inquiries.

The most important feature concerning our research is that this platform supports not only the creation of scenarios by teachers but also their implementation, guiding students during the entire learning process (like a set of digital worksheets). Thus, students use the platform and the apps to compose their hypotheses, conduct their experiments, record their data in tables, create their graphs, draw their conclusions and reflect on the outcomes. Moreover, all students’ choices and everything they write down (like data on a table or a conclusion) are stored in the platform, providing teachers with the opportunity to assess their learning and give proper feedback.

In our case, only the experimentation phase was different in the two groups, i.e. one group was conducting virtual experiments (on the platform), while the other group was conducting physical experiments (still using the platform for all other phases of their inquiry) (Fig. 2). In other words, in terms of research conditions, we were able to change solely the modality of the laboratory, controlling all other factors.

Fig. 2 Virtual labs of (A) Electric Circuits and (B) Beam Balance and the corresponding physical labs (C, D)

Inquiry-Based Learning Approach

The approach used in the experimental process for both the physical and the virtual experiments is based on the ‘Inquiry Cycle’, as proposed by Pedaste et al. (2015). The Inquiry Cycle presented in Fig. 3 shows a pedagogical approach following the principle of Inquiry Learning (de Jong & Lazonder, 2014). According to that process, the information is not given directly by the teacher to the students, but the research process is guided through a research question or hypothesis. Additionally, it requires interpretation of the results and conclusions as well as discussion of the results. The method of guided inquiry learning has been applied in the current study based on research findings which indicate that inquiry is more effective when combined with guidance (Eysink et al., 2009; Linn et al., 2006; Plass et al., 2012). The advantage of this method is ‘self-regulated learning’, in which students become responsible for their own educational process and deal with difficulties and problems on their own (Zacharia et al., 2015).

Fig. 3 The Inquiry Cycle that was applied (Pedaste et al., 2015)

Procedure

The experimental design was between participants. The same educational scenarios were used for both labs. One group (A) of students first engaged in the VL (Mechanics) and then in the PL (Electricity), while the other group (B) first engaged in the PL (Mechanics) and afterwards in the VL (Electricity). Apart from changing the lab mode, they followed the same procedure. The students were working in groups of 2–3, following the recommendations of Corter et al. (2011), i.e. two students working together, sharing the same equipment or PC depending on the lab.

At the beginning of each lesson, the teacher directed the students to the corresponding lab according to the condition (PC room for the VL or the school lab for the PL) and gave a brief introduction (orientation phase) to the topic to be studied and the method to be used (virtual or physical). After this, students were asked to complete the pre-test before starting the lesson. The teacher facilitated the whole procedure, providing the theoretical background of each topic, and was on hand to answer any questions. During the second week, students engaged in the second part of each lesson and, after finishing it, completed the post-test learning outcome questionnaire for that lesson. The same procedure was followed for the second lesson of the subject (Mechanics or Electricity). After finishing each subject, the groups switched lab modes (Fig. 1). On the day of the final lesson, they also completed the attitude questionnaire, evaluating their experience with each subject and lab method. All questionnaires were filled in individually.

Research Tools

Learning Outcome Questionnaire

The learning outcome was estimated through a knowledge quiz given to the students before attending each course (pre-test) and after finishing each lesson (post-test). Students’ improvement was measured quantitatively by statistically comparing their scores in the pre- and post-test. To assess the pre-existing level of knowledge and comprehension of a subject, the pre-test was given after the orientation phase, when students had had their first contact with the subject but did not yet know any details or how the terms used were linked to one another. At the end of the lesson (i.e. the end of the second week for each lesson), a post-test was given, containing the same questions along with some more difficult and complicated ones, as shown in Fig. 4. The questionnaires consisted of research questions designed to focus on the main points of the subjects under study, and similar activities were carried out during the experimentation phases in either VL or PL mode (as depicted in Fig. 2b, d). There were nine questions in total. Content validity was ensured through the agreement of the first author and two experts in the field of Mechanics and Electricity, who selected the most appropriate questions from an initial set and suggested further changes to the wording or figures, as appropriate.

Fig. 4 Example questions for Beam Balance: top left, a question included only in the pre-test; bottom left, a more difficult question included only in the post-test; right, a question included in both the pre- and post-test

The grading scheme was the same as for a school quiz. The maximum score was 100%. Each question was assigned a different maximum score depending on its level of difficulty or the extent of a fully correct answer. Students were informed that the quiz would not count towards their formal course evaluation and would be used only as feedback to the teacher. The teacher was also responsible for grading the quiz without knowing the student or the lab condition (encoded questionnaires).

The reliability of the questionnaire was tested with Cronbach’s alpha, based on the total score achieved by each student on each subject. In the post-test for the four subjects of the survey, the index had values of 0.876, 0.724, 0.796 and 0.709, respectively. In the pre-tests, however, it was less than 0.7. This is probably because students had incomplete knowledge or vague perceptions of the subjects, so their responses were not consistent.
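As an illustration of the reliability index, Cronbach’s alpha can be computed from a table of per-item scores with the standard formula alpha = k/(k − 1) · (1 − sum of item variances / variance of total scores), where k is the number of items. The following is a minimal sketch using only the Python standard library. The function name and the toy scores are assumptions made for illustration; they are not the study’s data or its analysis script.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of rows (one row per student, one column per item)."""
    k = len(scores[0])
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two students")
    # Sample variance of each item (column) across students.
    item_variance_sum = sum(variance(column) for column in zip(*scores))
    # Sample variance of each student's total score.
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical scores of three students on two quiz items (illustration only).
toy_scores = [[1, 2], [2, 1], [3, 3]]
alpha = cronbach_alpha(toy_scores)
```

Values of alpha close to 1 indicate that the items vary together; the 0.7 threshold mentioned in the text is the usual rule of thumb for acceptable reliability.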

Attitudes’ Questionnaire

According to Liaw (2008), perceived satisfaction enhances learners’ perceptions of technology and promotes their participation in the learning processes. As no relevant questionnaire for measuring students’ attitudes towards physics courses in virtual and real labs was found in the literature, a scientific group consisting of high-school teachers and university-level experts in the field of Mechanics and Electricity or Science Education created the questions (items) of the Attitudes’ Questionnaire. In doing so, the group took into consideration (a) appropriate variations of the Technology Acceptance Model (TAM) questionnaire, such as the model of acceptance-adoption of technology, and (b) questionnaires from previous, similar studies. TAM has been widely used in technology acceptance research and has been the basis for developing generalised models regarding IoT and information systems. It is a widely accepted basic model for evaluating the ‘Perceived Usefulness’ and ‘Perceived Ease of Use’ that shape students’ acceptance of technological tools (Davis, 1989). For example, Lemay et al. (2018) used TAM to model college-level students’ perceptions of applied Simulation-based Learning techniques. Additionally, in relevant studies where students evaluated virtual environments, Taher and Khan (2015) focused on students’ opinions about the ease of implementation, Corter et al. (2011) on their opinions about conceptual understanding, which is also closely related to their opinion about effectiveness, and Steger et al. (2017) on solicited feedback. Corter et al. (2011) and Chen et al. (2014) found a correlation between students’ opinions about how interesting a technology was and their satisfaction level, while Estriegana et al. (2019) applied a variation of TAM examining the variables of efficiency, playfulness and satisfaction.

The questionnaire was given to students only once, after finishing all taught lessons, switching from virtual to physical labs and vice versa. It was divided into two parts.

The first part consisted of four groups of questions, each group referring to one of the lessons taught. The questionnaire focused on three major parameters, exploring students’ attitudes towards the virtual and physical labs, concerning (a) the ease of implementation, (b) the understanding of the examined subject and (c) the interest caused by each process. For each lesson, there were three Likert scale questions (from 1, minimum, to 4, maximum) evaluating the taught subject. For example, for the Voltage and Current Divider lesson, the three questions were: (a) Connecting the resistors seemed easy and understandable to me, (b) The experimental setup helped me understand how the Voltage and Current Divider works and (c) I found the experimental setup interesting. The total score was calculated for each student, each subject and each lab modality.

In order to test the questionnaire’s validity, we applied Lawshe’s content validity methodology, based on subject matter expert (SME) ratings, with the assistance of the five experts mentioned above. The Content Validity Ratio (CVR critical) of the attitudes questionnaire was 0.99, two-tailed p = 0.002, in line with the accepted value for five experts at a 0.05 significance level (Wilson et al., 2012). Moreover, Cronbach’s alpha was applied to the total score, as calculated for each student, to examine the reliability of each group of questions (a), (b) and (c) for the four subjects (Beam Balance, Pulleys, Lamp Connection, Voltage and Current Divider) per lab environment, as presented in Table 1. Cronbach’s alpha is below 0.7 in two cases, which reduces the reliability of students’ answers in a VL in the Beam Balance lesson.
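For reference, Lawshe’s Content Validity Ratio for a single item is CVR = (n_e − N/2) / (N/2), where n_e is the number of experts who rate the item as essential and N is the size of the panel. The sketch below shows this calculation for a panel of five; the function name and the expert counts are hypothetical and serve only to illustrate the formula, not to reproduce the ratings collected in this study.

```python
def content_validity_ratio(n_essential, n_experts):
    """Lawshe's CVR for one item: (n_e - N/2) / (N/2)."""
    if n_experts <= 0 or not 0 <= n_essential <= n_experts:
        raise ValueError("n_essential must lie between 0 and n_experts")
    half_panel = n_experts / 2
    return (n_essential - half_panel) / half_panel


# Hypothetical ratings from a five-expert panel (illustration only).
cvr_all_agree = content_validity_ratio(5, 5)   # every expert rates the item essential
cvr_four_of_five = content_validity_ratio(4, 5)
```

The ratio ranges from −1 (no expert rates the item essential) to 1 (all experts do), so with a small panel an item typically needs near-unanimous agreement to reach the critical value.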

Table 1 Cronbach’s alpha of students’ attitudes questionnaire

In the second part, students were asked to report which of the taught lessons (Beam Balance, Pulleys, Lamp Connection, Voltage and Current Divider) was the most interesting and which was the least interesting according to their own opinion.

Data Analysis

Students’ learning outcome was examined by calculating the Mean Value (MV) and Standard Deviation (SD) of the total number of correct answers per lab (VL or PL) and per lesson (4 lessons) for both pre- and post-test scores. We also calculated the percentage of students’ improvement in each subject after both VL and PL. Additionally, after applying a normality test, we conducted a t-test on the students’ scores before and after the lab exercises for Mechanics (Beam Balance and Pulleys) and a Mann–Whitney U-test for Electricity (Lamp Connection and Voltage and Current Divider), addressing H1. Only data from students who completed both the pre- and the post-test were used; absent students were excluded from the data set of that specific lesson. This led to small variations in the number of participants across some of the tests.
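To make the pre/post comparison concrete, the sketch below computes the statistic of a paired (dependent-samples) t-test. That the t-test was paired is an assumption: the text does not name the variant, although matching each student’s pre- and post-test is consistent with excluding students who missed either test. The function name and the scores are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(pre, post):
    """Return (t, degrees of freedom) for matched pre/post scores."""
    if len(pre) != len(post) or len(pre) < 2:
        raise ValueError("need at least two matched pre/post pairs")
    # Differences are taken as pre - post, so an improvement yields a negative t.
    diffs = [a - b for a, b in zip(pre, post)]
    sd = stdev(diffs)
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    n = len(diffs)
    return mean(diffs) / (sd / sqrt(n)), n - 1


# Hypothetical percentage scores of four students (not the study's data).
pre_scores = [40, 55, 60, 35]
post_scores = [50, 75, 80, 45]
t_value, df = paired_t_statistic(pre_scores, post_scores)
```

The p-value would then be read from the t distribution with n − 1 degrees of freedom, which a statistics package normally handles.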

For students’ attitudes towards the different labs, we calculated the MV and SD for each question per lab and per lesson. In order to estimate whether the observed differences were statistically significant, a normality test was first performed using the Shapiro–Wilk criterion, because the sample size was N < 50. The observed level of statistical significance (p) was, in almost all cases, less than 0.005, so the data were not normally distributed. Hence, the Mann–Whitney non-parametric test was applied. The exception was the question ‘I found the use of the Beam Balance simulation/experimental procedure interesting’, where p = 0.05 for students who experimented in a VL and p = 0.29 > 0.05 for students who experimented in a PL; here the data were normally distributed, so a t-test was applied, examining H2. As mentioned above, each lesson lasted two hours, and the answers of students who had participated in at least one of the two hours were included in the analysis.
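The Mann–Whitney procedure can be sketched as follows: pool the two samples, rank them (tied values share the average rank), and derive U from the rank sum of one sample. The sketch also returns the normal-approximation Z and the effect size r = Z/√N. It is a simplified illustration that applies no tie or continuity correction, and it is an assumption that the r values reported in the Results were obtained in this way. All names and data are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U, Z, r) for two independent samples x and y."""
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both samples must be non-empty")
    ranks = average_ranks(list(x) + list(y))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    u2 = n1 * n2 - u1
    # Normal approximation without tie or continuity correction.
    z = (u1 - n1 * n2 / 2) / sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return min(u1, u2), z, z / sqrt(n1 + n2)


# Hypothetical 4-point Likert answers from two labs (illustration only).
u, z, r = mann_whitney_u([2, 3, 3, 4], [1, 2, 2, 3])
```

With many tied Likert answers, a tie-corrected variance (as implemented in standard statistics packages) would be the safer choice for the actual analysis.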

For testing H2.1, the Pearson r correlation test was used to identify if students with higher scores appreciated the corresponding experimental condition (VL or PL).
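As a minimal sketch of this step, Pearson’s r between post-test scores and attitude scores can be computed as the covariance of the two variables divided by the product of their standard deviations. The function name and the paired values below are hypothetical and are not taken from the study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Hypothetical post-test scores (%) and attitude totals (3 to 12) for five students.
post_test = [55, 70, 80, 65, 90]
attitude_total = [7, 9, 10, 8, 12]
r_value = pearson_r(post_test, attitude_total)
```

A value of r near +1 would indicate that students with higher learning scores also rated the corresponding lab condition more favourably.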

In order to test H2.2, how the condition affected students’ preferences about the taught lessons, a logistic regression analysis was performed to assess the ability of a series of predictor variables: (a) the lab mode where students experimented in (VL or PL), (b) their attitudes’ scores, based on the attitudes questionnaire and (c) knowledge acquisition, based on the post-test score to predict students’ lesson preference (categorical variable) after finishing the whole experimental procedure (López et al., 2015). Additionally, a one-way ANOVA test was performed to find out the effect of students’ favourite lesson on their learning outcome (Rachmatullah et al., 2017).
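The one-way ANOVA mentioned above compares the variability between groups with the variability within groups. The sketch below computes the F statistic and its degrees of freedom for learning scores grouped by favourite lesson; the grouping, the function name and the numbers are assumptions made for illustration, and the logistic regression is not covered here.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of score groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and more scores than groups")
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variability of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variability of the scores around their own group mean.
    ss_within = sum(sum((v - mean(g)) ** 2 for v in g) for g in groups)
    if ss_within == 0:
        raise ValueError("no within-group variability; F is undefined")
    df_between, df_within = k - 1, n_total - k
    return (ss_between / df_between) / (ss_within / df_within), df_between, df_within


# Hypothetical post-test scores grouped by self-reported favourite lesson.
scores_by_favourite = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
f_value, df_b, df_w = one_way_anova_f(scores_by_favourite)
```

The resulting F would be compared against the F distribution with (df_between, df_within) degrees of freedom to obtain a p-value.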

Finally, we performed an effect size analysis to specify the study’s sample power (López et al., 2015). We applied Hedges’ g, due to the different sample sizes between the conditions. Results showed that despite the fact that there was a limited number of students, the sample size was sufficient to draw conclusions (Hedges’ g = 0.86).
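Hedges’ g is Cohen’s d computed with the pooled standard deviation and multiplied by a small-sample correction factor, commonly approximated as J = 1 − 3 / (4(n1 + n2) − 9). The sketch below illustrates this calculation for two groups of unequal size; the function name and the scores are hypothetical, and the reported value of 0.86 is not reproduced here.

```python
from math import sqrt
from statistics import mean, variance


def hedges_g(x, y):
    """Hedges' g for two independent samples, using the approximate correction J."""
    n1, n2 = len(x), len(y)
    if n1 < 2 or n2 < 2:
        raise ValueError("each sample needs at least two observations")
    # Pooled standard deviation weights each sample variance by its degrees of freedom.
    pooled_sd = sqrt(((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / (n1 + n2 - 2))
    if pooled_sd == 0:
        raise ValueError("no variability in either sample; g is undefined")
    cohen_d = (mean(x) - mean(y)) / pooled_sd
    correction = 1 - 3 / (4 * (n1 + n2) - 9)
    return cohen_d * correction


# Hypothetical post-test scores for a VL group and a smaller PL group.
vl_scores = [70, 80, 90, 60, 75]
pl_scores = [55, 65, 75, 60]
g_value = hedges_g(vl_scores, pl_scores)
```

By the usual convention, values of g around 0.8 or above are read as a large effect.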

Results

Learning Outcome

Pre vs. Post: Starting with the Mechanics lessons, students having experimented in the VL regarding the Beam Balance lesson improved their learning scores in the post-test in comparison with the pre-test; however, the difference was not statistically significant as indicated by the t-test (t(18) = −0.94, p = 0.35, d = 8.33). Similarly, for the PL, students improved their learning scores in the post-test as compared to the pre-test, with t(13) = −0.85, p = 0.403, d = 10.46. In the Pulleys lesson, students in the VL improved their scores significantly in the post-test compared to the pre-test as indicated by the t-test (t(19) = −5.35, p < 0.001, d = 33.69). Similarly, concerning students in the PL, there was a statistically significant improvement between the pre- and post-test, at t(17) = −3.48, p = 0.001, d = 28.76 (Table 2, Fig. 5).

Table 2 Pre-post comparison for the learning outcome in Mechanics: descriptive statistics

Fig. 5 Mechanics lessons (Beam Balance and Pulleys): comparison between the pre- and post-test per lab (VL and PL) and comparison of the post-test scores between VL and PL

VL vs. PL: Concerning students’ improvement between the VL and PL regarding the Beam Balance lesson, no statistically significant difference was found (t(13) = −0.67, p = 0.51, d = 8.17). Similarly, concerning the Pulleys lesson, although students had higher scores in the PL, there was no statistically significant difference, as indicated by t(17) = −1.34, p = 0.188, d = 12.6.

Pre vs. Post: Regarding the Electricity lessons, students who experimented in the VL for the Lamp Connection lesson improved their scores in the post-test compared to the pre-test (U = 60, p = 0.011, r = −0.635). In the PL, students had comparable scores in the pre- and post-test, as shown by the U-test (U = 124, p = 0.896, r = 0.032). Concerning the Voltage and Current Divider, in the VL there was a statistically significant improvement of the students’ scores in the post-test in comparison with the pre-test, at U = 64.5, p = 0.048, r = −0.5. Finally, students who experimented in the PL also improved their scores in the post-test compared to the pre-test, but the difference was not statistically significant (U = 108.5, p = 0.471, r = −0.18) (Table 3, Fig. 6).

Table 3 Pre-post comparison for the learning outcome in Electricity: descriptive statistics

Fig. 6 Electricity lessons (Lamp Connection and Voltage and Current Divider): comparison between the pre- and post-test per lab (VL and PL) and comparison of the post-test scores between VL and PL

VL vs. PL: When comparing students’ improvement between the VL and PL regarding the Lamp Connection lesson, there was a statistically significant difference between the students’ scores after completing the lesson, in favour of the VL, as indicated by U = 64, p = 0.017, r = 0.598. Similarly, students’ scores in the Voltage and Current Divider lesson were significantly higher in the VL in comparison with the PL post-test, at U = 67, p = 0.037, r = −0.52. Figures 5 and 6 graphically represent the above-mentioned analysis.
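For readers unfamiliar with the U-test statistics reported above, the following is a minimal sketch of a Mann-Whitney U computation with an accompanying effect size r. It assumes the common convention r = Z/√N, with Z taken from the normal approximation without tie or continuity correction; the text does not state how r was derived, so this convention is an assumption. The function name and the scores are hypothetical.

```python
import math


def mann_whitney_u(sample_x, sample_y):
    """Return (U, z, r) for two independent samples via the normal approximation."""
    n_x, n_y = len(sample_x), len(sample_y)
    combined = sorted([(v, 0) for v in sample_x] + [(v, 1) for v in sample_y])
    # Assign 1-based ranks, giving tied values the average of their ranks.
    ranks = [0.0] * len(combined)
    i = 0
    while i < len(combined):
        j = i
        while j + 1 < len(combined) and combined[j + 1][0] == combined[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    rank_sum_x = sum(rank for rank, (_, group) in zip(ranks, combined) if group == 0)
    u_x = rank_sum_x - n_x * (n_x + 1) / 2
    u = min(u_x, n_x * n_y - u_x)  # conventionally the smaller U is reported
    # Normal approximation, no tie or continuity correction (a simplification).
    mu = n_x * n_y / 2
    sigma = math.sqrt(n_x * n_y * (n_x + n_y + 1) / 12)
    z = (u_x - mu) / sigma
    r = z / math.sqrt(n_x + n_y)  # assumed effect size convention
    return u, z, r


# Hypothetical post-test scores, NOT data from the study.
pl_post = [40, 45, 50]
vl_post = [55, 60, 65]
u, z, r = mann_whitney_u(pl_post, vl_post)
print(u, round(r, 3))
```

In this sketch the sign of r depends only on which sample is passed first, so its magnitude is the quantity to interpret.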

By calculating the average improvement percentage per lesson and lab mode, students in the Beam Balance lesson improved by 8.33% in the VL and 10.46% in the PL. Students in the Pulleys lesson improved their learning scores by 33.68% after experimenting with the VL and by 28.76% after experimenting with the PL. After the Lamp Connection lesson, students in the VL improved their scores by 18.25%, while in the PL they did not improve; on the contrary, their scores dropped slightly, by 2.75%. Finally, after the Voltage and Current Divider lesson, students improved in both lab conditions, by 12.13% in the VL and 6.75% in the PL.

The above-mentioned results might have been affected by the initial differences between the two groups of students compared; thus, pre-test scores could be examined as a covariate. This constitutes one of the limitations of our study.

Students’ Attitudes Towards Virtual/Physical Lab

As reported above, students’ attitudes towards the two modes of the laboratory were examined according to their opinion about the ease of implementation, the understanding of each lesson and the interest caused by each lab mode. Results showed that students similarly evaluated both lab modes without favouring one. The results are presented in Figs. 7, 8 and 9.

Fig. 7 Ease of implementation as reported by the students

Fig. 8 Concept comprehension as reported by the students

Fig. 9 Interest as reported by the students

Regarding the ease of implementation, students evaluated the VL for Mechanics more positively than the PL. However, as indicated by the U-test, the difference was not statistically significant (U = 103, p = 0.242, r = 0.29 for Beam Balance and U = 102, p = 0.226, r = 0.3 for Pulleys). Students also evaluated the VL slightly more positively for Lamp Connection (U = 95.5, p = 0.15, r = −0.36), while for the Voltage and Current Divider they evaluated the PL as slightly easier than the VL (U = 122, p = 0.624, r = 0.122). In neither case was the difference statistically significant.

Concerning conceptual comprehension, students evaluated the VL and PL similarly for the Mechanics lessons (U = 134, p = 0.968, r = 0.009 for Beam Balance and U = 127.5, p = 0.772, r = 0.072 for Pulleys). Regarding the Lamp Connection, students reported a better understanding in the VL (U = 113, p = 0.418, r = −0.2). This was not surprising, since in the VL students were able to observe the flow of electrons in the circuit. On the contrary, students claimed that they understood the concept of the Voltage and Current Divider better in the PL than in the VL (U = 104.5, p = 0.263, r = 0.28). The students’ former teacher reported that, in elementary school, the students had already been involved in a PL experiment within a similar context but with completely different learning goals. Thus, it seems that students were more familiar with a Voltage and Current Divider experiment in the PL. Nevertheless, no statistically significant difference was found in any of the above cases.

Regarding the interest triggered by the different labs, students evaluated both as interesting: as shown in Fig. 9, the average value for each lesson was around 3 on a 4-point Likert scale. More specifically, for Beam Balance the t-test gave t(16) = 0.526, p = 0.602, d = 0.19. For the remaining lessons, the U-test gave U = 108.5, p = 0.332, r = 0.24 for Pulleys; U = 120, p = 0.575, r = 0.14 for Lamp Connection; and U = 135, p = 0.984, r = −0.005 for Voltage and Current Divider.

In general, no statistically significant differences were found between students’ attitudes towards the different lab modes. Therefore, students’ attitudes should not be the primary criterion for a teacher’s decision about which lab mode to follow; additional factors need to be taken into account. Furthermore, as our study design prompted students to switch lab modalities after each topic, each group might have used different criteria to evaluate each subject. This issue is clearly one of our study’s limitations.

It was expected that students would gain more knowledge from the most positively evaluated lab environment (H2.1). However, students evaluated the two labs equally. The lack of preference concerning the lab environments was also indicated by the Pearson r correlation between the knowledge acquisition scores and the attitude scores. Regarding the VL, r = 0.129, p = 0.309, while in the PL, r = −0.095, p = 0.469.
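The correlation reported above can be illustrated with a minimal sketch of the Pearson r coefficient between knowledge acquisition scores and attitude scores. The function name and the paired values are hypothetical and serve only to show the calculation; they are not the study’s data.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares of the deviations from the means.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical paired scores per student, NOT data from the study.
knowledge_scores = [50, 60, 70, 80]
attitude_scores = [3, 2, 4, 3]
r = pearson_r(knowledge_scores, attitude_scores)
print(round(r, 3))
```

Values of r close to zero, such as those reported for both lab modes, indicate little or no linear association between the two scores.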

A one-way ANOVA test was performed to find out the effect of students’ learning and attitude scores on their chosen favourite science lesson (H2.2). Based on the findings, no significant effect was found (F[2, 27] = 0.753, p = 0.482). Regarding the role of knowledge and attitudes in the students’ choice of favourite lesson, the regression analysis model showed that R = −0.019, indicating no relation between the variables. The coefficient of determination, expressed by the r² value, was 0.059, meaning that learning scores and attitudes towards modality (VL or PL) explained less than 1% of the variation in the total value of students’ favourite lesson (dependent variable), as shown in Table 4.
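To make the F statistic above more concrete, the following is a minimal sketch of a one-way ANOVA. The reported degrees of freedom, F[2, 27], are what such a test yields for three groups and 30 observations in total; the sketch uses far fewer, hypothetical scores. The function name, the grouping and the values are assumptions for illustration only.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Variation of the individual scores around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical post-test scores grouped by favourite lesson, NOT study data.
scores_by_favourite = [[55, 60, 65], [60, 65, 70], [65, 70, 75]]
f_stat, df_between, df_within = one_way_anova(scores_by_favourite)
print(f_stat, df_between, df_within)
```

The p value would then be read from the F distribution with the two degrees of freedom, which is left out here to keep the sketch free of external libraries.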

Table 4 Model summary

Taken together, the simulations seemed to prevail slightly in terms of students’ attitudes. However, this superiority was only marginal and may not be the main criterion for the selection of the appropriate method by teachers, since there are additional factors that have to be taken into account.

Discussion and Conclusions

In this work, we first examined the difference, if any, in students’ acquisition of physics concepts when they practise in a virtual or in a physical laboratory. The results indicate that there was no statistically significant difference in the improvement between the two groups of students working in these two modes.

Our findings are consistent with a number of previous studies (e.g. Hawkins & Phelps, 2013; Klahr et al., 2007; Taramopoulos et al., 2012; Triona & Klahr, 2003; Wiesner & Lan, 2004; Zacharia & Constantinou, 2008) and further support the view that the cognitive results of the two lab modes are equivalent. Therefore, VLs can be used interchangeably with PLs, regarding the conceptual understanding. Broadly speaking, in terms of score percentages, PLs seem to be advantageous for Mechanics subjects. This might be indicative of the role of the embodied learning (Kelan, 2011) that is more prominent in these experiments. On the other hand, VLs seem to be advantageous for Electricity subjects. This might be due to the fact that students were able to visualise invisible elements (Finkelstein et al., 2005; Hennessy et al., 2006) or that they focused more on the concepts (Chen et al., 2014).

These above-mentioned factors play a vital role and have to be taken into consideration when teachers plan inquiry investigations and take decisions about the lab mode that they are going to use. Their decision-making might be determined by the type of subject and also their learning goals, i.e. if they would like to be oriented towards students’ manipulating physical objects or the deeper understanding of concepts. This could also be related to the age of the students, since the need for hands-on experience seems more prominent for younger students.

Additionally, we analysed students’ attitudes towards the two modes of laboratory in terms of perceived interest, ease of implementation and concept comprehension. In this case, our results indicate that students’ interest was practically the same in both lab modes, which could be related to similar research findings (e.g. Ratamun & Osman, 2018; Tekbıyık & Ercan, 2015).

Generally speaking, in terms of score percentages, VLs were considered to be slightly more interesting in the Mechanics topics, while PLs were in the Electricity topics. At the same time, in three out of four topics VLs were considered to be easier to implement and better at fostering comprehension; the exception was the topic of Voltage and Current Divider, where students’ views were reversed in favour of the PL. This exception probably occurred because this was the only topic in which students reportedly had previous PL experience, which is considered to be an important factor (Zacharia et al., 2012).

Thus, considering the similarity of results in students’ attitudes, teachers should not have any doubts about using a VL or a PL. Furthermore, prior experience with a certain lab mode might be a point of consideration, when it comes to deciding the mode to be used for their inquiry activities.

To sum up, following a twofold approach (not so common in this field, e.g. Husnaini & Chen, 2019), our study aimed to contribute to the discussion about the use of VL and/or PL in educational settings, by investigating not only the learning outcomes but also the attitudes of the students concerning these two lab modalities.

Our results demonstrate that both VLs and PLs can be equally beneficial for teaching both Mechanics and Electricity subjects and that our students’ attitudes towards the two modalities are similar. These results should be critically evaluated within the perspective and limitations of our study; nevertheless, according to Conlin et al. (2019), the non-existence of statistically significant differences may be useful for teaching and bring about fruitful results.

Sharing the same view, we believe that this finding is important for whoever designs inquiry-based activities, because it justifies the claim that both lab modes can be used interchangeably. Moreover, the decision about using VL or PL might be taken based on a number of factors as reported above.

This means that teachers who face a lack of physical equipment, have huge university classes or give distance education courses might consider using VLs, taking into account that they might be constructive in terms of conceptual understanding and also favourable for their students. Additional factors that have to be taken into consideration include the following: the specific topic of interest, the age or previous experience of students and the learning goals.