1 Introduction

Progressive educators have advocated the use of project-based, student-centered pedagogies for decades (Freire 1970; Papert 1980, 2000; Piaget 1964; Ultanir 2012), and recently this work has received renewed attention with the explosive growth of the “maker” movement (Blikstein and Krannich 2013; Blikstein 2013), FabLabs (Posch and Fitzpatrick 2012; Stacey 2014), and low-cost technologies for rapid prototyping (e.g., Sipitakiat et al. 2004). As a result, the construction spaces and materials available for children are much richer and more sophisticated, allowing them not only to build more complex projects, but also to venture into new fields of knowledge previously only available to experts, such as microbiology, emergent systems, computer programming, and robotics.

The use of new technologies for project-based learning has sparked a lively debate among educators about their effectiveness and integration with the traditional classroom curriculum. Researchers have therefore been seeking empirical evidence to determine how effective different pedagogical approaches are, such as explicit teaching versus discovery learning, technology-based versus traditional media, and “tell-and-practice” versus hands-on approaches (see Bishop and Verleger 2013; Schneider et al. 2013). More specifically, some educational approaches may facilitate the use of exploratory strategies by students while others may induce an exploitation perspective. The exploration and exploitation dilemma (defined below) was proposed by March (1991) as an adaptive behavioral mechanism in organizational learning, and others have recently associated it with cognitive processes such as creativity (Chae et al. 2013; Chae and Lee 2011) and attentional engagement mediated by neuromodulatory mechanisms and arousal activation in problem-solving tasks (Laureiro-Martínez et al. 2010; Jepma and Nieuwenhuis 2011).

Exploration is characterized by subjects experimenting with the affordances of the environment in a playful and flexible way (March, 1991), requiring an extensive search through the repertory of possible behaviors (Laureiro-Martínez et al. 2010). This behavior is akin to a discovery learning approach from an educational lens. On the other hand, exploitation is about establishing certainties, with a focus on implementing, executing and refining existing strategies (March 1991). The focus of exploitation behaviors is to accomplish a task or solve a particular problem (or set of exercises), which is more in line with a “tell-and-practice” kind of instructional sequence.

Although a great deal of attention is being drawn to the role and added value of open-ended learning activities that focus on students’ exploration in the classroom, the dimensions along which these new technologies are being evaluated are still a topic of intense debate. As a result of what Seymour Papert (1990) called “technocentrism,” researchers oftentimes focus on the effect of a particular tool or technology for teaching a traditional topic, instead of examining the new possibilities, affordances, social arrangements, and even new content made possible by these new technologies. Despite these flaws, the use of computer technologies has been evaluated as positive in a 40-year second-order meta-analysis, with an effect size ranging from 0.30 to 0.35 (Tamim et al. 2011). In particular, for STEM (Science, Technology, Engineering and Math) fields, the most promising uses of technology have been shown to be designs that include a significant portion of experiential, “hands-on,” laboratory, and model/simulation-based activities (Schneider et al. 2013; Papert 1980), supporting exploration of complex STEM concepts using new technologies in the classroom.

However, due to the heterogeneity of technology modalities and learning purposes, combined with the individual characteristics of students, the most effective approach to integrating technology-mediated activities remains unclear, especially concerning the optimal amount of scaffolding and of open-ended, challenge-based activities (see, for example, Kirschner et al. 2006). In this study, we are particularly interested in analyzing the relationship between two types of pedagogical approaches (“step-by-step” instructions vs. challenge-based instructions) associated with the use of a software resource and their outcomes in terms of physiological variables, exploration, and task performance in a challenge-based learning activity.

2 Theoretical Framework

2.1 STEM, Instructions, Creativity and Technology

The use of technology in education is in line with societal changes and the necessity of promoting students’ knowledge and skills related to STEM fields at an early age (Brophy et al. 2008; Sorkin et al. 2007). Even considering more traditional studies, a small to moderate effect is suggested in favor of technology-enhanced activities, i.e., those using computer technology, compared to traditional ones (Tamim et al. 2011). However, the simple “presence” of technology is too vague a factor with which to examine such interventions. Our main concern here is less about its presence and more about how the activities and tools are engineered, especially with respect to the learning purposes.

Explicit and structured instruction is considered an important component of effective learning when the goal is to develop scientific reasoning and heuristics, as in experiments based on the control-of-variables strategy. In this paradigm, one variable should be manipulated while the others remain constant, to facilitate interpretation of the results (Lazonder and Egberink 2014; Lorch et al. 2010). Nonetheless, this approach makes it more difficult for students to select creative ideas (Rietzschel et al. 2010). On the other hand, learning and generating ideas in open-ended environments, such as makerspaces, was recognized by engineering students as being more difficult and time consuming than in traditional laboratories, since those spaces require more creativity and problem-solving skills (Burkett et al. 2014), which are part of new curricular standards in the United States and elsewhere (NGSS Lead States 2013).

The trade-off between exploration and exploitation is important for performance, especially in organizational environments (March 1991). The substitution of exploitation for exploration can be detrimental for a business, whereas the emphasis on exploitation can potentially reduce employees’ cognitive flexibility. This rationale can be applied to classrooms, where the balance between exploration and exploitation seems to be a challenge. New technologies can benefit students’ learning, but the details of implementing such technologies are crucial to their success, for instance in terms of the social engineering of the learning environment, sequencing of instruction/exploration, activity design, as well as the level of allowed discovery/inquiry. DeCaro and Rittle-Johnson (2012) have shown that when open-ended activities precede formal instruction (such as exploring quantities before receiving an explanation about math equivalence), students are better able to solve math problems and achieve deeper conceptual knowledge. This approach was also useful to adults learning college-level content, as pointed out by Schneider et al. (2013). They demonstrated that a group of students whose activities consisted of exploration of a tabletop environment followed by reading a textbook or watching a video achieved about 25 % higher scores, compared to the group that had the standard instruction (e.g., lecture) first.

Thus, the presence of technology is not simply about having a powerful tool in the hands of students, but mostly about how the tool’s potential and affordances can be maximized through efficient design. The compromise between direct instruction (known to reduce cognitive load; Kirschner et al. 2006) and exploration (known to increase motivation, agency and creativity) is a key factor, but their relationship with students’ physiological characteristics is still an under-researched area. Our research aims to help establish such relationships and to examine how those two types of approaches influence students’ exploratory behavior in a technology-rich environment.

2.2 Arousal, Activation, Cognition and Creativity

The neuroscientific perspective on the exploration–exploitation trade-off complements the organizational learning theory. Some attentional mechanisms in the brain are believed to underlie the shift between exploring new possibilities and maintaining a previously known strategy for performance in a task, especially the locus coeruleus–norepinephrine system (LC–NE) (Laureiro-Martínez et al. 2010). This system is the major nucleus correlated with the regulation of arousal and autonomic functions, as well as with the orienting response elicited by a task-relevant or motivational stimulus. The LC functions in a phasic and a tonic mode: the first corresponds to the acute release of norepinephrine elicited by the orienting response to task-relevant stimuli, increasing the attentional focus, and it has been associated in the literature with exploitative behavior. The tonic mode, on the other hand, is maintained through a sustained release of this neurotransmitter, facilitating the processing of a higher number of surrounding events and favoring exploration (Nieuwenhuis et al. 2011; Laureiro-Martínez et al. 2010).

The peripheral autonomic components of the orienting response, such as pupil dilatation and electrodermal responses, co-occur with some components of the central nervous system marker of this response (P300) since they share similar sympathoexcitatory pathways (Nieuwenhuis et al. 2011; Laureiro-Martínez et al. 2010; Samuels and Szabadi 2008a, b). Electrodermal response (or skin conductance) is a physiological response that has been used as a peripheral measure for autonomic arousal and activation induced by a task. The first one is understood as an energetic state, usually measured as baseline level, while the second one refers to a difference induced by a stimulus due to sympathetic activation of the nervous system (VaezMousavi et al. 2007a, b). A decrement in energy, known as habituation, is observed following stimulus repetition and is slower for more salient stimuli (Nieuwenhuis et al. 2011; VaezMousavi et al. 2007a, b).

Activation has been associated with orienting responses (Nieuwenhuis et al. 2011; Bernstein 1979; Bernstein et al. 1975), as well as with emotional states (Choi et al. 2010; Muldner et al. 2010), mental stress (Fechir et al. 2008), cognitive load (Shi et al. 2007) and performance (VaezMousavi et al. 2007b). The simple presence of a new stimulus can evoke orienting responses. However, instructing the subjects to perform a task associated with the presentation of a stimulus generates greater responses (Bernstein et al. 1975).

The effects of activation on performance are reported to be positive by some authors and negative by others. For example, in a continuous performance task, the activation level affected reaction time positively, speeding up the response (VaezMousavi et al. 2007b). However, increases in levels of arousal generated by looking at tense pictures prior to a task were associated with a decrease in performance (Choi et al. 2010). Alexander et al. (2007), based on the LC activation mechanism, suggested that this difference is due to the nature of the task: performance in cognitive tasks that require attentional focus is facilitated by increased LC activation, whereas performance in tasks that require greater cognitive flexibility is impaired by the same mechanism.

Cognitive flexibility is one component of creativity, which can be understood as the process of generating ideas, solving a problem, and having an idea that is useful for a particular outcome (Sternberg and Lubart 1999). Exploration, as defined from either the neuroscientific or the organizational perspective, tends to favor creativity, facilitating the sampling of different stimuli and increasing the repertory of responses (Laureiro-Martínez et al. 2010). According to the rationale about the impact of LC activation on the exploration–exploitation balance, increased task-related activation should impair exploration. This paradigm was tested by Chae and co-authors (Chae et al. 2013; Chae and Lee 2011) in a decision-making task supported by decision support system software. They found that exploitation was facilitated by stressful situations, while exploration was facilitated by lower levels of activation.

Thus, taking this theoretical framework into account, our rationale for this comparison is that highly scaffolded instructions (i.e., step by step), often found in traditional classrooms, will lead to exploitation strategies and will be associated with an increase in the task-related activation level. On the other hand, minimally scaffolded activities (i.e., less structured or more challenge-based), often found in experiential and simulation-based activities, will lead to more exploratory behaviors.

3 Research Methodology

Collecting and associating different sources of data, through a multimodal analytics approach, has been shown to be a powerful method to evaluate learning processes in open-ended environments (Blikstein 2011, 2013; Worsley and Blikstein 2013; Gomes et al. 2013). Our contribution is to understand the impact of different computer-based instructional methods on the cognitive processes of students. In pursuit of this goal, our research focuses on psycho-physiological investigations of the effects of different types of pedagogical approaches on students’ behavioral patterns, especially with regard to STEM disciplines. We set out to determine the relative effectiveness of giving detailed instructions for the accomplishment of an engineering computer-based task using a physics microworld platform, versus giving more general challenge-based instructions. We are looking for differences in behavioral patterns and task outcomes that can be induced by the instructions, and for whether these differences are reflected in the activation level during the task. Our hypothesis is that detailed instructions increase students’ focus on a task-related mindset, where completing the task is the main priority, thus leading to exploitation. Based on the theoretical framework presented, we would expect to see fewer exploratory behaviors and a more aroused physiological pattern in this condition. On the other hand, we expect general instructions to be associated with cognitive flexibility and creativity, leading to more exploratory behavior and a more relaxed activation pattern.

We followed a crossover design, giving detailed instructions to half the students in the first task and general instruction in the second task, and the opposite for the other half of the students. Students’ task was to build a tower and a bridge, using the physics microworld software, which simulated the laws of Newtonian physics. After the two tasks, all students had time for “free play”, where they were allowed to use the software in a project of their own devising. Research design and the start screen of each task can be seen in Fig. 1.

Fig. 1 Research design overview and the tasks. a Research design steps, b layout of the start screen of the simulation software: Tutorial (upper left-hand side), Tower (upper right-hand side), Bridge (lower left-hand side) and Free (lower right-hand side)

3.1 Participants

Twenty-one students from grades 10, 11 or 12 from a highly diverse (99 % minority), low-income public high school in California were randomly assigned to two groups, each of which received one of two instructional sequences: detailed instructions for the Tower and general instructions for the Bridge (labeled Detail. → Gener. from now on) or general instructions for the Tower and detailed instructions for the Bridge (Gener. → Detail.). In the “detailed” condition, students received scripted, step-by-step instructions about how to build the project. In the “general” condition, students received brief initial instructions and then just an explanation of their final goal. Data from three students were excluded due to technical issues that affected the physiological measures. Data from 19 students (nine male, 47.4 %) were analyzed, with ten participants in the group Detail. → Gener. (four male, 40 %) and nine in the group Gener. → Detail. (five male, 55.6 %).

3.2 Materials and Apparatus

3.2.1 Baseline

The baseline evaluation followed the Integrative Medicine Protocol (Biofeedback Federation of Europe 2013a) and was adapted for the purpose of this research. The full protocol consisted of 15 steps, with an overall duration of 22 min. After the selection of the tasks of interest for this project, the protocol duration was 9:05 min and consisted of 6 steps: eyes closed (90 s), eyes open (120 s), Stroop task (65 s), recovery 1 (90 s), math (90 s) and recovery 2 (90 s). All the activities were performed in a sitting position in the experimental setting, with the galvanic skin response sensor attached to the middle phalanges of both the index and ring fingers (for more information about electrode placement, see Biofeedback Federation of Europe 2013b).

3.2.2 Simulation

The activities were conducted using Algodoo® (Algoryx Simulation AB 2013), a 2D physics simulation software developed by Algoryx Simulation AB. Algodoo allows users to create interactive scenes in which objects are subject to the laws of physics such as gravity, friction, wind resistance, and restitution. The scenes were created in the “paused” mode, where simulated physics laws are suspended; when the “play” button is pressed, the physics laws are turned on and the simulation starts.

In the first activity, a standard tutorial guided students through the main tools and techniques of Algodoo® (Algoryx Simulation AB 2013) (the tutorial is included in the software). This activity had no time limit. The instructions for the Tower and Bridge tasks were created by the researchers and given to students as PDF documents, which were displayed in a window beside the Algodoo® (Algoryx Simulation AB 2013) software window. Both windows remained open on the computer screen throughout the task. The detailed instructions provided information on the basic characteristics that the Tower and Bridge ought to have, while the general instructions presented the general requirements for the task. Examples of the slides used for detailed and general instructions for the Tower task are shown in appendix 1, Fig. 6.

The goal of the first activity was to build a tower strong enough to support a ball weighing 1 ton. The goal of the second activity was to build a bridge strong enough to support a car crossing from the left side of the screen (green area) to the right side (red area). A set of eight ready-made construction elements (wooden beam, brick, square block of rock, etc.) was created for each task and was available to the students in a special folder during the two tasks. The students also had the option of creating their own construction elements and had 6 min to complete each task. For the free play task, students were given 6 min of unstructured activity time to use the software as they wished, and they were free to use the ready-made elements in the custom folder or create their own construction elements.

3.3 Procedures

A parental consent form was received for each student before participation. The activities were carried out individually for each student, in a room designed for this purpose. After being acquainted with the physiological measurement, each student received a brief explanation about the study. After each task was over, students were asked to save the scene without further modifications. They were also asked to “play” the saved scene before proceeding to the next task, even if they had done this spontaneously before. This allowed them to view the interactions between their scene’s elements when the effect of gravity and other laws of physics was applied to the environment. This step provided an opportunity for them to receive feedback on the effect of gravity on the “scene” they had constructed and, consequently, to interact with the laws of physics.

3.4 Measures and Coding

Demographic survey information such as grade, age, gender and ethnicity was collected at the end of the experiment.

3.4.1 Scene Coding

The scenes created by the participants were evaluated based on:

  • the number of original objects (created by the student) versus ready-made elements from folders (with or without modifications),

  • whether students finished the task in the allotted time,

  • whether students simulated gravity spontaneously prior to the end of the task,

  • whether students achieved the goal of the task.

After coding, a summary index based on the number and type of elements was developed to analyze the exploration versus exploitation patterns induced by the instruction. We focused on exploration because it is a main factor for creativity (Carroll 2011; Picciuto and Carruthers 2014), and we expected that students receiving generic instructions would be more spontaneous and exploratory, creating original objects or modifying the ready-made ones to fit their own ideas for the Tower or the Bridge. Conversely, we expected that detailed instructions would induce a more task-oriented approach, or exploitation, increasing the use of ready-made elements without modification. We called this summary the exploration index (EI), and it is defined by the formula:

$$EI = \frac{(c + m) - n}{\sum e },$$

where

  • EI = exploration index;

  • c = number of objects created by the student;

  • m = number of ready-made objects modified by the student;

  • n = number of objects not modified by the student;

  • ∑e = total number of objects in the scene

The index ranges from −1 to 1, and a higher index implies more exploratory behavior, with positive values indicating a prevalence of objects created or modified by the student.
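The definition above can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' coding procedure: the function name and the example counts are hypothetical, and it assumes that the total number of objects ∑e equals c + m + n, which is what the stated range of −1 to 1 implies.

```python
def exploration_index(created: int, modified: int, unmodified: int) -> float:
    """Compute EI = ((c + m) - n) / sum(e).

    Assumption: the total number of objects in the scene is c + m + n.
    """
    total = created + modified + unmodified
    if total == 0:
        raise ValueError("scene has no objects; EI is undefined")
    return ((created + modified) - unmodified) / total


# Hypothetical scenes (counts invented for illustration only).
print(exploration_index(created=3, modified=1, unmodified=4))  # 0.0
print(exploration_index(created=0, modified=0, unmodified=5))  # -1.0
print(exploration_index(created=4, modified=2, unmodified=0))  # 1.0
```

A scene built only from unmodified ready-made elements yields −1, a scene built only from created or modified objects yields 1, and an even mix yields 0.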

3.4.2 Galvanic Skin Response (GSR) Measurement

Skin conductance (SC) was continuously measured during the activities with Procomp Infiniti encoder hardware and Biograph Infiniti software at a rate of 256 samples per second, with snap-style silver/silver chloride electrodes. Following a procedure used by others, after visual inspection to remove artifacts, the SC level (SCL) was exported in 2 min time windows for each subject (Choi et al. 2010; Nourbakhsh et al. 2012), with the exception of the Stroop test and the math baseline tasks, which lasted 65 and 90 s, respectively. The unit of measurement is microsiemens (μS).
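The windowing step can be illustrated with a short Python sketch. It is a minimal sketch under stated assumptions, not the export routine of the Biograph Infiniti software: the function name is hypothetical, artifact removal is assumed to have been done already, and dropping a trailing partial window is our assumption, since the text does not say how incomplete windows were handled.

```python
from statistics import fmean

SAMPLE_RATE_HZ = 256   # sampling rate reported in the text
WINDOW_SECONDS = 120   # 2 min window reported in the text


def segment_means(samples, rate_hz=SAMPLE_RATE_HZ, window_s=WINDOW_SECONDS):
    """Split an SC recording into consecutive windows and return the mean
    SCL (in microsiemens) of each full window.

    Assumption: a trailing partial window is discarded.
    """
    size = rate_hz * window_s
    n_full = len(samples) // size
    return [fmean(samples[i * size:(i + 1) * size]) for i in range(n_full)]


# Tiny synthetic signal, using a reduced rate and window so it stays readable.
signal = [1.0] * 4 + [3.0] * 4 + [5.0] * 2
print(segment_means(signal, rate_hz=2, window_s=2))  # [1.0, 3.0]
```

With the reported settings, each window holds 256 × 120 = 30,720 samples, so a 6 min task yields three segment means.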

To evaluate the activation level in the task, we defined a score based on the difference between the SCL during the task and the baseline measurement. This “difference score” has been used by a number of researchers and allows for a relatively simple interpretation, with positive scores indicating increased arousal levels with regard to the task, and negative scores indicating decreased arousal levels (Burt and Obradović 2013). We corrected this score to reduce individual differences using the same approach employed by Nourbakhsh et al. (2012); in our paper, it is expressed by the formula:

$$ASs = \frac{{\overline{SCl} - \overline{SCb} }}{{\overline{SCt} }},$$

where:

  • ASs = activation score in the segment;

  • \(\overline{SCl} = {\text{mean SCL for the specific segment}}\);

  • \(\overline{SCb} = {\text{mean SCL for the baseline}}\);

  • \(\overline{SCt} = {\text{mean SCL for all tasks}}\).

This formula was applied individually in order to obtain a distinct activation score for each subject in each segment.
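As a worked illustration of the formula, the following Python sketch computes the score for one subject and one segment. The function name and the numeric values are hypothetical; the three arguments correspond to the segment mean, the baseline mean, and the mean across all tasks as defined above.

```python
def activation_score(segment_mean: float, baseline_mean: float,
                     all_tasks_mean: float) -> float:
    """Baseline-corrected activation score for one subject and segment:
    (mean SCL of segment - mean SCL of baseline) / mean SCL of all tasks.
    """
    if all_tasks_mean <= 0:
        raise ValueError("mean SCL across tasks must be positive")
    return (segment_mean - baseline_mean) / all_tasks_mean


# Hypothetical values in microsiemens for a single subject.
score = activation_score(segment_mean=3.0, baseline_mean=2.0, all_tasks_mean=2.5)
print(round(score, 2))  # 0.4, a positive score: more activated than at baseline
```

Dividing by the subject's own mean SCL across tasks expresses the difference relative to that subject's overall conductance level, which is how the correction reduces individual differences.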

3.5 Statistical Analysis

For data analyses, we used the IBM SPSS Statistics 20 package and adopted non-parametric tests, which are best suited to data with a non-Gaussian distribution (assessed by visual inspection of the histograms), to skewed data, and to small samples (<25 per group) (Kitchen 2009). In these cases, the central tendency is usually better represented by the median than by the mean. To analyze differences in activation between groups, we ran a Mann–Whitney U test, which is the non-parametric counterpart of the t test for independent samples. For within-group analysis, we used the Friedman Test, known as the non-parametric ANOVA for repeated measures, and the Wilcoxon Signed-Rank Test, which is analogous to the paired-sample t test. Nominal and categorical variables were analyzed using the Chi Square test. Correlations were investigated using Spearman’s rank correlation coefficient.
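To make the between-group comparison concrete, the sketch below computes the Mann–Whitney U statistic from mid-ranks in plain Python. It is an illustration only: the analyses in this study were run in SPSS, the function names and sample values are hypothetical, and the sketch returns the statistic without a p value.

```python
def midranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return U = min(U1, U2) for two independent samples."""
    ranks = midranks(list(x) + list(y))
    rank_sum_x = sum(ranks[:len(x)])
    u1 = rank_sum_x - len(x) * (len(x) + 1) / 2
    u2 = len(x) * len(y) - u1
    return min(u1, u2)


# Hypothetical activation scores for two small groups.
print(mann_whitney_u([1, 4, 5], [2, 3, 6]))  # 4.0
print(mann_whitney_u([1, 2], [2, 3]))        # 0.5 (one tie between groups)
```

Because the statistic depends only on ranks, it is unaffected by skew or outliers in the raw scores, which is the property that motivates its use with small, non-Gaussian samples.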

The non-parametric and crosstabs statistics are based on the asymptotic method, in which the p values:

… are estimated based on the assumption that the data, given a sufficiently large sample size, conform to a particular distribution. However, when the data set is small, sparse, contains many ties, is unbalanced, or is poorly distributed, the asymptotic method may fail to produce reliable results. In these situations, it is preferable to calculate a significance level based on the exact distribution of the test statistic (Mehta and Patel 2011, p. 1).

Following this recommendation, for the categorical variables analyzed by crosstab, whenever the message “(a number of) cells have expected count <5” appeared, we report the result from the exact test (two-tailed) instead of the Chi square. For the ordinal and scalar variables, both results are shown when the exact test yielded a significant p value but the asymptotic test did not. We considered p values smaller than 0.05 (two-tailed) to be significant.
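For a 2 × 2 crosstab, an exact two-tailed p value can be obtained by enumerating every table with the same margins, as in Fisher's exact test. The Python sketch below illustrates the idea; it is not the SPSS implementation, the function name is hypothetical, and the counts are invented for illustration (two groups of nine students, with eight and five completing a task).

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided exact p value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the observed
    margins that are no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def weight(k: int) -> int:
        # Unnormalized probability of a table whose top-left cell is k;
        # kept as an integer so the comparison below is exact.
        return comb(row1, k) * comb(row2, col1 - k)

    k_min = max(0, col1 - row2)
    k_max = min(row1, col1)
    observed = weight(a)
    extreme = sum(weight(k) for k in range(k_min, k_max + 1)
                  if weight(k) <= observed)
    return extreme / comb(n, col1)


# Illustrative counts: rows are groups, columns are finished / not finished.
p = fisher_exact_two_sided(8, 1, 5, 4)
print(round(p, 3))  # 0.294
```

With expected counts below 5 in some cells, as here, the chi-square approximation is unreliable, which is the situation in which the exact result is reported instead.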

Our main goal is to understand whether different instructions led to different levels of activation and exploratory behavior when students interacted with the Algodoo software. We start the results section with the analysis of the demographic variables to test whether our two experimental groups were balanced in terms of the male/female ratio and ethnic background (Sect. 4.1). Next, we present analyses of the task outcomes and test whether the types of instruction (general or detailed) helped students complete the task (Sect. 4.2). In Sect. 4.3, we analyze the GSR data and examine whether levels of activation differed between the two experimental groups. In Sect. 4.4, we look at whether different instructions influenced students’ creativity (as measured by the EI defined above). In Sect. 4.5, we further analyze arousal levels at baseline and their influence on students’ exploration.

4 Results

4.1 Demographic Variables

There were no significant differences between groups in any demographic variables: grade, age, gender and ethnicity (p > 0.05).

4.2 Task Performance and Time to Completion

A higher percentage of students following the Gener. → Detail. sequence finished both the Tower and Bridge task (88.9 % in each task), compared to those following the opposite instructional sequence (55.6 % for Tower and 77.8 % for Bridge). However, those differences were not significant (p = 0.294 in the first and p = 1.000 in the second task). For the other two variables analyzed—testing the structure for gravity effects and supporting the ball (for the Tower task) or crossing the car (for the Bridge task), there were smaller differences between groups. The experimental condition slightly affected the number of students testing the structure for gravity effects, regardless of the task, favoring those in the detailed condition [33.3 vs. 11 % for the Tower task (p = 0.576), and 44.4 vs. 33.3 % for the Bridge task (p = 1.000)]. Only one student in the generic condition achieved the goal of supporting the ball in the first activity and 44.4 % of each group in the second. In summary, our experimental manipulation showed no differences in students’ task performance or completion time. In the next section, we look at the effect of detailed versus generic instructions on activation patterns as measured by the SC sensor.

4.3 Was the Activation Affected by the Instruction?

To analyze the participants’ levels of activation, we performed both between- and within-group analyses. None of the baseline measures, including Stroop, Math and Tutorial, produced between-group differences. This result was expected since both groups performed the same activities (see Table 1). Visual inspection of the activation patterns along the tutorial (Fig. 2) shows that students in both groups had a reduction in activation in the first part of the task, followed by a re-activation at the end. The within-group differences were not significant (Friedman Test for Detail. → Gener.: χ²(2) = 0.667, p < 0.717; Friedman Test for Gener. → Detail.: χ²(2) = 0.250, p < 0.882), but the pattern suggests an increase in excitability at the end of the task when very specific and detailed instructions were given.

Table 1 Activation score for each task by group (Mdn)

Fig. 2 Each dot represents the median of the activation level in a 2-min segment

In the Tower activity, students following detailed instructions showed a pattern similar to that observed in the Tutorial: a reactivation at the end of the task (Friedman Test: χ²(2) = 0.750, p < 0.011). However, a habituation process was observed for those following generic instructions (Friedman Test: χ²(2) = 0.750, p < 0.687). When the instruction conditions were reversed, the activation pattern for each group changed as well: students now in the detailed condition showed a significant reactivation from segment 2 to segment 3 (Wilcoxon Signed-Rank Test: Z = −2.38; p = 0.017). The same pattern was observed for the students following the detailed condition in the previous tasks, and students in the generic condition had a non-significant habituation along the task (Friedman Test: χ²(2) = 2.889, p < 0.278). These changes increased the differences in activation between groups along the Bridge activity, with students in the detailed condition being significantly more activated than those in the generic condition at the end of the task (Fig. 2; Table 1).

It seems that students started to habituate at the beginning of all tasks, from segment 1 to segment 2. However, giving detailed instructions, including the step-by-step tutorial, led to a reactivation at the end of the task (U-shaped curve). The generic condition was associated with a habituation process, suggesting relaxation along the task. Following this rationale, we would expect all the students to reduce their activation during the Free activity, showing habituation, because they had no pre-set goals to achieve. However, students showed the same pattern observed in the last task they followed, with a significant difference between the segments (group Detail. → Gener. Friedman Test: χ²(2) = 13, p = 0.002; group Gener. → Detail. Friedman Test: χ²(2) = 8.85, p = 0.012) and between groups for the last segment (Table 1). Up to this point, activation appeared to be affected by the different modes of instruction, but the result from the Free task suggests that other factors may be involved. This interpretation will be addressed again in the Discussion.

In the next section, we look at the effect of detailed versus general instructions on students’ creativity, as measured by the EI. As a reminder, we expect detailed instructions to generate a more rigid mindset and thus lower EI scores; generic instructions, on the other hand, should promote a more creative mindset and higher EI scores, since students have more freedom to explore the problem space.

4.4 Was Students’ Creativity Affected by the Level of Detail of the Instructions?

Although the distribution of the number of elements in each task shows a different tendency between the two groups in the creation of objects and use of pre-made objects (Fig. 4), there are no significant differences between groups for the EI (Fig. 3) or for any other measure related to the type and number of elements used (see statistical tables for between-group comparisons in appendix 2). Within-group observations show that the EI was significantly different between the three tasks for both groups, with higher values in the Free task, where no goal was given (Table 2).

Fig. 3 Exploration index (EI) for each group and task (Tower, Bridge, Free). Bars represent means; whiskers show standard errors

Table 2 Exploration index for each task by group

The students who received detailed instructions in the first task (Tower) had the smaller index in this task, with a significant difference between the Tower and the Free task in a post hoc pairwise comparison with Bonferroni correction (p = 0.040). Those students also used a higher number of ready-made elements in both the Tower and Bridge tasks, instead of creating their own elements (Z = −2.67, p = 0.008 and Z = −2.49, p = 0.013, respectively). Students who received generic instructions first showed the same mean rank for EI in both goal-directed tasks (Tower and Bridge) and no significant difference between those and the Free task (post hoc pairwise comparison with Bonferroni correction: p = 0.182). The quantities of ready-made and created elements were similar in both Tower and Bridge (Z = −0.350, p = 0.726 and Z = −0.269, p = 0.767, respectively; Fig. 4).
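The Bonferroni correction mentioned above follows a simple rule, sketched here under stated assumptions: with three tasks there are three pairwise comparisons, and each raw p-value is multiplied by that count and capped at 1. The raw p-values below are hypothetical, and `bonferroni_adjust` is an illustrative helper rather than the authors' procedure.

```python
def bonferroni_adjust(raw_p_values):
    """Multiply each raw p-value by the number of comparisons, capped at 1."""
    m = len(raw_p_values)
    return [min(1.0, p * m) for p in raw_p_values]

# HYPOTHETICAL raw p-values for the three pairwise task comparisons.
raw = {
    "Tower vs Bridge": 0.40,
    "Tower vs Free": 0.012,
    "Bridge vs Free": 0.09,
}

adjusted = dict(zip(raw, bonferroni_adjust(list(raw.values()))))
for pair, p in adjusted.items():
    print(f"{pair}: adjusted p = {p:.3f}")
```

Reporting the adjusted p-value and comparing it with 0.05 is equivalent to comparing the raw p-value with 0.05 divided by the number of comparisons.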

Fig. 4 Scene evaluation: median number of elements created or ready-made, by category and task (T = Tower, B = Bridge, F = Free)

Thus, there does not seem to be an effect of generic versus detailed instructions on students’ creativity. However, recall that we observed high variance in the activation levels prior to the construction task; in the next section, we investigate whether this baseline level of activation had an effect on students’ exploration.

4.5 Is There an Interaction Between Arousal Level at Baseline and Creativity?

Preliminary analysis of the data showed a high variance of the SC level at the eyes-open baseline across all subjects (Mean: 2.56, Median: 1.67, SD: 2.09), which suggests that, even in a resting condition, some subjects were more activated than others. To understand whether the type of instruction affected students with high or low arousal at baseline differently, we performed analyses considering both the experimental condition assigned to the students and their baseline arousal level. Taking the median value as a measure of central tendency, students were classified as low or high arousal at baseline, and two subgroups were created within each sequence: four low-arousal and four high-arousal students in the Detail. → Gener. sequence, and four low-arousal and four high-arousal students in the opposite sequence.
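The median split described above can be illustrated with a short sketch. The baseline values are hypothetical, `median_split` is an illustrative helper name, and the rule that a value exactly equal to the median counts as "low" is an assumption, since the text does not say how such cases were handled.

```python
from statistics import median

def median_split(levels):
    """Label each value 'high' if it lies above the sample median, else 'low'."""
    cut = median(levels)
    labels = ["high" if value > cut else "low" for value in levels]
    return cut, labels

# HYPOTHETICAL eyes-open baseline SC levels for eight students in one sequence.
baseline_sc = [0.8, 1.2, 1.5, 1.8, 2.4, 3.9, 5.1, 6.3]

cut, labels = median_split(baseline_sc)
for value, label in zip(baseline_sc, labels):
    print(f"SC = {value:.1f} -> {label} arousal at baseline")
```

With an even number of distinct values, the median falls between the two middle observations, so the split yields equally sized low and high subgroups.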

The main difference was found in the Free task: the high-arousal students who followed the Detail. → Gener. sequence had the smallest EI compared to the other three groups (Kruskal–Wallis H test: χ²(3) = 12.299, p = 0.006), as can be seen in Fig. 5. In addition, all of them used pre-made elements in this task, which differed significantly from the other groups: none of the low-arousal students who followed the same sequence, and only one low-arousal and one high-arousal student from the opposite sequence, did so (Fisher’s exact: p = 0.011). The number of pre-made elements used without modification was also significantly higher for the high-arousal students following the Detail. → Gener. sequence (Kruskal–Wallis H test: χ²(3) = 11.56, p = 0.009).
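As a consistency check on the Kruskal–Wallis results above, the reported p-values can be recomputed from the test statistics. The sketch assumes the p-values come from the asymptotic chi-square distribution with 3 degrees of freedom (four subgroups minus one); that assumption and the helper name `chi2_sf_df3` are not stated in the text.

```python
import math

def chi2_sf_df3(x):
    """Upper-tail probability of a chi-square variable with 3 degrees of freedom.

    Closed form: erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2).
    """
    return math.erfc(math.sqrt(x / 2.0)) + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)

# Test statistics and p-values as reported in the text.
reported = {12.299: 0.006, 11.56: 0.009}

for statistic, reported_p in reported.items():
    recomputed = chi2_sf_df3(statistic)
    print(f"chi2(3) = {statistic}: recomputed p = {recomputed:.4f}, reported p = {reported_p}")
```

Under this assumption, both recomputed values round to the reported ones at three decimal places.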

Fig. 5 Distribution of the EI by group based on arousal level at baseline

In sum, when given the opportunity to explore the software and express their creativity without any instruction, the students who were physiologically aroused at baseline and who were given detailed instructions during the first task displayed the lowest EI. We explore this result further in Sect. 5.

4.6 Correlations

For both groups, the EI in the Tower task was negatively correlated with the number of pre-made elements used in the following task, Bridge (Detail. → Gener.: rs(9) = −0.731, p = 0.025; Gener. → Detail.: rs(9) = −0.705, p = 0.034), suggesting that the higher the exploration in the first task, the fewer ready-made elements were used in the second task, regardless of instruction type. However, the EI in the Tower task was positively correlated with the EI in the Bridge task only for the students who received generic instructions in the first task (rs(9) = 0.742, p = 0.022).
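A rank correlation of this kind can be computed as the Pearson correlation of the ranked values. The sketch below is illustrative only: the EI scores and element counts are hypothetical, and `spearman_rho` is an illustrative helper name, not the authors' code.

```python
import math

def average_ranks(values):
    """Rank values in ascending order; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = shared
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation computed on average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)

# HYPOTHETICAL data: EI in the Tower task and pre-made elements used in Bridge.
ei_tower = [0.10, 0.25, 0.30, 0.45, 0.50, 0.60, 0.70, 0.80]
premade_bridge = [9, 8, 8, 6, 7, 4, 3, 2]

rho = spearman_rho(ei_tower, premade_bridge)
print(f"Spearman rho = {rho:.3f}")
```

A negative coefficient, as in this made-up sample, corresponds to the pattern described above: students with higher exploration in the first task tend to use fewer ready-made elements in the second.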

5 Discussion

The goal of this article was to compare the physiological and task outcomes of students in a computer simulation environment to better understand the effects of different instructional modes on learning new skills, as well as their interaction with individual characteristics. To address this question, we used a variety of sensors and measures to capture students’ arousal and tendency to come up with creative solutions. To our knowledge, very few studies have sought to investigate the effect of constructionist-inspired (i.e., generic) versus instructivist-inspired (i.e., detailed) instructions on students’ physiological states. We present a summary of our findings below.

5.1 Summary of the Findings

  1. Task Performance. We did not find any effect of our experimental manipulation (i.e., detailed vs. generic instruction) on students’ task performance and time to completion. This is likely because the task was too challenging for our participants: most of them did not succeed at building a tower strong enough to support a 1-ton ball or a bridge that allowed a car to cross a gap. Additionally, the novelty of the software probably contributed to their difficulty in completing the task. This produced a floor effect in their performance, which unfortunately prevented us from drawing any conclusion regarding the effect of detailed versus generic instruction on students’ ability to build complex structures.

  2. Students’ Exploratory Behavior (as measured by the EI). This measure was for the most part not affected by the type of instruction students followed. This is surprising, because we expected generic instructions to produce more explorative behavior. It is possible that the limited amount of time that students spent on the two tasks (6 min for each) prevented them from building structures beyond the pre-made elements provided by the software. However, we observed that students who were already aroused before the Tower task and who received detailed instructions first were the least creative participants during the Free task; it is possible that being mentally stressed by participating in a scientific experiment and receiving highly scaffolded directions put those students into an “exploitation mindset” in which they focused on completing the task at hand with pre-made elements. Our findings suggest that this mental block for creative problem-solving can potentially be avoided by providing highly aroused students with more generic instructions during the first task.

  3. Students’ Arousal. Finally, our experimental manipulation had a strong effect on students’ level of activation: detailed instructions generated a U-shaped curve, while generic instructions produced a decreasing slope. Additionally, we observed carryover effects during the Free task (i.e., students following detailed instructions before this task kept exhibiting a U-shaped activation curve, even though they did not have specific instructions to follow). Since the results related to the GSR measures are the most significant in our study, we elaborate on those patterns in the next section.

5.2 Main Finding: Different Instructions Generate Different Activation Patterns

First, it is important to reiterate that all the tasks were challenging for students: both groups were activated above the baseline and even beyond the levels observed in the Stroop and Math tasks. This mental effort, which can be related to either mental stress or workload (Fechir et al. 2008; Staal 2004), led us to assume that all students were engaged and aroused by the construction activities. However, based on our physiological measures, we cannot conclude whether students experienced the workload and stress generated by the tasks positively or negatively.

Nevertheless, the activation scores indicate that the instructional mode induced mental effort differently. When students started performing the tasks, both groups had increased levels of activation, followed by a decrease from segment 1 to segment 2. However, from segment 2 to segment 3, the detailed instruction mode led to what we call a “recovery pattern,” characterized by a reinvestment of effort in the final segment toward the conclusion of the tasks. From a neuroscientific perspective on the exploration–exploitation trade-off, we assume that the detailed instructions elicited physiological responses oriented toward task-relevant stimuli, which the literature has associated with exploitative behaviors (Nieuwenhuis et al. 2011; Laureiro-Martínez et al. 2010).

On the other hand, the general instruction mode induced a deactivation over the course of the task, suggesting habituation or relaxation during the activity. The decrease in activation is in accordance with habituation theories, which have demonstrated that this process occurs following stimulus repetition (VaezMousavi et al. 2007b), and it reinforces our assumption that the two instructional modes affect students differently. Moreover, this result is in line with findings based on neurophysiology: considering that LC tonic activation impairs cognitive flexibility and favors exploitation, the opposite can be taken as likely, with reduced levels of tonic activation favoring exploration (Chae et al. 2013; Chae and Lee 2011).

Based on our findings, we reject the idea that the detailed instructional mode would induce a linear increase in the level of activation—instead, what we observed was a reinvestment of energy (U-shaped curve). Moreover, this activation pattern does not appear to be affected by the sequence of instructions in the Tower and Bridge tasks, and it is interpreted as an attribute of the mode of instruction given.

Considering that the activation response is a consequence of a subjective state toward some demand, including emotional effort (Choi et al. 2010; Muldner et al. 2010), cognitive load (Shi et al. 2007), or stress (Fechir et al. 2008), and that it favors exploitation (Nieuwenhuis et al. 2011), our hypothesis concerning this particular finding is that the detailed instruction mode evoked an increase in activation near the completion of the task, increasing the behavioral propensity toward implementation, possibly as a result of the perceived stress due to the deadline. That is, the need to meet detailed requirements given by someone else within the allotted time could generate pressure to complete the task, along with the perception of being far from done, thereby increasing arousal in the final segment, even though it does not surpass the activation level at the beginning of the task. In the generic condition, the opposite hypothesis appears valid: given the same amount of time to finish the task, students felt freer to complete it without the pressure of a deadline. In short, the detailed instructional mode appears to induce a task-related cognitive mindset, or orienting response, disposed toward fulfilling the task with the materials already provided by the software, while generic instructions seem to lead to a more exploratory approach. This hypothesis is further supported by the finding that the students receiving detailed instructions in the first task had the smaller EI and that both groups had their highest index in the Free task, where no instructions or goals were given.

One of the main findings of this article is that the first instructional approach given seems to be a strong factor influencing students’ behavior and activation in the activities that follow, with students showing in the second task the same preference for the type of elements observed in the first task. Moreover, the primacy of the first instruction given also affected students with low and high arousal at baseline differently. High-arousal students who received detailed instructions first showed less cognitive flexibility, persisting in the first strategy learned even when the instructions changed or no instructions were given. This result is consistent with the findings of Alexander et al. (2007) that highly aroused subjects, as well as people under stress, show less cognitive flexibility. However, for both groups, a higher EI in the Tower task was associated with greater use of student-generated construction elements in the second task, Bridge.

5.3 Limitations

This study has some limitations that are worth mentioning. The small sample size constrained us to use non-parametric tests, which reduced our statistical power. This conservative approach may have increased our chances of false negatives (type II errors) and possibly prevented us from finding interesting relationships between our independent variable (type of instructions) and dependent measures (task outcome, creativity, activation). However, we note that the total time with each participant was approximately 1.5 h (for preparation, questionnaires, and activity); thus, although a larger sample is always preferable, we consider the number of subjects sufficient for an exploratory study and a good first step toward follow-up studies. In terms of the validity of our measurements, we operationalized students’ success and creativity in a very specific way (i.e., by computing a binary outcome, success or failure, for task performance, and by using our EI for creativity). We acknowledge that those measures are somewhat limited in scope and that they do not fully capture the constructs we were interested in. Finally, the brevity of the task likely affected students’ performance: with more time, we would expect students to successfully build a tower or a bridge able to support the required weight. Indeed, it is possible that students need more time to get used to the software and master the ready-made elements before venturing into building their own modules.

6 Conclusion: Implications for Constructionist Learning Environments, Makerspaces, and FabLabs

In this article, we set out to investigate one main research question: in a “maker” learning environment in which students engage in “hands-on” activities, such as building physical and virtual 3D objects, constructing robots, and programming computers, what is the impact of giving students “step-by-step,” scripted instructions, as opposed to generic, challenge-based ones? This question is significant in a variety of such learning and design spaces. It could be relevant as a guideline for curriculum design for hands-on learning, for the structure of activities in makerspaces and FabLabs, and for how we integrate such activities with more traditional school curricula. At the two ends of the spectrum, there are well-defined views about how instruction should be structured in such spaces: scripted and very well scaffolded (Kirschner et al. 2006), or open-ended (Papert 1980). Even though one could argue that “more freedom is always better,” it has been shown (Blikstein 2013) that completely open-ended environments can privilege students with a higher level of autonomy and previous knowledge, since beginners can become frustrated and easily get “stuck” without clear goals and detailed steps. On the other hand, the literature that argues against discovery learning because of the high cognitive load it induces focuses on traditional school topics rather than on activities that are more open-ended in nature, such as building a robot, where guiding students toward one right answer contradicts the very goals of introducing such activities in classrooms. In addition, the research on preparation for future learning (Chin et al. 2010) has challenged the idea that learning activities should start with well-defined steps and then move toward “open-endedness.” As counterintuitive as it might sound, the very idea of gradually letting children explore more might be at odds with contemporary cognitive research.

Due to the lack of research in this field, makerspaces are each devising their own curricula and curricular principles, without any sound scientific grounding. At first sight, the idea that students would benefit from detailed instructions first, and general instructions next, seems to be in conflict with our results. In our study, we found an effect that we term “epistemological persistence,” by which the pedagogical approach chosen to start an activity biases students to continue with that same approach even after the prompt has changed (i.e., a carryover effect). In addition, we found that highly aroused students were much more sensitive to detailed instruction at the beginning, while less aroused students demonstrated more cognitive flexibility.

These findings could have significant implications for practice. First, they suggest that the sequence of pedagogical approaches makes a difference in students’ behaviors. If a teacher starts out with a step-by-step activity, it is likely that students will expect the same type of instructions moving forward, especially in similar activities, which can lead them to adopt a “tell and practice” mindset. On the other hand, students who receive general, challenge-based instructions first might be more exploratory and adaptive, especially those who are in a state of alertness before starting the hands-on learning task. The use of one approach or the other should be balanced against the goals of the activities as well as the individual characteristics of students, in order to reduce the overload and anxiety caused by excessive demands.