Introduction

The concept of “learning by making” has gained tremendous popularity and recognition in education over the past 15 years, with hundreds, if not thousands, of dedicated makerspaces being established in educational institutions across the United States (Melo et al., 2023). As a result, millions of students across the United States now have the opportunity to work in these makerspaces over the course of their K-12 education (Peppler et al., 2015). This growth has been driven by the belief in the potential of making to fundamentally change the way people teach and learn (Dougherty, 2013; Martinez & Stager, 2016). Makerspaces, as opposed to more traditional K-12 learning environments, promote student autonomy and agency, provide opportunities for hands-on activities, and encourage a shift towards a less hierarchical teacher–student dynamic (Martin, 2015; Martinez & Stager, 2016). While the varied nature of makerspaces and the activities that take place within them present a number of challenges to researching the learning that occurs in these environments, there is preliminary evidence that makerspaces may foster learning of STEM concepts (Vossoughi et al., 2013) and the development of critical skills such as problem solving (Blikstein, 2013; Marshall & Harron, 2018).

Because making presents students with multiple opportunities to work on authentic problems that emerge from and are situated within their projects, we hypothesize that students who work on maker projects over longer periods of time will improve at solving problems with mechanistic systems (e.g., mechanical or electronic devices). However, this potential learning outcome has received relatively little attention. A handful of studies have reported positive changes in problem-solving skills (Galaleldin et al., 2016; Harnett et al., 2014; Hartry et al., 2018), but these were based on student self-reports and provided few details about the types of problems students were better able to solve or what aspects of making might have led to these changes.

The current study seeks to fill this gap by examining the impact of makerspaces on problem-solving skills in K-12 students, specifically exploring how working through multiple cycles of the engineering design process affects high-school students’ mechanistic problem-solving skills. To investigate this, we developed two hands-on problems involving mechanistic systems and asked a group of 19 high-school students to work on the problems before and after taking part in a year-long course on digital fabrication. Additionally, we asked 17 experts (graduate students in mechanical engineering) to work on the same set of problems. The main finding was that participation in the course had a positive effect on the high-school students’ problem-solving abilities. By comparing the expert mechanical engineers to the high-school students, it was possible to show that participation in the year-long course made the high-school students more like experts. Furthermore, by examining the differences in problem-solving processes between groups, it was possible to home in on the nature of this change: experts and post-course students focused more on the mechanistic relationships between components, while pre-course students seemed to ignore these relationships. In other words, the better-performing participants seemed to be able to “see” aspects of the problems that made them better able to understand and solve them.

Overall, the results of this study suggest that makerspaces have the potential to foster the development of problem-solving skills in students, leading to better preparedness for the technological challenges of the twenty-first century. The research highlights the importance of working on long-term design projects in makerspaces and sheds light on how the development of problem-solving skills can be measured and characterized. These findings contribute to the growing body of knowledge on makerspaces and learning, and have significant implications for the design and use of makerspaces in schools.

Background

What are students learning in makerspaces?

While schools perform many functions, their primary one is to foster learning. Thus, as the number of makerspaces in K-12 institutions continues to grow, the natural question is “What are students learning in makerspaces?” (Petrich et al., 2013; Timotheou & Ioannou, 2019; Vuorikari et al., 2019). This question is particularly difficult to answer due to the varied nature of makerspaces and the varied nature of making activities that students engage in. This wide variety of experiences is a feature, not a bug, of educational making, since providing students with the autonomy to choose what projects to work on and what roles to take on is one of its most celebrated aspects (Cohen et al., 2017; Harron & Hughes, 2018; Kajamaa & Kumpulainen, 2019). For example, even within a single group working on a project, one student might specialize in programming, another might specialize in CAD, and a third might specialize in using the digital fabrication tools. Nevertheless, this variety of experiences is precisely what makes it difficult to identify the things that students in general are learning in makerspaces, since when students are free to follow their interests different students in a single class will learn different things and develop different areas of expertise.

One activity that most students working in makerspaces have in common is working through the iterative design process as they develop their projects (Petrich et al., 2013). This is the process through which students develop and test prototypes, encounter unexpected problems with their designs, hypothesize solutions to these problems, and develop improved prototypes. This activity provides students with opportunities to make mistakes and deal with failure in a way that is healthy, expected, and even celebrated (Martin, 2015; Vossoughi et al., 2013). However, as Martinez and Stager note, the point of this process is not simply to encounter failure, but to overcome it by figuring out how to solve the problems that arise (2016). This echoes Papert’s debugging philosophy, which is the viewpoint that “Errors benefit us because they lead us to study what happened, to understand what went wrong, and, through understanding, to fix it” (Papert, 1980, p. 114).

Since the iterative design process lies at the heart of making, and since problem solving lies at the heart of the iterative design process, problem solving may be an activity that many students who work in makerspaces will gain experience with, regardless of the types of tools or materials available to them. Despite the fact that students already have many opportunities to engage in problem solving during their time in K-12 schooling, we argue that students working in makerspaces are likely to gain substantial experience with a class of important problems that they would not otherwise encounter in their K-12 education, and that makerspaces support unique ways of approaching and working on these problems.

The problems students encounter in makerspaces and digital fabrication labs are often ill-defined, in the sense that it is obvious that something is wrong, but understanding the precise nature of the problem requires further investigation and testing (Robertson, 2003). The failure to provide students with opportunities to work on ill-structured problems has been identified as a weakness in other problem-based pedagogies, since these pedagogical methods typically provide students with well-defined problems which bear little resemblance to those encountered in non-schooling situations (Jonassen et al., 2006). Solving ill-structured problems requires much more than simply finding the right answer. Students must work to identify and characterize the nature of the problem, to hypothesize solutions, to implement these possible solutions during prototyping, and to evaluate their solutions by testing and observing their prototypes. This type of problem solving extends over longer periods of time, and is capable of producing deeper and more nuanced understanding of the problem (Sheppard et al., 2006).

The ways that these problems are framed, and the types of support and guidance that students receive during problem solving, are also not typically encountered in K-12 settings. In makerspaces, mistakes and failures are framed as an important part of the design process, essential to achieving the goal of arriving at a satisfactory design, as opposed to something to be avoided in fear of receiving a bad grade (Vossoughi & Bevan, 2014). This has been described as one of the distinguishing features that separates making from other problem- and project-based approaches (Martin, 2015). Furthermore, students have many opportunities to encounter and work on these types of problems in makerspaces. Digital fabrication tools dramatically increase the speed at which students can move through the iterative design process. By rapidly making changes to digital design files using CAD/CAM software and sending them to digital fabrication tools (e.g., 3D printers, laser cutters) to be physically produced, it is possible for students to quickly test and refine their designs, sometimes making multiple iterations within a single class period. Each iteration of the design cycle provides students with opportunities to gain more practice in encountering and solving problems.

Defining and operationalizing problem solving in makerspaces

There is little prior work on the development of problem-solving skills in makerspaces. The work that does exist has relied on student self-reports to measure changes in problem-solving abilities. Harnett et al. (2014) found that some university students who spent a semester working in a community hackerspace reported increased confidence in their problem solving and project-planning abilities, and Galaleldin et al. (2016) reported that 60% of university engineering students reported feeling “more confident in their engineering knowledge and skills to solve a complex engineering problem”. At the K-12 level, Hartry et al. (2018) found that students working as interns in a museum makerspace self-reported increases in problem-solving skills, attributing this to their experience working on open-ended problems during their internship. However, to date there is no research that directly measures a change in students’ abilities to solve problems after working in makerspaces.

The goal of the current study was to address this gap in the literature. In order to measure such a change directly, a specific class of problems and associated problem-solving skills needed to be identified so that appropriate assessments could be developed. We observed that many of the projects that students work on in makerspaces involve the design and construction of mechanistic systems, such as electronic circuits, mechanical systems, or objects with multiple interlocking parts. Formally, a mechanistic system consists of (a) a phenomenon or phenomena that can be explained or understood by (b) decomposing it into parts or components that (c) are organized in such a way that (d) they give rise to the phenomenon through their causal interactions (Illari & Williamson, 2012). In makerspaces, when students encounter problems with these systems, they engage in a hands-on process of tinkering and debugging that is different from the types of problem solving typically used in schools. Because of this, we hypothesized that makerspaces would engender a set of problem-solving skills involving mechanistic systems that students would have little opportunity to develop in other educational settings.

In order to measure differences in students’ abilities to solve problems involving mechanistic systems, we designed a set of hands-on problems that involved building and troubleshooting mechanistic systems called the Gearbox-Assembly Task and the Flashlight-Repair Task. The Gearbox-Assembly Task involved figuring out how to reassemble a differential without any instructions or information about what was being assembled (Fig. 2), and the Flashlight-Repair Task involved figuring out how to repair a flashlight that had been deliberately broken in three ways (Fig. 3). These problems were designed to be similar to the types of problems students might encounter in a makerspace, so that a failure to measure changes in problem-solving skills could not be blamed on a lack of construct validity. At the same time, it was important to present students with problems that they had not yet solved, otherwise the assessment would simply be testing their ability to remember previously-encountered solutions. For this reason the problems involved mechanistic elements that the students would not directly work with in the year-long course (geared mechanisms and electronic circuits).

These tasks were also designed to make the students’ problem-solving processes visible. This way, we could more easily capture all of the actions participants took while working on the problem, and record all of the problem states that they visited. Our goal was to use this information to identify distinct problem-solving approaches [conceived of as paths through the problem space (Newell & Simon, 1972)] and to examine the details of these approaches so that we could not only understand if there was a change, but also what the nature of this change might involve.

Analysis of problem-solving processes

Historically, the study of problem solving has focused more on the process than the outcome, and this approach has led to a series of important discoveries in the field. Meticulous analyses of problem-solving processes, derived from detailed observations of individuals working on carefully designed problems, led to the seminal discoveries of functional fixedness (Duncker & Lees, 1945), where individuals tend to limit themselves to only using objects in established ways, and the Einstellung effect, a phenomenon where individuals prefer a familiar solution approach even when better alternatives exist (Luchins, 1942). Later, Newell and Simon proposed a formalization of problem solving that viewed it as a search through a problem space: a conceptual landscape of all possible problem states along with the actions that transform one state into another (1972). A cornerstone of their approach was the think-aloud method, which made it possible to study the internal, mental conceptualizations and processes involved during problem solving.

Building upon this foundation, and utilizing a similar methodology of process analysis, Chi et al. showed how domain-specific knowledge could impact problem-solving approaches. Experts, possessing a deeper and more structured understanding of a domain, don’t just approach problems differently; they perceive them differently. They categorize problems based on their deep structures and the underlying principles involved, allowing them to navigate the problem space more efficiently. In contrast, novices, whose understanding of a subject is more fragmented and superficial, tend to fixate on the surface features of a problem. This leads novices to miscategorize problems, resulting in the construction of inaccurate and misleading problem spaces. Novices searching through inaccurate problem spaces are “off the map”, searching down pathways that rarely end in successful solutions. This work underscores how expertise fundamentally alters the way problems are perceived, categorized, and approached.

More recently, as computer-based problem-solving environments have become more prevalent, a new method for analyzing and understanding process data has emerged: unsupervised clustering of action sequences to identify different problem-solving approaches (Antonenko et al., 2012). Noteworthy contributions include the employment of sequence mining to investigate students’ strategies in interactive simulation tasks (Wang & Wieman, 2022), clustering of behavioral patterns in PIAAC problem-solving items (He et al., 2019), and identification of discussion patterns in computer-supported collaborative learning environments (Kapur, 2011). This method has the advantage of efficiently handling large amounts of data, providing more objective and replicable insights, and reducing the human biases and labor-intensive processes often associated with manual coding and analysis of observation and think-aloud data.

While sequence mining and clustering methods are commonly used to analyze student behavior in computer-based environments, their use in analyzing hands-on problem-solving activities has been limited. Collecting process data in dynamic real-world environments such as makerspaces and workshops is substantially more difficult than simply logging actions taken in computational environments. Multi-modal learning analytics (MMLA) (Blikstein & Worsley, 2016; Schneider & Blikstein, 2015) attempts to address this gap by instrumenting real-world environments with sensors that log students’ actions. While this approach has proven successful in many cases, it was not appropriate for the study of hands-on problem solving, as existing sensors are incapable of reliably measuring small-scale manipulations of pieces and parts. Instead, we built on an approach pioneered by Suomala which combined qualitative video coding methods with clustering techniques (1996). This approach overcomes the limitations of MMLA by combining the acute observational abilities of humans to code video data with the robust and objective abilities of unsupervised clustering techniques to identify common patterns across participants.

Research questions

The research questions in the study were as follows:

  1. Does working in a makerspace impact students’ abilities to solve problems involving mechanistic systems?

  2. By examining the process data generated during the problem-solving activities, ...

    (a) ...can we identify the different problem-solving approaches taken by the students?

    (b) ...can we identify meaningful differences between expert and novice problem solvers?

To answer these questions, we designed a set of hands-on problems to measure changes in problem-solving skills, and recruited a group of high-school students to take part in a year-long course in digital fabrication. The students worked on the problems before and after the course, and by analyzing the differences in their approaches we were able to measure changes in their hands-on problem-solving skills. Additionally, we recruited a group of expert mechanical engineers to work on the hands-on problems. The engineers’ performance on the assessments provided a kind of ground truth that made it easier to determine whether any changes in the high-school students’ problem-solving approaches were random or reflected a change in expertise.

Methods

Participants

We recruited 19 high-school seniors (15 females, 4 males) to take part in this study. Given their lack of exposure to formal engineering courses or training, we considered these students to be “novice” engineers. These students participated in study activities during their normal school day schedules and were not compensated for their participation.

We also recruited 17 graduate students (8 females and 9 males, mean age = 24.67, SD = 2.13) in mechanical engineering (ME) from an R1 university to take part in this study. Because they had completed a bachelor’s degree in engineering and been admitted to a top-25 graduate engineering program, we considered these ME graduate students to be expert engineers. The graduate students received a $20 gift card as compensation for their participation.

Study design

The study employed a between-subjects design with the high-school students being randomly split into two groups at the start of the study. Group A (N = 10) worked on the Gearbox-Assembly Task before the course and the Flashlight-Repair Task after the course, while Group B (N = 9) worked on the Flashlight-Repair Task before and the Gearbox-Assembly Task after (Fig. 1). On the Gearbox task, students in Group A served as the control group for students in Group B, who worked on the Gearbox problem after taking part in the course. On the Flashlight-Repair task, students in Group B served as the control group for students in Group A, who worked on the Flashlight-Repair problem after the course. The mechanical engineering experts worked on both tasks in a single session and did not take part in the course on digital fabrication. The choice to use a between-subjects design was made to avoid test-retest effects; however, this effectively reduced the sample size and the power of the study. The dependent variable in this study was change in problem-solving skills, and the independent variable was participation in the course.

Fig. 1

Study design. This does not show the experts, who worked on the Gearbox and Flashlight-Repair tasks in a single session

Materials

The Gearbox-Assembly Task

The Gearbox-Assembly Task was designed to measure the participants’ ability to assemble a complex geared mechanism with no instructions, images, or information about the completed object. The Gearbox was presented to the students in ten pieces (Fig. 2a). Each of the pieces contained magnets that stuck together when two pieces were correctly assembled. It is important to note that there were many ways of magnetically connecting the pieces that were not correct, while there was only one way of correctly assembling the Gearbox (Fig. 2b).

Fig. 2

The Gearbox-Assembly Task. Participants received the Gearbox in the unassembled state and had 5 min to assemble it with no instructions or images of the final assembly

The Flashlight-Repair Task

The Flashlight-Repair Task was designed to measure the participants’ ability to troubleshoot and repair a faulty device. Two flashlights—one green, one red—were presented to each participant (Fig. 3). The green flashlight was working and the red flashlight was broken. Three errors were present in the red flashlight: one of the batteries was reversed, the electrical contact in the base of the battery was inverted and disconnected, and the bulb was burnt out. The participants were not provided with any information about the broken flashlight, nor were they presented with any additional resources.

Fig. 3

The Flashlight-Repair Task. Students were presented with two flashlights, one working (green) and one broken (red), and were tasked with fixing the broken flashlight. The red flashlight is shown disassembled here for illustration purposes, but both flashlights were fully assembled at the start of the task (Color figure online)

Procedure

The high school students’ work took place in three distinct phases: pre-course, course, post-course. The experts’ work occurred after the end of the high school course and was independent of these phases.

Pre-course

On the first day of the course each high-school student was randomly assigned to work on either the Gearbox-Assembly Task or the Flashlight-Repair Task. The student was seated at a table and the task was placed in front of them. The participant was instructed not to touch the task until the timer was started, at which time they had 5 min to try and solve the task. When time expired, the task was removed from the table, and the student returned to class. Video was collected using a GoPro camera facing the student.

If the student had been assigned to the Gearbox-Assembly Task, they were told that the object in front of them had been disassembled, that it was their job to try and put it back together in 5 min, and that they should try their hardest and not be frustrated if they were unable to solve the puzzle. They were not given any further information about the object (i.e., no instructions on how to assemble the object).

If the student was assigned to the Flashlight-Repair group, the two flashlights were placed in front of them. The participant was told that the red flashlight was not working, and that it was their job to repair it. They were shown how to turn on the green flashlight by twisting the head, which also demonstrated that the green flashlight was working.

Course

This course took place in a makerspace on a university campus and was facilitated by the students’ physics teacher and the lab managers. The high-school students visited the makerspace twice a week for roughly 1 h per visit. In total, students spent between 30 and 40 h on the course.

Students worked on two multi-week design projects that, aside from certain specified goals, allowed for creative freedom. The students worked on the first project, the Omni-Animal, during the first 2 months of the course. Administrative issues then resulted in a 2-month break, after which the students returned to the makerspace to work on the second project, a Rube Goldberg machine, for an additional 2 months.

The Omni-Animal project required students to design a three-dimensional creature out of multiple two-dimensional pieces. Outside of the requirement to use four types of specified connectors, students had full creative freedom. This project was designed to give students experience using two-dimensional vector-drawing software (CorelDRAW) to create a multi-part, three-dimensional construction. Students received a template with vector drawings of the required connectors (see Fig. 4a), an example Omni-Animal that had been cut out of wood and assembled, and direct instruction on using CorelDRAW.

Fig. 4

Design files for the Omni-Animal project

The second project was the creation of a Rube Goldberg machine, a mechanical contraption that uses a complicated series of interactions to perform a simple task. This project was designed to give students experience with designing a multi-component, mechanistic system through a collaborative engineering design process. The students were broken into small groups, and each group was tasked with designing one component in a Rube Goldberg machine. Groups received constraints having to do with the start and end actions of their component (e.g., be activated by heat and trigger the following stage with a loud noise). The groups were required to collaborate with one another in order to ensure that the components they were designing would interact properly. Aside from satisfying these constraints, the groups were given full creative freedom. Like the Omni-Animal project, the Rube Goldberg project was intended to guide students through multiple iterations in the design cycle, and required students to make connections between the function, behavior, and structure of their stage in the machine.

Post-course

In the final week of the course, the high-school students were asked to leave class for a short period of time to work on a hands-on problem. If the student had been assigned to the Gearbox-Assembly Task in the pre-course phase, that student worked on the Flashlight-Repair Task in the post-course phase. Similarly, if the student had worked on the Flashlight-Repair Task in the pre-course phase, they worked on the Gearbox-Assembly Task in the post-course phase. Like the pre-course phase, each student was seated at a table with the task in front of them, and after time expired the student returned to class. During the task, video was collected using a GoPro camera facing the student.

Experts

After the course had concluded, 17 graduate students in mechanical engineering (experts) were recruited to work on the hands-on tasks. Each expert was seated at a table and tasked with working on both the Gearbox-Assembly Task and the Flashlight-Repair Task. The order of the tasks was randomized, and each expert had 5 min for each task. Video of their activity during both tasks was recorded using a GoPro camera.

Video coding schemes

We had two objectives which resulted in the development of two distinct video coding schemes. First, we were interested in evaluating how close each participant came to the correct solution on each task. The Correct-Combination Coding Scheme was developed for this purpose. Second, we were interested in each participant’s sequence of actions and the corresponding sequence of problem states. We designed the Actions-in-Time Coding Scheme for this purpose.

Correct combinations

In order to meaningfully compare the participants on each task, it was necessary to develop a metric that accurately measured how close each participant came to the correct solution. This provided more information about each participant’s progress than a binary complete/incomplete coding scheme. While completing each task required a different number of correct part combinations—11 for the Gearbox and 4 for the Flashlight—a similar coding procedure was separately followed for each task. Each correct part combination was assigned a distinct code. If the participant performed an action that matched one of the codes, they received 1 point. If the participant’s action matched a code partially, they received a half-point (0.5). No time information was recorded, so two participants who carried out the same actions in different orders would receive the same score. The scores for each combination were summed within each task to create a single index of how close that participant came to solving that task. A score of 0 meant the participant made no progress on the problem, while a score of 11 for the Gearbox or 4 for the Flashlight meant the participant completely solved the problem. The higher the score, the closer to finishing successfully.
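The scoring rule described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic (1 point per full match, 0.5 per partial match, summed within a task, with order ignored); the code labels `C1` to `C11` and the function name are hypothetical and are not taken from the study's coding materials.

```python
# Hypothetical labels for the 11 correct Gearbox part combinations.
GEARBOX_CODES = [f"C{i}" for i in range(1, 12)]

# Credit awarded for a full or a partial match, as described in the text.
CREDIT = {"full": 1.0, "partial": 0.5}


def correct_combination_score(observed, valid_codes):
    """Sum the credit over the combination codes matched by one participant.

    `observed` maps a combination code to "full" or "partial". Codes the
    participant never matched are simply absent. No timing or ordering
    information is used, so the same actions in a different order give
    the same score.
    """
    valid = set(valid_codes)
    total = 0.0
    for code, match in observed.items():
        if code not in valid:
            raise ValueError(f"unknown combination code: {code}")
        total += CREDIT[match]
    return total


# Two full matches and one partial match.
example = {"C1": "full", "C4": "partial", "C7": "full"}
print(correct_combination_score(example, GEARBOX_CODES))
```

Under these assumptions, a participant who fully matched every Gearbox code would receive the maximum score of 11, and a participant who matched none would receive 0.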

Actions in time

In addition to comparing how close participants came to each solution, we were also interested in comparing differences in the participants’ sequences of actions. We developed two time-based coding schemes for this purpose, one for the Gearbox problem and one for the Flashlight problem.

The Gearbox-Assembly Actions-in-Time coding scheme was developed to categorize and track the different types of actions participants carried out while attempting to solve the Gearbox problem. The final coding scheme contained 16 codes, each representing a type of action that a participant could take (Table 1).

Table 1 Actions-in-time coding scheme for the Gearbox problem

Similarly, the Flashlight-Repair Actions-in-Time coding scheme tracked each participant’s actions as they worked through the problem. The coding scheme also contained 1–3 prefixes to: (1) keep track of which flashlight the participant was working on—functioning (G–green) or broken (R–red) and (2) denote which specific component(s) the participant was focusing on. The resulting codes took the form of: <flashlight> <component> <component> <action code>. See Table 2 for an example of how this coding scheme was applied to one participant’s series of actions in the Flashlight problem.

Table 2 Example of applying the actions-in-time coding scheme on the flashlight problem

We designed custom software to streamline the process of coding the videos. Each time the participant carried out an action, the appropriate code was entered and linked to the video using a timestamp. After coding a participant’s video, we were left with a full sequence of the participant’s actions during the problem. This process transformed video of the participants’ actions into a time-stamped sequence of codes.
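As a rough illustration of what a time-stamped sequence of codes might look like as data, the following Python sketch parses Flashlight codes of the form `<flashlight> <component> <component> <action code>` and orders them by timestamp. The component names, action codes, and all identifiers here are hypothetical; the study's custom software and its actual code vocabulary are not described in enough detail to reproduce.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CodedAction:
    timestamp_s: float   # seconds from the start of the video
    flashlight: str      # "G" (working, green) or "R" (broken, red)
    components: tuple    # zero, one, or two component prefixes
    action: str          # the action code itself


def parse_flashlight_code(timestamp_s, raw):
    """Split one coded entry into its prefixes and its action code."""
    tokens = raw.split()
    # One flashlight prefix, up to two component prefixes, one action code.
    if not 2 <= len(tokens) <= 4:
        raise ValueError(f"expected 2 to 4 tokens, got {len(tokens)}: {raw!r}")
    flashlight, *components, action = tokens
    if flashlight not in {"G", "R"}:
        raise ValueError(f"unknown flashlight prefix: {flashlight!r}")
    return CodedAction(timestamp_s, flashlight, tuple(components), action)


def to_sequence(actions):
    """Return the raw action codes in the order they occurred."""
    return [a.action for a in sorted(actions, key=lambda a: a.timestamp_s)]


# Hypothetical entries, deliberately listed out of order.
entries = [
    parse_flashlight_code(31.0, "R battery cap REMOVE"),
    parse_flashlight_code(4.5, "G TWIST"),
    parse_flashlight_code(12.0, "R head UNSCREW"),
]
print(to_sequence(entries))
```

Keeping the timestamp alongside each code preserves the link back to the video, while the ordered list of codes is the form that sequence-comparison methods typically operate on.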

Findings

Proximity to solution

The Gearbox-Assembly Task proved to be particularly challenging for the participants in this study. Few were able to completely solve the problem, regardless of their level of expertise: Only 2 out of 17 experts solved the problem, and none of the high-school students were able to solve it.

However, by counting the number of correct combinations that each participant carried out, we were able to measure how close each participant got to finding the correct solution. 11 unique part combinations were required to solve the Gearbox-Assembly Task. The maximum score a participant could receive was an 11, and the minimum score was zero.

The post-course students (\(M=3.69, SD=1.85\)) got significantly closer to the solution than pre-course students (\(M=1.94, SD=1.15\)), \(t(11.69) = -2.27, p < .05\), Cohen’s \(d = 1.14\). Additionally, the experts (\(M=6.82, SD=2.44\)) outperformed post-course students, \(t(17.88)=-3.56, p<.01\), Cohen’s \(d = 1.38\) (Fig. 5).

Fig. 5

Correct combinations on the Gearbox-Assembly Task split between pre-course students, post-course students, and experts. The minimum possible score was 0 and maximum possible score was 11
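The fractional degrees of freedom in the comparisons above are consistent with Welch's unequal-variance t-test, although the text does not name the test used. The sketch below shows how the t statistic, the Welch-Satterthwaite degrees of freedom, and a pooled-SD Cohen's d could be computed under that assumption. The function names and the scores are hypothetical and are not the study's data.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance two-sample t-test."""
    va = variance(a) / len(a)  # squared standard error of mean(a)
    vb = variance(b) / len(b)  # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


def cohens_d(a, b):
    """Cohen's d using the pooled standard deviation (one common variant)."""
    na, nb = len(a), len(b)
    pooled = sqrt(((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2))
    return (mean(a) - mean(b)) / pooled


# Hypothetical correct-combination scores (0-11), for illustration only.
pre_course = [1, 2, 2, 3]
post_course = [3, 4, 5, 4]

t, df = welch_t(pre_course, post_course)
d = cohens_d(pre_course, post_course)
```

With the groups passed in this order, a negative t indicates that the second group scored higher, which matches the sign convention of the pre-course versus post-course comparison reported above.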

The Flashlight-Repair Task was not as difficult as the Gearbox problem, but it still posed a significant challenge to the participants. Nine of the 17 experts solved the flashlight-repair problem, as did 1 of the 10 post-course students and none of the 9 pre-course students.

In this problem, the broken flashlight had three sources of error: the batteries were inserted incorrectly, the spring contact in the cap was upside-down and failed to close the circuit, and the bulb was burned out. Successfully completing the task required repairing all three sources of error. By counting the number of errors corrected by each participant, it was possible to construct an index to measure distance to the solution. The minimum score a participant could receive was a zero (no sources of error corrected), and the maximum score was a three (all sources of error corrected).

The experts (\(M=2.47, SD=0.62\)) got significantly closer to the solution than the post-course students (\(M=1.9, SD=0.57\)), \(t(20.52)=2.43, p<.05\), Cohen’s \(d = 0.94\). Additionally, the post-course students came closer to the solution than the pre-course students (\(M=1.22, SD=0.97\)), although this difference was only marginally significant, \(t(12.61)=1.83, p < .1\), Cohen’s \(d = 0.86\) (Fig. 6).

Fig. 6

Closeness to flashlight solution

Grouping problem-solving strategies

To better understand whether different approaches to solving the problems could explain the differences in performance, we used an unsupervised clustering method to identify groups of participants with similar approaches to the problems. For each problem we computed the edit distance between all participants’ Actions-in-Time sequences using TraMineR’s optimal matching algorithm (Gabadinho et al., 2011), which allowed us to construct a symmetric distance matrix that captured the dissimilarity between all pairs of participants. After constructing this matrix, we used agglomerative hierarchical clustering (Maechler, 2018) to identify groups of participants who were most similar to each other.
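The two steps described above can be sketched in a few lines. The study used R packages (TraMineR and the package cited as Maechler, 2018). The Python below is a simplified stand-in, not a reproduction of that pipeline: the insertion/deletion cost of 1, the constant substitution cost of 2, the average-linkage rule, and the example sequences are all assumptions chosen for illustration.

```python
def om_distance(a, b, indel=1.0, sub=2.0):
    """Edit distance between two action sequences (assumed constant costs)."""
    prev = [j * indel for j in range(len(b) + 1)]
    for i, x in enumerate(a, 1):
        cur = [i * indel]
        for j, y in enumerate(b, 1):
            cost = 0.0 if x == y else sub
            cur.append(min(prev[j] + indel,      # delete x
                           cur[j - 1] + indel,   # insert y
                           prev[j - 1] + cost))  # match or substitute
        prev = cur
    return prev[-1]


def distance_matrix(seqs):
    """Symmetric matrix of pairwise distances between all sequences."""
    n = len(seqs)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d[i][j] = d[j][i] = om_distance(seqs[i], seqs[j])
    return d


def agglomerative(d, k):
    """Merge the closest clusters (average linkage) until k remain."""
    clusters = [[i] for i in range(len(d))]

    def avg(c1, c2):
        return sum(d[i][j] for i in c1 for j in c2) / (len(c1) * len(c2))

    while len(clusters) > k:
        _, p, q = min((avg(clusters[p], clusters[q]), p, q)
                      for p in range(len(clusters))
                      for q in range(p + 1, len(clusters)))
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return clusters


# Hypothetical action sequences built from the Gearbox action codes.
seqs = [
    ["mesh", "rot", "axle", "mag"],
    ["mesh", "axle", "rot", "mag"],
    ["plas", "magx", "plas", "plas"],
    ["plas", "plas", "magx", "magx"],
]
clusters = agglomerative(distance_matrix(seqs), k=2)
```

In this toy example the two sequences dominated by mesh, rot, axle, and mag end up in one cluster and the two dominated by plas and magx in the other. Different cost settings or linkage rules can change the grouping, so those choices should be reported alongside the results.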

Problem-solving approaches on the Gearbox Assembly Task

On the Gearbox task, we identified two distinct approaches to tackling the problem. The first approach was adopted by 16 experts and 3 post-course students. The second approach was used by all 7 pre-course students, 5 post-course students, and 1 expert (Fig. 7; see Footnote 1). Hereafter, we refer to the first approach as the expert approach (as it contains 94% of the experts) and the second approach as the novice approach (as it contains 100% of the pre-course high-school students).

Fig. 7

Makeup of the two groups found using hierarchical agglomerative clustering on the Gearbox problem

To better understand the nature of each approach, we visualized the proportion of actions within each cluster and identified a number of differences (Fig. 8). First, we identified four actions that the expert cluster performed at a higher frequency than the novice cluster: meshing gears (mesh), rotating pieces (rot), mounting axles (axle), and making correct magnetic connections (mag). In subsequent analysis we call these four actions “mechanical actions”. Second, we identified two actions that the expert cluster performed at a lower frequency than the novices: incorrect plastic connections (plas) and incorrect magnetic connections (magx). We call these “structural actions” in subsequent analysis.

Fig. 8

Proportion of actions for each cluster. Note the higher proportion of axle-related actions (green), meshing gears (dark purple), and rotation (fuchsia) in the expert cluster, and the higher proportion of incorrect magnetic connections (sky blue) and incorrect plastic connections (beige) in the novice cluster (Color figure online)

We created a mechanical-action index for each participant by dividing the sum of the four mechanical actions by the total number of actions for each participant. We performed two two-tailed t-tests to compare the differences in mechanical-action frequency between the pre-course students and the post-course students, as well as the differences between the post-course students and the experts. The post-course students (\(M=0.24, SD=0.14\)) performed significantly more mechanical actions than pre-course students (\(M=0.12, SD=0.06\)), \(t(11.69)=-2.27, p<.05\), Cohen’s \(d=1.14\). The experts (\(M=0.38, SD=0.10\)) performed significantly more mechanical actions than post-course students (\(M=0.24, SD=0.14\)), \(t(17.87)=-3.56, p<.01\), Cohen’s \(d=1.38\) (Fig. 9).

Fig. 9

Proportion of mechanical actions taken by pre-course students, post-course students, and experts
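The mechanical-action index is a simple proportion, and a minimal sketch makes the definition concrete. The action codes mesh, rot, axle, mag, plas, and magx come from the coding scheme described above. The function name, the handling of empty sequences, and the example sequence are assumptions for illustration.

```python
# The four actions grouped as "mechanical actions" in the analysis.
MECHANICAL = {"mesh", "rot", "axle", "mag"}


def mechanical_index(actions):
    """Proportion of a participant's actions that are mechanical actions."""
    if not actions:
        return 0.0  # assumed convention for an empty sequence
    return sum(a in MECHANICAL for a in actions) / len(actions)


# Hypothetical sequence: two mechanical actions out of four in total.
example = ["mesh", "plas", "rot", "magx"]
index = mechanical_index(example)
```

Because the index is normalized by each participant's total number of actions, it compares the composition of what participants did rather than how much they did in the available time.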

There is one action in particular that we want to highlight: meshing gears (mesh). Notably, none of the pre-course students meshed any of the gears during the 5-min task, despite the fact that five of the ten pieces included gears.

We compared the proportion of mesh actions between pre-course students, post-course students, and experts using two two-tailed t-tests. The post-course students (\(M=0.02, SD=0.02\)) performed significantly more mesh actions than pre-course students (\(M=0.0, SD=0.0\)), \(t(7)=-3.5, p<.001\), Cohen’s \(d=1.69\). The experts (\(M=0.08, SD=0.05\)) performed significantly more mesh actions than post-course students (\(M=0.02, SD=0.02\)), \(t(21.34)=-4.88, p<.001\), Cohen’s \(d=1.53\).

The final analysis examined whether the problem-solving approach was related to performance on the problem. To determine this, a correlation analysis was done to compare the proportion of mechanical actions to closeness to the solution. These two measures were significantly correlated, \(r(33)=0.82, p<.001, r^{2}=0.67\), indicating that the proportion of mechanical actions taken by a participant was a good predictor of how close they would come to solving the problem (Fig. 10).

Fig. 10

Comparison of scores on the Gearbox problem to the proportion of mechanical actions for each participant. These were significantly correlated, \(r(33)=0.82, p<.001, r^{2}=0.67\)
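For readers less familiar with the statistic, the sketch below computes a Pearson correlation coefficient from paired observations. The reported \(r^{2}\) is simply the square of \(r\) (for example, \(0.82^{2} \approx 0.67\)). The function name and the data are hypothetical and are not the study's measurements.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical pairs: mechanical-action index and Gearbox score.
mech_index = [0.1, 0.2, 0.3, 0.4]
scores = [2, 3, 5, 6]

r = pearson_r(mech_index, scores)
r_squared = r ** 2  # share of variance in one measure accounted for by the other
```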

Problem-solving approaches on the Flashlight Repair Task

On the Flashlight task, a cluster analysis identified two distinct approaches to the problem. The first approach was adopted by 13 experts, 8 post-course students, and 3 pre-course students, and the second approach was adopted by 4 experts, 1 post-course student, and 6 pre-course students (Fig. 11; see Footnote 2). We refer to the first approach as the expert approach since it was adopted by 76% of the experts, and we refer to the second approach as the novice approach since it was adopted by 67% of the pre-course students.

Fig. 11

Makeup of the two groups found using hierarchical agglomerative clustering on the Flashlight-Repair task

In order to understand the differences between approaches, we visualized the interaction histories within each cluster across the entire task (Fig. 12). The expert histogram showed that the experts interacted with a more-uniform set of components, with more attention being paid to the batteries, cap, head, and bulb than the reflector, replacement bulb, and spring. This stood in contrast to the novice histogram, where the majority of interaction was weighted on a small number of components—the cap, batteries, and spring—with very little attention paid to the other components.

A second, longitudinal plot of interaction with components over the course of the task (Fig. 13) provided more insight into this difference. Throughout the task, the expert cluster fluidly shifted their attention across components, presumably searching and testing for sources of error. In contrast, the novice cluster became increasingly fixated on a single source of error: the cap and spring. Additionally, the novice cluster paid little attention to the bulb, suggesting that they had failed to consider it as a potential source of error.

Fig. 12

Cross-sectional plot of time spent attending to different components during the Flashlight-Repair Task

Fig. 13

Longitudinal plot of attention paid to different components during the Flashlight-Repair Task

This analysis suggested that a primary difference between the two problem-solving approaches might be related to the number of components interacted with. To test this, we created an index to measure breadth of interaction by counting the number of components each participant interacted with during the problem, where the maximum number of components a participant could interact with was 14 (7 in each of the flashlights). Post-course students (\(M=8.7, SD=1.34\)) interacted with a significantly higher number of components than pre-course students (\(M=6.56, SD=1.01\)), \(t(16.56)=3.96, p<.01\). However, experts (\(M=9.24, SD=1.30\)) did not interact with a significantly larger number of components than post-course students, \(t(18.56)=-1.01, p<.33\).
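The breadth-of-interaction index can be sketched as a count of distinct flashlight and component pairs, which gives the stated maximum of 14 when all 7 components are touched in both flashlights. The tuple representation of a coded action and the function name are assumptions for illustration. The component names are those mentioned in the description of the interaction histories.

```python
def breadth_index(actions):
    """Count distinct (flashlight, component) pairs a participant touched."""
    seen = set()
    for flashlight, components in actions:
        for component in components:
            seen.add((flashlight, component))
    return len(seen)


# Hypothetical coded actions: (flashlight, components involved).
# "R" is the broken flashlight and "G" the functioning one.
actions = [
    ("R", ("cap", "spring")),
    ("R", ("cap",)),   # repeat visit, not counted again
    ("G", ("cap",)),   # same component in the other flashlight counts separately
    ("R", ("bulb",)),
]
breadth = breadth_index(actions)
```

Counting a component once regardless of how often it is revisited is what separates this measure from the total number of actions: a participant who repeatedly examines the cap and spring accumulates many actions but a low breadth score.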

The final analysis examined the relationship between the problem-solving approach and performance on the problem. The participants’ scores on the Flashlight-Repair problem were compared to the number of components each participant interacted with. These were significantly correlated, \(r(36)=0.46, p<.01, r^{2}=0.21\), indicating that the number of components a participant interacted with was a moderate predictor of how close they came to solving the problem (Fig. 14).

Fig. 14

Comparison of scores on the Flashlight-Repair task to number of components interacted with. These were significantly correlated, \(r(36)=0.46, p<.01, r^{2}=0.21\)

Discussion

This study was designed to learn more about how taking part in a year-long digital-fabrication course could affect high-school seniors’ problem-solving skills. We found that after taking part in the course, students were significantly better at solving a set of hands-on, mechanistic problems, with the post-course students making significantly more progress towards the solutions than the pre-course students. Additionally, by examining the process data we were able to identify and characterize two distinct problem-solving approaches for each problem, one adopted primarily by experts (the expert approach) and one adopted primarily by pre-course students (the novice approach). We found that post-course students were significantly more likely to adopt the expert approaches than pre-course students, providing evidence that participation in the course made them more like expert engineers. Furthermore, we found that each of the expert approaches was strongly associated with better performance on each of the problems: the higher the proportion of mechanical actions a participant took during the Gearbox task, the closer to the solution they came; and on the Flashlight-Repair problem, the more components that a participant interacted with, the closer they came to the solution.

Despite the fact that the high-school students did not learn about or work with geared mechanisms or electrical devices in the course, they still performed significantly better on both the Gearbox-Assembly Task and Flashlight-Repair task after taking part in the course. This suggested that the students had experienced a change during the course that affected their ability to solve a class of problems involving mechanistic systems. To understand the nature of this change we examined the similarities and differences between the expert and novice approaches to each problem.

On the Gearbox problem, experts performed a higher proportion of mechanical actions, such as meshing gears, while novices (i.e., pre-course students) performed a higher proportion of structural actions, such as stacking pieces on top of one another. The post-course students were significantly more likely to perform mechanical actions than the pre-course students, indicating that participation in the course made the high-school students more like experts in this regard. A particularly striking finding was that none of the pre-course students meshed the gears during the 5-min task, despite the fact that 5 of the 10 components were gears, while 7 of the 8 post-course students performed this action. On the Flashlight-Repair problem, the experts were more likely to interact with all of the components in the flashlight that could have been potential sources of error, fluidly shifting their attention across components. In contrast, the pre-course students became increasingly fixated on a single source of error—the cap and spring—while failing to attend to the burned-out bulb.

In both cases, it was as if the post-course students were better able to “see” the various components and their ways of interacting than the pre-course students, and this way of “seeing” made them more like expert engineers. On the Gearbox problem, focusing on mechanical relationships had the practical effect of restricting the problem space by reducing the number of free parts and ways of combining them, while simultaneously producing a more coherent understanding of the object-to-be-constructed. On the Flashlight problem, focusing on the mechanical (i.e., causal) relationships between components ensured that all of the sources of error were inspected, including the bulb, and helped participants avoid getting stuck repeatedly examining a single component. In both cases, the ability to “see” the mechanistic relationships between components had the practical effect of guiding the search for a solution down more productive avenues.

This finding is consistent with nearly a century of research on problem solving that has linked the ability to solve problems with perceptual acumen. Gestalt psychologists, cognitive scientists, and learning scientists have all identified ways that experts are able to better solve problems by perceiving things that novices do not, attributing their perceptual differences to their profound and well-structured domain knowledge (Chase & Simon, 1973; Chi et al., 1981; Duncker, 1939; Luchins, 1942).

However, these established explanations do not fully explain our findings. The hands-on problems that were used to assess problem-solving skills were specifically chosen because they involved mechanistic systems that students did not gain experience with during the course. Thus, improvement in students’ ability to solve these problems cannot be attributed to a change in domain knowledge, suggesting that other factors must be responsible for this change.

Theories of professional vision (Goodwin, 1994) and disciplined perception (Stevens & Hall, 1998) provide an alternative explanation that better agrees with our findings. Both theories hold that these new ways of “seeing” are learned through participation in authentic, situated social activities, where experts help novices learn to see in new ways. Thus, seeing is not merely a mental or perceptual process, but a socially organized activity “accomplished through the deployment of a range of historically constituted discursive practices” (Goodwin, 1994, p. 606). Fostering this “vision” is not solely about imparting domain knowledge but also about immersing students in authentic experiences where they can socially cultivate and refine their perception.

Theories of situated cognition argue that learning is tightly bound to the activities and contexts in which it takes place, and that traditional classroom environments and ways of learning are too different from everyday life to provide useful, robust knowledge (Brown et al., 1989). A more ideal learning environment is one where people can work together in authentic activities (i.e., the ordinary practices of a culture). While makerspaces may not be perfectly authentic learning environments, they support types of activities that are more aligned with theories of situated cognition than other environments in K-12 institutions. Students typically have the autonomy to choose their projects, define their roles in group work, and decide their daily tasks. The student–teacher dynamic is often fundamentally altered, with teachers and students collaborating on problems that do not have an obvious solution. The iterative design approach mirrors methods used by experts, and the tools that students work with are practically identical to those used in engineering workshops, companies, and factories.

Thus, the primary value of incorporating makerspaces in schools may be that they offer students a situated, authentic learning environment, conducive to developing genuine, applicable knowledge and skills for non-academic settings. And given the unique nature of makerspaces within the K-12 landscape, they stand out as a fertile ground for enabling the cultivation of new ways of “seeing” mechanistic systems and problems that are more like those of experts. The ability to perceive and work on these types of problems is not just theoretically significant but has practical implications. Not only are these types of problems commonly encountered in STEM domains, but they are also encountered in everyday situations, such as when a household appliance breaks or when one needs to assemble a piece of furniture or a children’s toy. Thus, educational makerspaces may not only prepare students for future studies in STEM disciplines, but may also empower students with problem-solving capabilities invaluable in everyday life, bridging the gap between academic learning and real-world applicability.

Limitations and next steps

The analyses of the process data were valuable in identifying and characterizing the distinct approaches taken to work on the hands-on problems. However, because we did not assess the students’ knowledge of the mechanisms used in these problems, it was not possible to conclude with full certainty whether the differences in approaches were due to differences in knowledge, skill, or some combination of both. One way of determining this would be to use a think-aloud protocol during the problems. This method would not only make it possible to determine each participant’s familiarity and prior knowledge about the problem, but it would also provide more insight into the problem-solving strategies that each participant was using. In a future study, using a think-aloud protocol to compare experts and novices on the same set of hands-on tasks could provide deeper insight into how the differences in actions reflected differences in strategy.

A second limitation was that this study used a between-subjects design to assess changes in problem-solving skills. This choice was made to avoid test–retest effects; however, it effectively reduced the sample size and the power of the study, leaving open the possibility that the effects were not due to changes in problem-solving skill, but simply due to an uneven distribution of students within each group (e.g., the students with prior electrical knowledge ended up in one group, and the students with prior knowledge of gears ended up in another). A within-subjects design would have avoided these problems, and would have also made it possible to investigate the effects of the course on individual students, as opposed to only being able to examine group effects.

Even a within-subjects design would not be able to account for the possibility that observed changes were due to something that occurred outside the course. This would only be possible with the use of a control group who did not participate in the course, which the current study did not use. Because of this, there is a possibility that the changes we observed were due to other experiences that the high-school students had over the course of the school year. For example, there is the possibility that the students learned about circuits or geared mechanisms in another course, which could explain the differences found in this study. Use of a control group sampled from the same population would make it possible to establish a causal link between participation in the makerspace and the development of problem-solving skills.