1 Introduction

In recent years, new interaction styles have emerged, which aim at leveraging human skills in interaction with technology. Many studies (e.g., [30, 32]) in the area of both embodied interaction [12] and tangible user interfaces [39] have revealed potential benefits of such interaction styles for learning and development (also see [31]). O’Malley and Stanton-Fraser [32] state that tangible systems encourage discovery and participation. Zuckerman et al. [42] developed multiple tangible learning systems, promoting self-reflection when learning in abstract problem domains. Antle [4] states that embodied interaction engages children in active learning, which can support cognitive development.

In a previous study [6], we explored a whole-body interaction learning system whose interaction model was based on embodied metaphors: the mapping between action and output relied on metaphorical extensions of embodied schemata, which are cognitive structures that are applied unconsciously in learning. This study indicated that implementing such embodied metaphors may enable children to reason about abstract concepts in an interactive environment by applying embodied knowledge formed through early experiences in the physical world.

Incorporating embodied metaphors in learning systems therefore seems promising. However, to effectively support learning through embodied metaphors, successful interaction design is crucial. Not only must one identify the embodied metaphors children use in their understanding of the targeted abstract concepts, these metaphors must also be effectively translated into interaction models and incorporated in interactive systems.

In this paper, we introduce a people-centered, iterative approach to the design of interactive learning systems with embodied metaphor-based mappings. This approach consists of five phases and mainly relies on user involvement during the design research process. As a design case, we implement this approach in the design of Moving Sounds (MoSo) Tangibles, a tangible system for learning abstract sound concepts, and report on the different phases of the design process. Although the design process and the results of this design case are closely connected, the main focus of this paper is describing the approach rather than the specific results. The design case we present builds on and extends previous work [6] by focusing on tangible interaction rather than whole-body movement. This also enables us to discuss design considerations for both interaction styles regarding learning systems. First, we will look into the theoretical background of embodied metaphors and learning theories as well as related systems in the area of tangible and embodied interaction.

2 Theoretical background

2.1 Embodied metaphors

As became evident in the previous Sound Maker study [6], learning may benefit from interaction models based on embodied metaphors. This is grounded in theory suggesting that the cognitive structures of higher-order thinking emerge from recurrent patterns of bodily or sensori-motor experience [34]. Such recurrent patterns in bodily experiences are also referred to as image or embodied schemata [25]. An example of an embodied schema is the in–out schema (see Fig. 1). From the day we are born, we have numerous physical experiences related to in and out: we put food into our mouth, pour milk out of a bottle, go into a room, etc. All these experiences share the same structure: a container and a movement into or out of this container. This basic structure forms the embodied schema in–out.

Fig. 1 The relation between bodily experiences, embodied schemata, and embodied metaphors

Such embodied schemata are used to reason about abstract domains. For example, when we say “I am in love,” we (unconsciously) apply the embodied schema in–out to structure our understanding of the abstract concept love, viewing love as a container and ourselves as an entity being in or out of this container. This human ability to project the structure of bodily originating schemata onto a conceptual domain is what is meant by metaphor [28]. A metaphor allows us to understand or experience one concept (target domain) in terms of another (source domain). When the source domain involves schemata that have arisen from bodily experiences, we call these schemata embodied schemata and the resulting metaphors embodied metaphors.

2.2 Learning theories

Since the beginning of the twentieth century, several psychologists have studied the use of physical objects for learning and development. Vygotsky and Galperin, for example, state that “mental acts originate in material acts” [33, p. 21] (quote translated from Dutch). Bruner [11] has shown that physical objects play a major role in bridging the abstract and the concrete. Both theories [11, 33] underline the importance of combining experience and reflection. This happens, for example, when learning about addition with an abacus. In such a case, a child will first start sliding the beads (experience), after which he or she will look at the results and notice the beads being regrouped (reflection). Learning and knowledge acquisition (e.g., gaining a symbolic understanding of the concept addition) take place through frequent shifts between experience and reflection [2].

Similar to the role of physical objects in mathematics education is the role of body movement in the process of learning and understanding abstract concepts related to musical sound [24, 27], which are the focus of the interactive learning system developed in our design case. Wessel [40] for example emphasizes that rich sensory-motor engagement enhances the experience of music. Based on children’s early experiences combining movement and sound perception, Juntunen and Hyvönen [27] suggest a metaphorical link between body movement and abstract sound concepts such as pitch or volume. This link relies on embodied metaphors, which enable children to understand abstract (sound) concepts in terms of concrete (embodied) concepts. For example, children can understand the concept volume (soft versus loud) in terms of concrete, movement-related concepts (for example, slow versus fast or up versus down). Various movement-related metaphors are used in music education [24, 27], bridging the physical to the abstract. This also enables shifting between experience and reflection, which forms the basis for knowledge acquisition.

3 Related work

3.1 Interactive learning systems

As the field matures, a growing number of studies are exploring the design and evaluation of tangible and embodied interaction to facilitate learning and development (e.g., [29, 32]). An early example of a tangible interface developed for learning was introduced by Resnick et al. [35]. They presented “digital manipulatives,” computationally enhanced toys that enable children to explore scientific concepts in a playful manner. Several other examples of tangible or embodied interaction for learning focus on learning of abstract (mainly mathematical) principles. Zuckerman et al. [42] describe “Montessori-inspired Manipulatives” (MiMs); technology enhanced building blocks that enable children to physically explore abstract concepts. An example of a MiM is “SystemBlocks”; building blocks that simulate system dynamics. Hashagen et al. [18] present “Der Schwarm,” a full-body interaction environment enabling children to learn about swarm or flock behavior. Horn and Jacob [19] present “tern,” a tangible system consisting of jigsaw puzzle like artifacts used to create simple computer programs. Girouard et al. [16] describe SmartBlocks, a tangible interface designed for exploring the volume and surface area of 3D objects. As underlined in several studies (e.g., [4, 29]), tangible interfaces seem particularly valuable for learning in abstract problem domains by relating abstract concepts to physical experiences or concrete examples.

Tangible and embodied interaction is also a frequently explored interaction style for manipulating sound and music (e.g., [26, 36]). Some of these systems target children, such as “Marble Track Audio Manipulator” [9], a tangible system for creating musical compositions, and “Pendaphonics” [17], a large-scale tangible interface usable as a musical instrument and performance tool. Body Beats [41] uses whole-body interaction to help children recognize and create sound patterns. Birchfield et al. [10] presented SMALLab, a whole-body interactive environment that can be used for several educational purposes, including movement and sound teaching. The previously mentioned Sound Maker system [6] was designed to study the benefits of embodied metaphor-based mappings in interactive environments for children. The present paper extends this work by introducing a design process for the development of embodied metaphor-based (learning) systems, as well as by presenting a design case focusing on tangible interaction rather than whole-body movement.

3.2 Approaches to designing embodied metaphor-based interactions

The aim of the study described in this paper is to explore an iterative design approach to the design of learning systems with embodied metaphor-based interaction models. An interaction model specifies the mappings between input action and output response. Although metaphorical mappings are promoted by several others (e.g., [22, 23, 37]), little is known about how to identify embodied metaphors, or how to implement them effectively into interaction models for new systems.
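To make the notion of an interaction model concrete, the following minimal sketch represents one mapping from an input action to an output response. It is an illustration under stated assumptions, not the implementation of any system discussed here: the class name `Mapping`, the normalized movement value between 0 and 1, the linear scaling, and the tempo range of 60 to 180 beats per minute are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mapping:
    """One metaphor-based mapping: a movement feature drives a sound parameter."""

    schema: str         # embodied schema the mapping extends, e.g. "slow-fast"
    sound_concept: str  # targeted abstract concept, e.g. "tempo"
    out_min: float      # sound value at the low end of the movement range
    out_max: float      # sound value at the high end of the movement range

    def respond(self, movement: float) -> float:
        """Map a normalized movement feature (0..1) linearly to a sound value."""
        clamped = min(1.0, max(0.0, movement))  # ignore out-of-range sensor input
        return self.out_min + clamped * (self.out_max - self.out_min)


# Hypothetical example: movement speed controls tempo in beats per minute.
tempo_by_speed = Mapping("slow-fast", "tempo", out_min=60.0, out_max=180.0)
print(tempo_by_speed.respond(0.5))
```

An interaction model in this sense would be a collection of such mappings; a system offering several metaphors for the same concept would simply hold more than one `Mapping` with the same `sound_concept`.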

Fels et al. [14] use metaphors in their interface design for musical expression. However, the motivation for choosing particular metaphors is not mentioned. In their design of intuitive interactions, Hurtienne et al. [22] suggest relying on metaphors that are already documented, or using a systematic user-centered design process, in which metaphors are identified through contextual interviews. This latter method seems sufficient when existing interactions are redesigned. However, the design of new (metaphorical) interactions is required in many cases, such as when aiming at activities for which currently no interactive systems exist. In such cases, an analysis of current interactions is not possible. For interactive learning systems, it is furthermore crucial to identify the embodied metaphors that underlie how we structure and reason about the targeted abstract concepts (i.e., identify metaphors that are used to “make sense”). Relying on documented metaphors can help ensure a bodily basis for the chosen mapping; however, choosing the most suitable metaphor for a learning system seems difficult. Though some design knowledge is derived from example metaphor-based systems [5], no literature on specific approaches to the design of interaction models based on these metaphors is known to the authors.

In the Sound Maker [6], embodied metaphors were identified in consultation with choreography experts and elicited through workshop style pilot studies before they were implemented in a whole-body interaction model. One of the findings of the studies with this system was that the “discoverability” of the mappings, which is the likelihood of participants discovering a mapping by chance, turned out to play a major role in learning to use the system [6]. Clearly, both selecting the right metaphors and implementing them effectively are key to the successful design of metaphor-based learning systems.

4 Design approach

Although incorporating embodied metaphors in learning systems seems promising, effective interaction design is crucial to the potential success of such systems. However, identifying the embodied metaphors children use in their understanding of the targeted abstract concepts as well as translating them effectively into interaction models is not straightforward [5]. Particularly when new interactions are designed, rather than existing interactions redesigned, current literature offers few guidelines for approaching such design processes. When looking at approaches recommended for the design of future intelligent systems [21], as well as those suggested in tangible interaction research [20], user involvement in many stages of the process is often recommended. Therefore, we propose a people-centered, iterative design approach to the design of embodied metaphor-based interaction models. This approach describes a process in which the selection and implementation of metaphors are based on and evaluated through iterative user studies. Our approach consists of five phases:

  1. Enactment studies to identify applicable embodied metaphors

  2. Creating low-fidelity prototypes based on embodied metaphors, to explore the input design space

  3. Evaluating low-fidelity prototypes to validate the input design space in terms of affordances which support embodied schematic movements

  4. Creating high-fidelity interactive prototypes with suitable affordances, to explore the mapping between the input design space and metaphorically linked output responses

  5. Evaluating high-fidelity interactive prototypes, to validate the input design space, embodied interactional mappings and output responses

In the coming sections, we will discuss these five phases in detail. In order to illustrate our approach, we will elaborate on a design case, in which we implement this people-centered, iterative approach in the design and evaluation of the interactive learning system Moving Sounds (MoSo) Tangibles.

5 Design case

Extending our previous work [6], our goal for the design case we present here was to design an interactive system for learning about abstract sound concepts. We developed this system, called Moving Sounds (MoSo) Tangibles, in the context of a research study on how to design metaphorical interaction models as well as on how such systems can support learning. This latter research aim is beyond the scope of this paper. Given this research agenda, MoSo was designed to enable research, rather than to be directly applicable in a classroom context. However, the approach we propose is applicable to both design and research-through-design processes.

The interaction models incorporated in Moving Sounds (MoSo) Tangibles were based on embodied metaphors. In previous work [6], we found evidence that in some cases, more than one embodied metaphor was suitable to reason about a particular abstract sound concept. For example, changes in pitch can be understood in terms of low–high, but also in terms of slow–fast schemata. If abstract musical concepts can be understood in multiple ways, implementing more than one embodied metaphor-based mapping in an interactive learning system may benefit the learning process of certain abstract concepts. Compared to a system with a single mapping, a system with multiple mappings could make learning easier, as children can be supported in reasoning about the same concept in more than one way. This may result in a more comprehensive understanding of the concept that is potentially more easily transferable to other contexts. This also corresponds to a frequently used approach of using multiple representations when teaching complex scientific concepts [3, 38]. The research goals of the MoSo design case therefore were (1) to explore whether multiple embodied metaphors were applicable to single sound concepts and to identify these specific metaphors, (2) to explore how these metaphors could be implemented in the design of interactive systems, and (3) to explore how children interacted with such systems.

To enable this study, we designed MoSo Tangibles, a set of interactive artifacts in which multiple embodied metaphor-based mappings were implemented to support children in learning about a set of single sound concepts. Similar to the previously mentioned Sound Maker prototype [6], embodied metaphor-based movements were mapped to sound changes, enabling the children to structure their understanding of each sound concept in terms of movement-related concepts. Unlike the Sound Maker system [6], the MoSo system presented in this design case relied on movement with tangible artifacts rather than whole-body movement. This provided a clear distinction between different mappings, as each mapping between movement and sound change was integrated in a different tangible artifact. Furthermore, this enabled us to compare the two interaction styles.

In music education, one of the first learning goals is for children to become acquainted with sound concepts. In consultation with music teachers, we found that from age 4 or 5, children learn about concepts such as volume, tempo, pitch, and timbre. To avoid using language, which could make the concepts too abstract, they are generally first explained in terms of movement. For example, as children listen to a melody played slowly and then one played quickly, they may be encouraged to respond to changes in tempo with changes in the speed of their movement. This activity helps children gain a preliminary understanding of the concept tempo in terms of their experiences of movement. Another beneficial activity, which is not often employed in music education, is to have the music react to the children’s movement. As stated by the music teachers we consulted, having children control the music through movement requires that they have mastered a basic understanding of sound concepts (e.g., pitch, volume, tempo). Typically, this occurs in preschool or kindergarten. We therefore targeted our system to children aged 7–9 who have a basic conceptual understanding of the ways sounds can vary or change. The learning goal for these children is then to gain a more comprehensive understanding of these sound concepts, which includes being able to generate as well as reason about changes in sound parameters. This could be seen as a step toward knowledge transfer to other contexts, such as the understanding of how musical notation (i.e., an abstract symbol system) represents sound changes.

6 Enactment study to identify embodied metaphors (user study 1)

We will now elaborate on each phase of the proposed design approach. Each phase will directly be illustrated by the design case in which we present the MoSo Tangibles interactive learning system.

The aim of our design approach is to effectively design interactive learning systems that implement embodied metaphors in their interaction models, enabling children to leverage embodied knowledge in their understanding of abstract concepts. First, a specific set of abstract concepts that children are to be taught about by means of an interactive learning system should be laid out. Once the targeted abstract concepts have been selected, the next step is to identify the specific embodied metaphors that underlie how we structure and reason about these abstract concepts. To find empirically grounded evidence for relevant embodied metaphors, we propose to conduct an enactment study with children in the target age group. This means asking children to make up movements with which they enact changes in the abstract concepts one is designing for. The goal of this enactment study is to identify metaphorical mappings between actions and changes in these concepts. These metaphors can be used to inform the development of low-fidelity prototypes in the next phase of the design process. Furthermore, this enactment study can be used to validate the extent to which children already have an understanding of the used set of abstract concepts, which can inform the choice of abstract concepts incorporated in the interactive system.

6.1 MoSo Tangibles design case

In the MoSo design case, we conducted the enactment study (user study 1) with 65 children aged 7–9 (35 girls and 30 boys). These participants were asked to enact changes in sound concepts. This study is reported extensively in [8] and is summarized in this section.

The user study covered eight abstract sound concepts: volume, pitch, tempo, rhythm, timbre, harmony, articulation, and tone duration. During the user study, children were placed in groups of five to seven. Each group listened to a short sound sample in which one of these concepts changed from one extreme to another (e.g., slow to fast music or rhythmic to non-rhythmic music). The children were first asked to explain what they had heard, in order to verify their initial understanding of the concepts. After that, the sample was played several times, and the children were asked to make up movements to enact the sound change. Since our design focus is on tangible systems, some groups used an artifact (a flexible ring) to enact the sound change with, while other groups employed full-body movement. See Fig. 2 for an impression of the study.

Fig. 2 Impression of user study 1, aiming at identifying embodied metaphors: whole-body movement (left) and moving with an artifact (right)

As a result of these exercises, it became apparent that the children in the targeted age group (7–9) had a basic understanding of the concepts pitch, volume, and tempo. They were able to recognize the related sound changes, and some could even name them in terms of their parameter values (low–high, soft–loud and slow–fast). The understanding of the other concepts explored in this study was much lower. As advised by music teachers, concepts of which children do not have a basic understanding should first be taught by reacting to music with movement, rather than manipulating it through movement (as was intended with our interactive system). Therefore, we decided to include only the concepts pitch, volume, and tempo in our interactive learning system.

In order to identify the embodied schemata (that form the source domain for metaphorical interpretations of sound changes) used in the children’s enactments, we analyzed the video captured during the study via open coding. We searched for behaviors (e.g., sequences of actions) that enacted or reflected the schematic origins of embodied metaphors. Although no pre-defined coding scheme was used, our familiarity with literature on embodied metaphors (e.g., [22, 25, 28]) has likely supported our observations. We found evidence for two types of metaphors: those based on quality of body movement and those based on changes in location. Metaphors related to body movement are those in which the qualities of body movement are mapped to sound parameters. For example, a child might wave slowly to enact soft volume and wave fast for loud volume. Metaphors related to location are those in which the change in location of an artifact or body (part) is linked to a sound parameter. For example, a child might hold a ring low for low pitch and high for high pitch. Besides the type of metaphor, we also identified and recorded the embodied schemata associated with each metaphor through analysis of the children’s movements. See Table 1 for an overview of the embodied metaphors identified for pitch, volume, and tempo.

Table 1 Results from user study 1 for volume, tempo, and pitch: the identified metaphor types, the embodied schemata they are based on and examples of enactments (number of cases between brackets)

The embodied metaphors described in Table 1 extend the embodied schemata small–big (movements that occupy small or large space), slow–fast (slow or fast movements), quiet–wild (movements performed with low or high energy or low or high force), and low–high (low or high location). For tempo, we only found metaphors based on slow–fast, which we subdivided into “succession” (when a movement was repeated slowly or fast) and “speed” (when the actual speed of the movement was linked to the tempo of the music).
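The kind of overview given in Table 1 (metaphor type, embodied schema, and number of cases per sound concept) can be thought of as a tally over coded enactments. The sketch below shows one way such a tally could be computed once each observed enactment has been coded. The record structure and, in particular, the records themselves are placeholders invented for illustration; they are not data from user study 1.

```python
from collections import Counter
from typing import NamedTuple


class CodedEnactment(NamedTuple):
    concept: str        # sound concept being enacted
    metaphor_type: str  # "movement" (quality of movement) or "location"
    schema: str         # embodied schema assigned during coding


# Placeholder records for illustration only; NOT data from the study.
records = [
    CodedEnactment("volume", "movement", "small-big"),
    CodedEnactment("volume", "movement", "small-big"),
    CodedEnactment("volume", "movement", "quiet-wild"),
    CodedEnactment("pitch", "location", "low-high"),
]


def tally(enactments):
    """Count cases per (concept, metaphor type, schema) combination."""
    return Counter((e.concept, e.metaphor_type, e.schema) for e in enactments)


counts = tally(records)
for (concept, metaphor_type, schema), cases in sorted(counts.items()):
    print(f"{concept:7s} {metaphor_type:9s} {schema:11s} ({cases})")
```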

When we look at the metaphors used in previous studies [6], we see that the metaphor we identified for tempo (slow–fast) matched the one used in the Sound Maker prototype. For volume, the Sound Maker [6] implemented the schema active–inactive, which can correspond either to small and big movements or to quiet and wild movements. Despite this similarity, in this study we decided to distinguish small–big and quiet–wild, as the resulting movements were rather different. Furthermore, the two metaphors may support structuring one’s understanding of the concept volume in different ways: either in terms of small and big movements, which the music teachers we interviewed often use, or in terms of low or high force, which literally results in soft or loud sound (e.g., clapping with low force versus clapping with high force). For pitch, the Sound Maker implemented the embodied schema near–far (near corresponding to high pitch and far corresponding to low pitch). Interestingly, in this study, this metaphor was not seen in the children’s enactments of pitch.

The results of this user study (see [8] for the results of all sound concepts) confirm that children enact multiple different embodied metaphors in their understanding of single abstract sound concepts. When comparing the groups that employed whole-body movement in their enactments to the groups that were given artifacts to move with, we saw no major difference in the observed embodied schemata.

7 Designing low-fidelity prototypes

After the embodied metaphors used by the target group to structure their understanding of the abstract concepts have been identified, the next step is to implement them into interaction models. The goal of this phase is to explore the possibilities in the input design space. This requires iteration in the design process. We therefore propose to develop low-fidelity prototypes that can be moved according to the identified metaphors. These prototypes should enable the intended metaphorical movement, but do not need to include technology. Such low-fidelity prototypes may have generic form so that multiple movements are possible, but these prototypes can also be designed to afford specific metaphorical movements. Building several different low-fidelity prototypes enables efficient exploration of different ways of implementing the selected schemata and metaphors into interaction models.

7.1 MoSo Tangibles design case

In the first design iteration of the MoSo design case, we used the embodied schemata identified in user study 1 as a starting point, as well as the near–far schema that was used for pitch in the Sound Maker [6], to enable comparison. Based on these schemata, 14 low-fidelity prototypes were created that could each be used to link one or more of these schemata to metaphorically related sound changes.

For tempo, we found only one metaphor, extending the embodied schema slow–fast. This metaphor is clearly very strong, possibly because the dynamics of action and sound are isomorphic (fast movements are directly related to fast sound). When this schema is implemented in different mappings (for example, one as “succession” and one as “speed”), children can still structure their understanding of the concept tempo in more than one way. The fact that the metaphor is so prevalent enables an interesting comparison to the concepts pitch and volume, for which a greater variety of embodied metaphors was found. See Fig. 3 for pictures of the low-fidelity prototypes.

Fig. 3 Fourteen initial low-fidelity prototypes, inspired by the embodied schemata small–big, slow–fast, quiet–wild, low–high, and near–far
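The distinction between “succession” and “speed” for tempo can be illustrated by how each would be measured from movement input. The following sketch is hypothetical: it assumes that an interactive artifact reports either timestamps of a repeated movement (for succession) or positions sampled at a fixed interval (for speed). The function names, units, and input data are illustrative assumptions and do not describe the MoSo implementation.

```python
from typing import Sequence


def succession_rate(event_times: Sequence[float]) -> float:
    """Repetitions per second, from timestamps (in seconds) of a repeated movement."""
    if len(event_times) < 2:
        return 0.0
    duration = event_times[-1] - event_times[0]
    return (len(event_times) - 1) / duration if duration > 0 else 0.0


def mean_speed(positions: Sequence[float], dt: float) -> float:
    """Average absolute speed from positions sampled every dt seconds."""
    if len(positions) < 2:
        return 0.0
    travelled = sum(abs(b - a) for a, b in zip(positions, positions[1:]))
    return travelled / (dt * (len(positions) - 1))


# Hypothetical input: four taps half a second apart, and a steady sweep.
print(succession_rate([0.0, 0.5, 1.0, 1.5]))     # repetitions per second
print(mean_speed([0.0, 0.1, 0.2, 0.3], dt=0.1))  # distance units per second
```

Either value could then be scaled to a tempo, so that two artifacts extend the same slow–fast schema through different movements.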

Some of these low-fidelity prototypes were more generic than others in terms of the kinds of actions they afford. For example, a simple stick-shaped artifact can be moved in many different ways, whereas other low-fidelity prototypes afford a single movement (e.g., rotating). Although the eventual goal is to design artifacts that each have one clear interaction possibility, the more generic artifacts were important in the design process because they may inspire the design process and support exploration of different metaphors children use when interacting with such artifacts.

To evaluate how these artifacts may be used, three informal evaluation sessions were set up, each with one (adult) participant. These adults were given the low-fidelity prototypes and were asked how they would move these artifacts to manipulate the sound concepts. The usage as well as the advantages and disadvantages of each artifact were informally discussed. We found that some low-fidelity prototypes were used differently than intended, indicating that either the interactions intended by the designs did not match the participant’s idea of how to enact the sound change, or that the artifact did not afford the intended movement. This exercise also revealed some interactions that we had not thought of before. Based on these evaluation sessions, we developed an improved set of 12 low-fidelity prototypes (see Fig. 4).

Fig. 4 Twelve improved low-fidelity prototypes

8 Evaluating low-fidelity prototypes: how affordances support schemata (user study 2)

Having developed several low-fidelity prototypes that can map the identified embodied schemata to the targeted abstract concepts, the third phase involves the evaluation of these low-fidelity prototypes with the target group in a second user study. This enables determining whether the metaphors are implemented in the interaction models of the low-fidelity prototypes in a way that affords the intended movement. Results of this second user study will inform the design of the interactive learning environment.

8.1 MoSo Tangibles design case

In the MoSo Tangibles design case, we identified multiple embodied metaphors, which children used unconsciously to structure their understanding of musical sound concepts. Based on these metaphors, we developed 12 low-fidelity prototypes that could be moved to trigger changes in pitch, volume, or tempo. To evaluate the implementation of embodied metaphors in the low-fidelity prototypes, a second user study was performed with 50 children (7–9 years old). To avoid bias, none of these children had participated in user study 1. The participants were divided into 13 groups. Because of time constraints, each group worked with only one sound concept (pitch, volume, or tempo). During the study, the children first listened to a short sound sample in which the concept changed from one extreme to another (e.g., soft to loud volume). Next, each child was given a different low-fidelity prototype and was asked to move it in such a way that the sound change was enacted, while the sample was played again. After this enactment, the children exchanged their low-fidelity prototypes, and the exercise was repeated until all children had played with all low-fidelity prototypes. See Fig. 5 for an impression of user study 2.

Fig. 5 Impression of user study 2, aiming at evaluating the implementation of the embodied metaphors in the low-fidelity prototypes

To evaluate the metaphors children used when moving the low-fidelity prototypes, the experiment was captured on video. In an analysis of this video, we noted for each enactment (and thus for each artifact) which movement the child made and which embodied schema may underlie this movement. See Table 2 for an overview of the children’s movements and the embodied schemata that were identified. Note that the numbers of children mentioned in Table 2 represent the numbers of children that were captured on video. As we did not have permission to film all children and some children incidentally performed their tasks out of sight of the camera, the numbers were not equal for each musical concept or for each low-fidelity prototype.

Table 2 Results from user study 2 for pitch, volume, and tempo, for each low-fidelity prototype: the most common movement (in the gray rows) and the embodied schemata evident in the children’s movement with the low-fidelity prototypes (in the white rows). The number of children that performed the enactment, in relation to the total number of children that moved the artifact, is shown between brackets. For the concept tempo, the embodied schema slow–fast has been subdivided into slow–fast speed and slow–fast succession (shortened to succ.)

In the analysis of user study 2 (see Table 2), we saw consistent patterns of interactions and enactments of metaphors with some artifacts, but less consistency with other artifacts. For example, the two rotating artifacts (bottom and top left in Fig. 4) were rotated by 17 out of 18 participants when enacting changing volume, and the embodied schema slowfast was enacted by 16 participants. In contrast, the stick with beads attached to it (top right in Fig. 4), designed based on the embodied schema lowhigh, was moved in several different ways to enact changing volume, none of which implemented the lowhigh schema. Interestingly, most of the children working with pitch did move this artifact low and high, showing that the low-fidelity prototype does afford low and high movements. This may indicate that a metaphor extending the lowhigh schema may be less appropriate for volume when implemented in a tangible artifact, even though it was identified in enactments of changing volume in user study 1.

As mentioned before, the schema nearfar was not seen in enactments during user study 1, but was used in the previously performed Sound Maker study [6]. Some low-fidelity prototypes were therefore based on this schema, and many children made near and far movements when enacting pitch with these prototypes (see Table 2). In user study 2, the nearfar schema was seen even more often in enactments of changing pitch than the schema smallbig. This could be related to the affordances of some of the objects. On the other hand, although nearfar is location based and smallbig is movement based, the two metaphors are rather similar and could easily be confused. Holding your hands close to each other and gradually moving them away from each other is clearly based on a nearfar schema. However, when a child is jumping or moving their arms up and down simultaneously with the near and far movement, possibly as a reaction to the rhythm of the playing music, this same movement could also be interpreted as being based on a smallbig schema. This means that the smallbig schema that became evident for pitch in user study 1 [8] may in a number of cases actually have been a nearfar schema or a combination of both. This may explain why we saw many smallbig enactments for pitch in user study 1, but hardly any in user study 2. The fact that we saw quite a few nearfar enactments in user study 2, but none in user study 1, may also indicate that the interpretation of some of the movements was not consistent between user studies 1 and 2. This will be further discussed in the discussion section.

9 Designing high-fidelity prototypes: Moving Sounds Tangibles

Once low-fidelity prototypes have been evaluated, the results can be used to inform the final design of the embodied metaphor-based interactive system. This involves determining which metaphors to implement (based on the results of user study 1) and how to implement them in terms of affordances (based on the results of user study 2).

9.1 MoSo Tangibles design case

The aim of the design case described in this paper is to design a tangible learning system to enable research in the area of embodied metaphor-based learning systems. For the purpose of this research, we decided to select three mappings for each abstract sound concept, which were realized as interactive tangible artifacts forming the learning system “Moving Sounds (MoSo) Tangibles.” The design of MoSo Tangibles will be described in this section.

The metaphors we found for pitch in the first experiment were based on the embodied schemata lowhigh, smallbig, slowfast, and quietwild. However, as discussed in the previous section, the smallbig schema may in a number of cases be mistaken for the nearfar schema, which was also used in our previous study [6]. To enable comparison to our previous work, we have decided to use the mappings lowhigh, nearfar, and slowfast for pitch in the Moving Sounds Tangibles system. We developed three interactive tangible artifacts based on these embodied schemata (see Fig. 6). To enable studying the effects of (multiple) embodied metaphor-based mappings on learning, it is important that the different mappings are easily distinguished. For this reason, we have implemented each mapping in a separate tangible artifact. The “puller” artifact is based on the design of the accordion-shaped low-fidelity prototype, which was shown in user study 2 to afford near and far movements. The “stick” design is based on the low-fidelity prototype of a stick with beads attached, which was shown to afford low and high movement. The beads were not included in the final artifact to avoid rotating movement, which could be confusing. The “rotator” artifact is based on one of the rotating low-fidelity prototypes, as these objects were shown to afford rotating movement.

Fig. 6 The three tangible artifacts designed for manipulating pitch and a description of the intended interactions, implemented embodied schemata and mappings

For volume, we found the schemata smallbig, quietwild, slowfast, and lowhigh as a result of user study 1. However, from user study 2, it became apparent that even though some low-fidelity prototypes afforded low and high movements, the lowhigh schema was rarely used to enact volume. Therefore, we have decided to incorporate the schemata smallbig, quietwild, and slowfast for the three artifacts that can be used to manipulate volume. This resulted in the tangible artifacts depicted in Fig. 7. The “squeezer” is based on the smaller low-fidelity prototype with a spring in it. Although some other objects were moved according to the schema quietwild by more children, the spring low-fidelity prototype was consistently squeezed by 9 of 11 children. This shows that the affordances of this low-fidelity prototype were clear. The “waver” is based on the stick with a ribbon attached; this low-fidelity prototype was moved according to the smallbig metaphor by 3 of 8 children, and the waving movement was used by 5 of 8 children. The “rotator” is based on the rotating low-fidelity prototypes used in user study 2.

Fig. 7 The three tangible artifacts designed for manipulating volume and a description of the intended interactions, implemented embodied schemata and mappings

As mentioned before, we only found the metaphor slowfast for tempo. To enable children to learn about tempo in multiple different ways, we decided to design three different artifacts that can all be moved according to the schema slowfast, but through different kinds of movements. As a result of user study 1, we subdivided this schema into slowfast (succession) and slowfast (speed). As slowfast (succession) was more often seen in user study 1, we decided to design two artifacts based on this mapping and one based on slowfast (speed). See Fig. 8 for the tangible artifacts designed for tempo. The “accordion” artifact was based on the design of the accordion-shaped low-fidelity prototype, as this low-fidelity prototype was moved similarly by all but one child in user study 2. The “shaker” design was based on the ring-shaped low-fidelity prototype, which was chosen because the shaking movement is clearly different from the movements made with the “accordion” artifact. We decided to make some changes to the design to improve the affordance for shaking movement. The “rotator” artifact was based on the design of the rotating low-fidelity prototypes as these were the ones that clearly afforded the schema slowfast (speed).

Fig. 8 The three tangible artifacts designed for manipulating tempo and a description of the intended interactions, implemented embodied schemata and mappings

The tangible artifacts depicted in Figs. 6, 7 and 8 together form the interactive learning environment Moving Sounds Tangibles. As seen in these figures, the artifact “rotator” is used for all three sound concepts. Based on the results of user study 2, a rotation movement seems a very clear and sensible way to map the schema slowfast to sound changes. This metaphor turned out to be applicable to all three concepts. As time constraints did not allow designing three different rotating interactions, we decided to use the same artifact for all three purposes. Given the research aims of this design case, however, it would have been ideal to have three different interaction models for rotating movement, as this would enable equal comparison to other artifacts. If, on the other hand, the aim is to design interactive artifacts for classroom use, one may choose to use only one artifact for multiple purposes, if metaphorically feasible, to reduce the number of artifacts in a set.

The Moving Sounds tangible artifacts contain basic sensors that measure the movements that the artifacts were intended to evoke. For example, the rotator contains sensors to measure rotation speed, and the squeezer contains a pressure sensor to measure applied force. The sensor data are wirelessly transmitted to a computer and processed by a specifically designed program (written in Processing [1]). This program determines the appropriate change in pitch, volume, or tempo and generates sound accordingly. To enable clear perception of the changes in sound parameters, we used basic tones rather than complicated melodies. The technical implementation of MoSo Tangibles is described in detail in [7].
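As a rough illustration of the kind of processing described above, the following Python sketch maps a sensor reading linearly onto a sound parameter. It is not the MoSo implementation (which was written in Processing and is described in [7]): the value ranges, the function names, and the direction of each mapping (slow to low pitch, weak force to soft volume) are assumptions made for this example only.

```python
def map_range(value, in_min, in_max, out_min, out_max):
    """Linearly map value from [in_min, in_max] to [out_min, out_max].

    Readings outside the input range are clamped to its bounds.
    """
    if in_max == in_min:
        raise ValueError("input range must not be empty")
    clamped = max(in_min, min(in_max, value))
    fraction = (clamped - in_min) / (in_max - in_min)
    return out_min + fraction * (out_max - out_min)


# Assumed ranges, chosen only for illustration (not taken from the paper).
ROTATION_RANGE = (0.0, 4.0)      # rotations per second
PITCH_RANGE_HZ = (220.0, 880.0)  # frequency of a basic tone in Hz
PRESSURE_RANGE = (0.0, 1.0)      # normalized squeeze force
VOLUME_RANGE = (0.0, 1.0)        # normalized output gain


def rotation_to_pitch(rotations_per_second):
    """slowfast mapping: faster rotation gives a higher tone (assumed direction)."""
    return map_range(rotations_per_second, *ROTATION_RANGE, *PITCH_RANGE_HZ)


def pressure_to_volume(pressure):
    """quietwild mapping: stronger squeeze gives a louder tone (assumed direction)."""
    return map_range(pressure, *PRESSURE_RANGE, *VOLUME_RANGE)
```

Clamping keeps out-of-range sensor readings from producing tones outside the intended range, which matters when basic tones are used so that parameter changes stay clearly perceivable.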

10 Evaluating high-fidelity prototypes (user study 3)

When a (set of) working prototype(s) with embodied metaphor-based interaction models is available, a third and final user study can be set up in order to evaluate the design. The setup of this experiment may largely depend on the intention of the design. However, we propose assessing how easily users learn how to use the design for the intended purpose. This will reveal how successful the implementation of embodied metaphors in the interaction models was.

10.1 MoSo Tangibles design case

To evaluate how well children were able to interact with high-fidelity MoSo Tangibles, we performed a third user study, for which we recruited 39 participants (age 7–9, 25 girls and 14 boys) from two different elementary schools. The participants were divided over two conditions. Children in the one-artifact-condition played with one MoSo Tangible to learn about each of the concepts pitch (either puller, stick, or rotator), volume (either squeezer, waver, or rotator), and tempo (either accordion, rotator, or shaker). The children in the three-artifact-condition were given all three artifacts to explore each concept. Note that the choice of these conditions is related to the research agenda for which we developed MoSo Tangibles, which also included a comparison of learning effects between the two conditions. However, as this learning-analysis is beyond the scope of this paper, the two conditions do not play a major role in this section. Here, we aim to evaluate the extent to which children were able to effectively interact with MoSo, in order to validate the design of our high-fidelity prototypes. Furthermore, we wanted to compare these results with the results of our previous study in which a whole-body interaction environment was used.

10.2 Procedure

Since the MoSo Tangibles system was designed to be used by one child at a time, each child participated in an individual session of about 20 min. A pilot study with eight children (7–8 years old) was conducted to verify the experiment procedure. The results were, among other things, used to refine the introduction to the children. The procedure for the final study was defined as follows:

  1. Exploration (3 min): Each child had 3 min to explore one (one-artifact-condition) or three (three-artifact-condition) MoSo Tangibles to manipulate either pitch, volume, or tempo. No explanation was given regarding how to move the artifact(s), or which musical parameters to manipulate. The child was only told that moving the artifact(s) would cause the sound to change.

  2. Three reproduction tasks (3 min): After the exploration, the child performed three tasks in which he or she was asked to reproduce a sound sample with (one of) the explored tangible artifact(s). The children in the one-artifact-condition used the same tangible for each task, while the children in the three-artifact-condition were given a different tangible for each task.

  3. Interview (three-artifact-condition only, 1 min): The children in the three-artifact-condition were asked which of the three tangibles they preferred, as well as which one they thought fit the related sound change best.

These activities were repeated for all three sound concepts.

For each sound concept (pitch, volume, and tempo), three different interactive tangibles are available. To have objective results, these artifacts were equally divided over the one-artifact-condition sessions (e.g., one-third of these children used the waver for volume, one-third used the squeezer, and one-third used the rotator). As this turns the one-artifact-condition into three separate conditions, we assigned 27 children to the one-artifact-condition (9 for each tangible) and 12 children to the three-artifact-condition. The order in which the sound concepts were explored was counterbalanced over the different sessions. The same holds for the order in which the tangible artifacts were handed to the children in the three-artifact-condition.
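The allocation described above can be sketched as follows. This is only one possible way to realize an equal division and a counterbalanced order, under the assumption that assignments simply cycle through the options; the function names and the particular pairing of artifacts across concepts are hypothetical and not taken from the study.

```python
from itertools import permutations

CONCEPTS = ("pitch", "volume", "tempo")
ARTIFACTS = {
    "pitch": ("puller", "stick", "rotator"),
    "volume": ("squeezer", "waver", "rotator"),
    "tempo": ("accordion", "rotator", "shaker"),
}


def concept_orders(n_sessions):
    """Cycle through all six orders of the three concepts, one order per session."""
    orders = list(permutations(CONCEPTS))
    return [orders[i % len(orders)] for i in range(n_sessions)]


def one_artifact_assignment(n_children=27):
    """Give each child one artifact per concept, cycling so each artifact is used equally often."""
    return [
        {concept: ARTIFACTS[concept][i % 3] for concept in CONCEPTS}
        for i in range(n_children)
    ]


plan = one_artifact_assignment()        # 27 children, 9 per artifact and concept
orders = concept_orders(12)             # 12 children in the three-artifact-condition
```

With 12 sessions, each of the six concept orders occurs exactly twice; with 27 sessions the orders can only be balanced approximately, since 27 is not a multiple of six.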

The sessions were performed either in a separate classroom or in the school’s auditorium. In both cases, no children other than the participant were present. The user study was captured on video.

10.3 Results

From the video taken during the study, we analyzed whether the children succeeded in reproducing the sound samples using the MoSo Tangibles. In this analysis, we found that all children were able to reproduce the sounds within the set timeframe of 1 min, although some additional explanation was needed in approximately 25% of all cases. Such explanation did not involve telling the children what to do but consisted of giving hints such as “what did you hear in the sound sample?” or “how did you move the artifact before?” Therefore, all children based their interactions with MoSo on their own reasoning.

The relative number of times additional explanation was required did not differ greatly between the two conditions; it seemed more related to the individual child (possibly depending among other things on attention span). In other words, even though the children in the three-artifact-condition only had 1 min to explore each object while the children in the one-artifact-condition had 3 min, both groups were equally able to execute reproduction tasks within the set timeframe.

Apart from the video analysis, other qualitative results were gathered from the interviews conducted in the three-artifact-condition. In these interviews, the participants were asked to indicate, for each sound concept, which of the three artifacts they thought fit the sound change best (i.e., which mapping made most sense to them). These interviews showed that the personal preferences were approximately equally divided over the different artifacts (e.g., for volume, five children preferred the rotator, four children chose the squeezer, and three children thought the waver fit the sound change best).

The results from both the reproduction tasks and the interviews revealed that different designs were equally effective. This provides evidence to validate successful implementation of the selected embodied metaphors in tangible artifacts, developed through an iterative and people-centered design process.

11 Discussion

We have shown that our people-centered, iterative design and evaluation approach used in the design of an embodied metaphor-based learning system (MoSo Tangibles) was effective in both identifying and implementing embodied metaphors. In this section, we will discuss what we have learned from the implementation of this approach and what we can generalize in order to inform other researchers and designers. Furthermore, we will discuss a comparison with previous work.

11.1 Identifying and selecting embodied metaphors

The design approach described in this paper consists of five phases and relies mainly on user involvement during several iterations. In the enactment phase (user study 1), children enacted sound changes either through whole-body movement or with a generic and non-interactive artifact (a plastic ring). In both the low-fidelity evaluation (user study 2) and high-fidelity evaluation (user study 3) phase, newly designed artifacts were used. When comparing the results of these different studies, we see that design constraints (e.g., generic versus specifically designed artifact) are of great influence on the results of such studies. As mentioned before, the nearfar schema was not identified in user study 1, but nevertheless, we implemented it in artifacts evaluated in user study 2 as well as in our final design. The nearfar schema was also used in the (whole body) interaction model of the Sound Maker [6], where children worked in pairs. This indicates that “restricting” the children to use one artifact (the ring) or one body (their own body) can result in different metaphors than the ones found when specifically designed artifacts are used or when multiple artifacts or bodies are involved in the exploration. Enactment studies (user study 1) may benefit from offering participants several different materials to perform the enactments with (e.g., use multiple artifacts, encourage collaboration).

Furthermore, the difference in results between the enactment study (user study 1) and the low-fidelity evaluation phase (user study 2), particularly regarding the nearfar schema, may have resulted from a coding mistake in user study 1. The nearfar schema has likely been interpreted as a smallbig schema in a number of cases. Many children made jumping movements in their enactments as a reaction to the rhythm of the music, which may have caused near and far movements (e.g., with the hands) to appear to be small and big movements, while such movement could indicate a combination of the two schemata. This may explain why nearfar was not found in the enactment study, whereas it was found multiple times in the evaluation of low-fidelity prototypes. This coding mistake could have occurred because we used open coding, meaning that we did not set up a coding scheme in advance but clustered the movements during the analysis. Semi-open coding (i.e., pre-defining a number of likely embodied schemata while leaving opportunity for identifying new schemata) could likely have solved this problem. Enactment studies aimed at the elicitation of embodied metaphors could therefore benefit from a list of embodied schemata, such as the one proposed by Hurtienne et al. [22]. This way, one would rely on documented mappings, ensuring a bodily basis for the chosen mapping. Furthermore, direct user involvement, combined with an opportunity to identify new metaphors, ensures choosing the metaphor that underlies how the target group reasons about the abstract concept, which is particularly challenging when new interactions are designed rather than existing interactions being redesigned.

Although the database proposed by Hurtienne et al. [22] likely does not contain all possible schemata and is still under construction, it would be interesting to compare the embodied schemata implemented in MoSo to the documented schemata. When doing this, we see that all our schemata correspond to documented ones (although lowhigh is documented as updown), except for quietwild. However, the schema strongweak is documented, which seems rather similar to the way we have implemented quietwild, namely as applying weak force versus applying strong force. Furthermore, the smallbig schema is included in the database in the “attribute” category, referring to small and big as a property of an object or entity rather than as a quality of movement. Comparable movement-related schemata are not documented in [22]. A possible explanation is that this database [22] documents image schemata rather than embodied schemata. Although these two types of schemata are comparable and an overlap may exist, the major difference is that image schemata are often identified through linguistic analysis (also see [25, 28]) while embodied schemata arise from bodily experiences. Therefore, movement-related schemata may be documented as embodied schemata (e.g., [6]) and not as image schemata. This illustrates that leaving an opportunity for identifying new schemata in an enactment study such as user study 1 is essential. However, this exercise has also shown that semi-open coding using a coding scheme based on documented schemata (e.g., [22]) may help distinguish schemata that may otherwise be confused. Furthermore, such an approach encourages using commonly known names for the identified schemata, which is useful for generalizing the knowledge gained.

The approach we applied differs from other approaches to the design of metaphor-based interaction models described in literature [6, 14, 22], because we involved users at several stages of the process. In our previously performed Sound Maker study [6], we mainly relied on experts to select the appropriate metaphors. When comparing the two studies, we see an interesting difference regarding the nearfar mapping for pitch. In our presently described study, children mapped near to low pitch and far to high pitch, while it was implemented the other way around in the Sound Maker. Clearly, different approaches may lead to different implementations of embodied schemata. Multiple iterations can help find the implementation that makes most sense to the target group.

11.2 Using this approach in other contexts

In this paper, we present a people-centered iterative approach to the design of embodied metaphor-based learning systems. In the MoSo Tangibles design case, we involved over 150 children. Though large numbers of participants will increase the reliability of the results, it also complicates and lengthens the design process. Therefore, involving such large numbers of children will not be feasible in all contexts and for all purposes, particularly in design rather than research-through-design contexts. In this subsection, we discuss how this approach can be applicable in a broader context.

In the MoSo design case, a system was developed in which three different musical concepts could be manipulated, each through three different metaphorically related schematic input actions. Given our research-through-design focus, our goal was not to design a system that is directly usable in a classroom context for a particular learning goal. Therefore, the design process started with eight musical concepts. Not all participants in user study 1 were subjected to all concepts. Furthermore, due to the large number of low-fidelity prototypes resulting from our approach to implement multiple metaphorical mappings for each concept, each participant in user study 2 was only subjected to one of the three concepts. Therefore, we had about 13 (user study 2) to 25 (user study 1) participants for each musical concept (pitch, volume, and tempo). In designated design processes, however, it is likely that the number of targeted abstract concepts is clear from the start of the process, as is the number of metaphorical mappings needed. As these numbers are likely to be lower than was the case in the MoSo example, fewer participants would be required in many other studies to come to similar results. Eight to ten participants for each abstract concept or for each metaphor (one participant can work with multiple abstract concepts and metaphors in an evaluation session) are likely to result in sufficient knowledge to inform the next phase in each of the three user studies.

Though fewer participants may be required, setting up and performing three user studies may not be realistic in many design processes. The approach we propose consists of five phases: (1) identifying embodied metaphors, (2) creating low-fidelity prototypes based on these metaphors, (3) evaluating the implementation of metaphors in these prototypes, (4) creating high-fidelity prototypes, and (5) evaluating these prototypes. Though all these phases are important to successful design of metaphorical mappings, we suggest that particularly evaluating which metaphors the target group may use in their understanding of the abstract concept in question is an essential part of the design process. When the approach we propose is to be shortened, one could integrate the identification of embodied metaphors and the evaluation of low-fidelity prototypes into one user study, which could be comparable to the study in the low-fidelity evaluation phase (user study 2). This would require selecting embodied schemata that may likely be metaphorically appropriate from literature (e.g., [22]). These schemata can inform the development of low-fidelity prototypes, which are to be evaluated through a user study. This user study should then primarily aim at validating the selected schemata and secondarily on the evaluation of the designs.

The approach we propose in this paper is illustrated by the MoSo design case, which focused on learning abstract concepts in musical sound. However, the approach could also be applied when designing interactive systems for learning abstract concepts in other fields than music education or even for systems with other purposes than learning (e.g., [13]). An obvious limitation of our approach is that it is only appropriate when abstract concepts that are to be manipulated are potentially understood metaphorically (when concepts in a target domain are understood in terms of concepts in a source domain [28]).

11.3 Tangible interaction versus whole-body interaction

In the previously performed Sound Maker study [6], a whole-body interaction environment was developed. In the present design case, we relied on tangible interaction. When comparing the two prototypes, we see interesting similarities as well as differences. To give an example, in the enactment phase (user study 1) of our MoSo Tangibles design case, we did not find the mapping nearfar for pitch. Nevertheless, as a result of the evaluation of our low-fidelity prototypes (user study 2) as well as to enable a comparison to previous work, we did implement it in MoSo Tangibles. The resulting artifact (the puller) can potentially also be moved via the mapping lowhigh, simply by rotating the artifact 90 degrees. Although we must conclude from user study 1 that the lowhigh mapping is dominant over (and should thus make more sense than) nearfar, none of the children used the puller in the lowhigh manner. Apparently, the affordance [15] of the puller, clearly pointing toward near and far movement, has determined how children use the artifact, rather than their implicit embodied knowledge.

The children interacting with MoSo Tangibles were all able to successfully execute reproduction tasks, whereas some of the participants in the Sound Maker study did not achieve this as the implemented mapping was not discovered within the set timeframe. This difference is most likely due to the different interaction styles: although both styles constrain the interaction possibilities, tangible artifacts allow much clearer and more direct affordances compared to whole-body interaction environments. This particularly holds for the environment used in [6], in which the only constraint was given by lines on the floor indicating boundaries of the interaction space. One could argue that whole-body interaction environments, such as the one used in the Sound Maker study, therefore encourage relying on embodied knowledge more than tangible systems. On the other hand, if tangible artifacts are correctly designed, they will afford movements based on embodied knowledge, which ensures that children apply embodied knowledge in their understanding of the targeted abstract concepts. The affordances of tangible artifacts may jump-start this process, while more discovery is required in whole-body interaction. This shows the importance of successful design of tangible artifacts, which is in our view best achieved through an iterative process in which user involvement plays a major role.

12 Conclusions

In this paper, we present a people-centered, iterative approach to the design of interactive learning systems with embodied metaphor-based mappings. In a design case, we have applied this approach to the design of MoSo Tangibles, a tangible system for learning about abstract sound concepts (pitch, volume, and tempo). In this design case, we identified the appropriate embodied metaphors, implemented them in interactive artifacts, and evaluated children’s interactions with MoSo Tangibles. This case revealed that the proposed approach was successful in eliciting and helping us identify an appropriate set of embodied metaphors that children may use in their reasoning about abstract concepts related to sound parameters. Furthermore, verifying the implementation of these metaphors by conducting a second user study ensured effective design of interactive tangibles. The evaluation of MoSo Tangibles has shown that all participants were able to successfully interact with the artifacts after a few minutes of exploration.

Comparing our study to previous work has revealed that although full-body interaction encourages relying on embodied knowledge, tangible systems can provide clarity in interaction by means of affordances and therefore jump-start the process of applying specific embodied schemata in reasoning about abstract concepts. This also highlights the importance of successful interaction design, which is in our view best achieved through an iterative and people-centered approach. By proposing and applying such an approach, this paper aims to create a basis for future work on leveraging embodied knowledge in supporting the process of learning abstract concepts.