We have shown that our people-centered, iterative design and evaluation approach used in the design of an embodied metaphor-based learning system (MoSo Tangibles) was effective in both identifying and implementing embodied metaphors. In this section, we will discuss what we have learned from the implementation of this approach and what we can generalize in order to inform other researchers and designers. Furthermore, we will discuss a comparison with previous work.
Identifying and selecting embodied metaphors
The design approach described in this paper consists of five phases and relies mainly on user involvement during several iterations. In the enactment phase (user study 1), children enacted sound changes either through whole-body movement or with a generic and non-interactive artifact (a plastic ring). In both the low-fidelity evaluation (user study 2) and high-fidelity evaluation (user study 3) phases, newly designed artifacts were used. When comparing the results of these different studies, we see that design constraints (e.g., generic versus specifically designed artifact) strongly influence the results of such studies. As mentioned before, the near–far schema was not identified in user study 1, but nevertheless, we implemented it in artifacts evaluated in user study 2 as well as in our final design. The near–far schema was also used in the (whole-body) interaction model of the Sound Maker , where children worked in pairs. This indicates that “restricting” the children to one artifact (the ring) or one body (their own) can result in different metaphors than the ones found when specifically designed artifacts are used or when multiple artifacts or bodies are involved in the exploration. Enactment studies (such as user study 1) may therefore benefit from offering participants several different means to perform the enactments with (e.g., using multiple artifacts, encouraging collaboration).
Furthermore, the difference in results between the enactment study (user study 1) and the low-fidelity evaluation phase (user study 2), particularly regarding the near–far schema, may have resulted from a coding mistake in user study 1. The near–far schema has likely been interpreted as a small–big schema in a number of cases. Many children made jumping movements in their enactments as a reaction to the rhythm of the music, which may have caused near and far movements (e.g., with the hands) to appear to be small and big movements, while such movements could indicate a combination of the two schemata. This may explain why near–far was not found in the enactment study, whereas it was found multiple times in the evaluation of low-fidelity prototypes. This coding mistake could have occurred because we used open coding, meaning that we did not set up a coding scheme in advance but clustered the movements during the analysis. Semi-open coding (i.e., pre-defining a number of likely embodied schemata while leaving opportunity for identifying new schemata) would likely have prevented this problem. Enactment studies aimed at the elicitation of embodied metaphors could therefore benefit from a list of embodied schemata, such as the one proposed by Hurtienne et al. . This way, one would rely on documented mappings, ensuring a bodily basis for the chosen mapping. Furthermore, combining direct user involvement with an opportunity to identify new metaphors helps ensure that the chosen metaphor is the one that underlies how the target group reasons about the abstract concept, which is particularly challenging when new interactions are designed rather than existing interactions being redesigned.
Although the database proposed by Hurtienne et al.  likely does not contain all possible schemata and is still under construction, it is interesting to compare the embodied schemata implemented in MoSo to the documented schemata. When doing this, we see that all our schemata correspond to documented ones (although low–high is documented as up–down), except for quiet–wild. However, the schema strong–weak is documented, which seems rather similar to the way we have implemented quiet–wild, namely as applying weak force versus applying strong force. Furthermore, the small–big schema is included in the database in the “attribute” category, referring to small and big as a property of an object or entity rather than as a quality of movement. Comparable movement-related schemata are not documented in . A possible explanation is that this database  documents image schemata rather than embodied schemata. Although these two types of schemata are comparable and an overlap may exist, the major difference is that image schemata are often identified through linguistic analysis (also see [25, 28]) while embodied schemata arise from bodily experiences. Therefore, movement-related schemata may be documented as embodied schemata (e.g., ) and not as image schemata. This illustrates that leaving an opportunity for identifying new schemata in an enactment study such as user study 1 is essential. However, this exercise has also shown that semi-open coding using a coding scheme based on documented schemata (e.g., ) may help distinguish schemata that may otherwise be confused. Furthermore, such an approach encourages using commonly known names for the identified schemata, which is useful for generalizing the knowledge gained.
The approach we applied differs from other approaches to the design of metaphor-based interaction models described in the literature [6, 14, 22], because we involved users at several stages of the process. In our previously performed Sound Maker study , we mainly relied on experts to select the appropriate metaphors. When comparing the two studies, we see an interesting difference regarding the near–far mapping for pitch. In the study described here, children mapped near to low pitch and far to high pitch, while it was implemented the other way around in the Sound Maker. Clearly, different approaches may lead to different implementations of embodied schemata. Multiple iterations can help find the implementation that makes the most sense to the target group.
Using this approach in other contexts
In this paper, we present a people-centered iterative approach to the design of embodied metaphor-based learning systems. In the MoSo Tangibles design case, we involved over 150 children. Though large numbers of participants increase the reliability of the results, they also complicate and lengthen the design process. Therefore, involving such large numbers of children will not be feasible in all contexts and for all purposes, particularly in design rather than research-through-design contexts. In this subsection, we discuss how this approach can be applied in a broader context.
In the MoSo design case, a system was developed in which three different musical concepts could be manipulated, each through three different metaphorically related schematic input actions. Given our research-through-design focus, our goal was not to design a system that is directly usable in a classroom context for a particular learning goal. Therefore, the design process started with eight musical concepts. Not all participants in user study 1 were subjected to all concepts. Furthermore, due to the large number of low-fidelity prototypes resulting from our approach of implementing multiple metaphorical mappings for each concept, each participant in user study 2 was only subjected to one of the three concepts. Therefore, we had about 13 (user study 2) to 25 (user study 1) participants for each musical concept (pitch, volume, and tempo). In designated design processes, however, it is likely that the number of targeted abstract concepts is clear from the start of the process, as is the number of metaphorical mappings needed. As these numbers are likely to be lower than was the case in the MoSo example, fewer participants would be required in many other studies to reach similar results. Eight to ten participants for each abstract concept or for each metaphor (one participant can work with multiple abstract concepts and metaphors in an evaluation session) are likely to yield sufficient knowledge to inform the next phase in each of the three user studies.
Though fewer participants may be required, setting up and performing three user studies may not be realistic in many design processes. The approach we propose consists of five phases: (1) identifying embodied metaphors, (2) creating low-fidelity prototypes based on these metaphors, (3) evaluating the implementation of metaphors in these prototypes, (4) creating high-fidelity prototypes, and (5) evaluating these prototypes. Though all these phases are important to the successful design of metaphorical mappings, we suggest that evaluating which metaphors the target group may use in their understanding of the abstract concept in question is a particularly essential part of the design process. When the approach we propose is to be shortened, one could integrate the identification of embodied metaphors and the evaluation of low-fidelity prototypes into one user study, which could be comparable to the study in the low-fidelity evaluation phase (user study 2). This would require selecting embodied schemata from literature (e.g., ) that are likely to be metaphorically appropriate. These schemata can inform the development of low-fidelity prototypes, which are then evaluated through a user study. This user study should aim primarily at validating the selected schemata and secondarily at evaluating the designs.
The approach we propose in this paper is illustrated by the MoSo design case, which focused on learning abstract concepts in musical sound. However, the approach could also be applied when designing interactive systems for learning abstract concepts in fields other than music education or even for systems with purposes other than learning (e.g., ). An obvious limitation of our approach is that it is only appropriate when the abstract concepts that are to be manipulated can potentially be understood metaphorically (i.e., when concepts in a target domain are understood in terms of concepts in a source domain ).
Tangible interaction versus whole-body interaction
In the previously performed Sound Maker study , a whole-body interaction environment was developed. In the present design case, we relied on tangible interaction. When comparing the two prototypes, we see interesting similarities as well as differences. To give an example, in the enactment phase (user study 1) of our MoSo Tangibles design case, we did not find the mapping near–far for pitch. Nevertheless, as a result of the evaluation of our low-fidelity prototypes (user study 2) as well as to enable a comparison to previous work, we did implement it in MoSo Tangibles. The resulting artifact (the puller) can potentially also be moved via the mapping low–high, simply by rotating the artifact 90 degrees. Although we must conclude from user study 1 that the low–high mapping is dominant over (and should thus make more sense than) near–far, none of the children used the puller in the low–high manner. Apparently, the affordance  of the puller, clearly pointing toward near and far movement, has determined how children use the artifact, rather than their implicit embodied knowledge.
The children interacting with MoSo Tangibles were all able to successfully execute reproduction tasks, whereas some of the participants in the Sound Maker study did not achieve this, as the implemented mapping was not discovered within the set timeframe. This difference is obviously due to the different interaction styles: although both styles constrain the interaction possibilities, tangible artifacts allow much clearer and more direct affordances than whole-body interaction environments. This particularly holds for the environment used in  in which the only constraint was given by lines on the floor indicating boundaries of the interaction space. One could argue that whole-body interaction environments, such as the one used in the Sound Maker study, therefore encourage relying on embodied knowledge more than tangible systems do. On the other hand, if tangible artifacts are correctly designed, they will afford movements based on embodied knowledge, which ensures that children apply embodied knowledge in their understanding of the targeted abstract concepts. The affordances of tangible artifacts may jump-start this process, while more discovery is required in whole-body interaction. This shows the importance of successful design of tangible artifacts, which is in our view best achieved through an iterative process in which user involvement plays a major role.