Introduction

In mathematics education research, the use of transcripts is quite pervasive and has been enabled by the now ubiquitous availability of audio and video recording devices. These have been extraordinary in enabling researchers to listen to and watch what students and teachers do and say and to notice events and patterns that would never have been evident before. Transcripts are a technology of research that enables researchers to replicate an event that has occurred in the past. They have become increasingly sophisticated over time, with mechanisms for capturing not just spoken words but also pauses and actions, as well as tone of voice (Roth, 2011) and even rhythms (Staats, 2008). While they are valuable for many kinds of research endeavours, I wonder whether they have inadvertently made it difficult for researchers working with theories of embodiment to adequately validate experience. Consider a (made-up) transcript such as this:

figure a

Compared with the actual experience of an event or even the experience of watching a video, the transcript makes hard and upfront distinctions. For example, there is the separating out of two actors, who might seem plucked out of their setting—the researcher and the student are front-loaded in the event. Time is also separated out into sequential ‘turns’, which impose a linearity that may have little to do with the temporality of the event. Words are separated from actions, with the latter taking a back seat (in parentheses as they are) and removed from the flow of actual experience. The transcript functions to represent experience, presumably so that the reader does not need to watch the video themselves, let alone be present during the original event. While it can often add to the experience of watching the video, by highlighting certain uses of language, for example, it does little to help the reader feel what it was like to be there or to experience what it is like to manipulate objects on the screen. Many transcripts fall short on what Lather (1993) calls embodied validity (Footnote 1). Reading through a transcript can also be difficult, as the reader tries to piece together the words and the images and the descriptions of actions back into a coherent whole (Footnote 2). It is the highly verbal and artificially separable nature of transcripts that can give them power but that can also prevent researchers from expressing the spatio-temporal-material reality of experience itself—the touching, dragging, responding to feedback, sensing of proximity, feeling of objects and relations. And since it is this spatio-temporal materiality that is at stake in theories of embodiment, and so central to dynamic mathematical technologies, it may not be surprising that the roles of both the body and digital technologies in mathematical thinking and learning continue to be undervalued.

Part of my goal in this article is to experiment with some different methods for extending and validating experience. The issue of validity is one that many qualitative researchers in education have discussed, critiqued, overthrown and extended—in this paper, I connect particularly to Lather’s early work, which reframes validity as a problematic, allowing her to invent and explore alternatives to the practices that are currently taken to be legitimate in the field. Within mathematics education, this article builds on recent work of Güneş, Paton and Sinclair (in review) on researcher re-enactments of videos, which in turn draws on the re-enactment of transcript work found in Vogelstein et al. (2019), as well as the earlier and on-going work of Rogers Hall, Charles and Marjorie Goodwin, Elinor Ochs, Noel Enyedy, Beth Warren and Ann Rosebery, who all engaged in transcript re-enactment as part of their research methods—though they tended to report on the results of their analyses rather than on the process of re-enactment (Ricardo Nemirovsky, personal communication).

The particular context of the current research centres around a video clip of a young learner engaged with me in using the multi-touch application called TouchCounts, or TC (Jackiw & Sinclair, 2014). In this article, I consider the arm, hand and finger actions in the child’s mathematical experience to investigate how they function and change over the course of the video clip as the child interacts with the gesture-based addition design of TC. I begin with some theoretical considerations, which will enable me to articulate the perspective on embodiment I am taking here and to motivate the need for the methodological approaches with which I experiment in this article.

Theoretical Considerations

Broadly speaking, this article seeks to speak to those interested in theories that do not subscribe to the mind–body binary—it assumes that bodily actions, sensations and feelings are constitutive of any experience, including mathematical ones, and, further, refuses to see the body (its movements or senses, visible or not) as a mere support or scaffold for thinking (or learning or reasoning). Similarly, with respect to technology, it does not impose a strict separation between body and tool and, instead, sees the student–tool as an assemblage that functions in co-ordination, much like an individual and their eye-glasses function in co-ordination to see the world. This non-dualist and non-hierarchical view of experience has been articulated in de Freitas and Sinclair (2014) and shares much in common with the approaches of scholars such as Roth (2011), Nemirovsky et al. (2013) and Ferrara and Ferrari (2017).

Stating one’s theoretical orientation does not always make clear one’s axiological perspective. In this case, my interest in non-dualist, non-hierarchical perspectives that refuse to separate and sequentialise experience arises from a conviction that dualist, hierarchical theories severely constrain what it means to know mathematics—and, as a consequence, for whom that knowing is available and inviting. In designing digital technologies, such as TC, which is the focus of this article, I am interested both in pushing the envelope of what it means to know mathematics and also in questioning categories and sequences of knowing that are taken for granted in Western culture.

As the large body of literature on theories of embodied cognition makes evident, gestures are a significant form of bodily engagement in mathematics (see, for example, Edwards et al., 2017). Gestures have primarily been studied from a linguistic point of view, as forms of communication ‘in the air’ with the hand, which are often used to complement speech. While some see gestures as entirely communicative, the work of Streeck (and followers in mathematics education—see Sinclair & de Freitas, 2014) expands the category of gestures to include the touching and handling of things. Indeed, Streeck (2009) defines gestures as follows:

not as a code or symbolic system or (part of) language, but as a constantly evolving set of largely improvised, heterogeneous, partly conventional, partly idiosyncratic, and partly culture-specific, partly universal practices of using hands to produce situated understandings. (p. 5)

I find this broader interpretation of gestures more relevant to my context of research since I am interested in the movement of the hand (and arms and fingers) while manipulating objects as well, including the use of the hands with the digital technology, as well as the use of the hands to count, to show, to explain, etc. Additionally, Streeck’s willingness to take seriously the way in which gesture couples with and intervenes in the material world enables me to avoid the doing–thinking binary and hierarchy in which touching and manipulating things are separate from and subordinate to thinking and communicating.

Many different categories of communicative gestures have been posited by scholars, such as the iconic and metaphoric gestures of McNeill (2005), and used in mathematics education research. These can be relevant for studying the kinds of gestures that certain experiences can produce, particularly if one believes that certain types of gestures are more mathematical than others (as in Edwards, 2009). As this is not my intention in this paper, I will be less interested in classifying gestures as types (for their kinds of meanings) than in studying how they function, as well as how they change. In experience, few people intentionally plan to make one type of gesture over the other or even become aware of these analytic categories; instead, they produce situated understandings. Additionally, gestures are not experienced as separate categories of action but are involved in a complex network of sensory activities, including seeing, touching and hearing.

Indeed, as Sinclair and de Freitas (2014) have illustrated, it can be insightful to study the co-ordination of hand and eye as children engage in mathematical activity. Following process philosophers such as Alfred North Whitehead and Tim Ingold and in keeping with the work of Ricardo Nemirovsky (see Nemirovsky, 2011; Nemirovsky et al., 2013), I find it important to resist making artificial divisions unless the payoff is clear. For that reason, and as I will elaborate in the methodology section, I will resist the normative practice of using transcripts, which often artificially separate experience into words and actions and emotions. I will further try to resist separation by enabling the reader to experience the experience and not just to read about it.

Mathematics education researchers have studied both the use of intentional gestures (by teachers or researchers) and spontaneous gestures (that students might make in the course of solving a problem, for example). In this article, which involves the use of TC, in which interaction gestures have been specially designed to support mathematical meaning-making (Jackiw & Sinclair, 2017), gestures have a slightly different status. Like a tea cup, whose design compels the drinker to hold their hands in a certain way (making a gesture that can then be used in the future in the absence of the actual cup, perhaps to order tea in a restaurant), TC requires that users move their hand(s) in particular ways—not because a teacher has taught or demonstrated it but in order to act on an object. Of interest to me in this article is what becomes of that designed interaction.

Methodology

I begin this section with a small detour to the seventeenth century. Shapin’s (1984) article Pump and Circumstance: Robert Boyle’s Literary Technology traces the technologies of scientific research that were devised by Robert Boyle during his air-pump experiments in the 1660s and that have since become normative in empirical research. Boyle, who was doing novel experiments that required expensive machines, found himself needing to construct a form of communication that could make his experiments known to a large number of people who could not be present during the experiment (as had been the case prior to Boyle) and would not be able to replicate these experiments. Further, he did not want to rely on the credibility of authority figures (such as the members of the Royal Society) or on personal opinion. Shapin argues that Boyle created a literary technology of virtual witnessing, which was ‘the production in a reader’s mind of such an image of an experimental scene as obviates the necessity for either its direct witness or its replication’ (p. 491).

In order to gain trust and credibility, Boyle’s scientific text was fashioned so as to help the reader indirectly witness the experiment. This included the use of highly descriptive text full of circumstantial detail that enabled the reader to see that the experiment had actually taken place in a specific location, which meant that images, for example, were not just representations or schematics but evoked real objects (like the air-pump and the laboratory setting). This was meant to give the reader a ‘vivid impression of the experimental scene’ (p. 492)—it had to be specific to be trustworthy. As a consequence, writing the experimental report became just as important as the experiment itself, since it enabled the establishment of matters of fact by the public.

This is like what we do now in our journal publications. Our writing is circumstantially dense as we provide descriptions of the research setting and even more specific and detailed accounts of the events that occurred, including the things said as well as the actions made. Transcripts are part of this technology of knowledge production, enabling the kind of virtual witnessing that Shapin identifies. They are trustworthy because as mathematics education researchers we are part of a community of sense that accepts these transcripts as credible. However, I wonder whether transcripts really obviate the necessity for direct experience. Before Boyle, scientists would have gathered around together to witness an event collectively, but that event was physical in nature, such as the result of chemical experiments. Perhaps seeing was sufficient, in which case, images and descriptions might have sufficed as well. But in the case of accounting for experience in which multiple senses are at stake, like the kinetic and the tactile, seeing seems insufficient. As virtual witnesses, readers might be called upon to feel and touch and hear as well—in which case, virtual re-enactments might be more credible, more effective ways of making the reader feel not only at the experiment but also in the experiment.

After providing a brief description of the context of the research, in which I will remain in ‘virtual witnessing’ mode, I will engage in two methods that are both attempts to enable the reader to experience the event under consideration. The first will be to provide a narrative account of the event, which is to provide a sense of the whole event (see also Nemirovsky & Duprez, 2023). This narrative approach avoids the separation and sequencing of a transcript, thus aligning with researchers such as Nasheeda et al. (2019), who insist on the way stories can better present research participants’ lives. The narrative also situates the event, disclosing my own subjectivity (I narrate it from my point of view), which interrupts the role of the researcher as the ‘great interpreter’ and opens up the potential for the reader to create their own interpretations—this aligns with Lather’s (1993) notion of voluptuous validity, consistent also with a Baradian (2007) onto-epistemology of entanglement.

The second will be to invite the reader to re-enact aspects of the video, so that they can both do and say for themselves what was done and said in the video, explicitly mimicking the event rather than reading about it. Given my interest in the involvement of arms/hands/fingers in learning, I selected specific images of the video that involve the use of arms/hands/fingers and use them as re-enactment prompts. Here, the images function less to represent what happened in the video (as in the images in the opening transcript) and more as a tool or apparatus for producing an experience for the reader. These images then, following Parikka (2023), who draws on the work of Harun Farocki, are operational because they are not simply there to be seen but to be acted upon—in this case, I invite readers to act upon them by using them as prompts to move and talk. Unlike the narrative, which aims to convey a wholeness, the re-enactments focus on a particular aspect of the video, which is the moving of arms, hands and fingers. While prior researchers have engaged in re-enactments, these have involved the researchers re-enacting transcripts and videos (Vogelstein et al., 2019); I am inviting readers to re-enact images taken from videos.

Context of the Research

The video clip I will be discussing was taken in a Headstart Indigenous school located along Rainy River in western Ontario, Canada. Along with other researchers, I was participating in a spatial reasoning project that the school (and community) had requested. At that time, TouchCounts (TC) was relatively new, and the idea arose that it could be translated into the community’s language of Ojibwe. The teachers at the school wanted to learn more about how it worked and so invited me to work with a group of four-year-old children. The whole session was video-taped by one of the other members of the research team. I never watched the whole video; however, the colleague who had video-taped it sent me this small clip a few weeks after we had left the research site, saying ‘Wow, you can see the learning happening!’.

TC is a multi-touch application for the iPad that was designed to provide learners with an experience in gestural arithmetic. Making and transforming numbers are done by touching the screen in specific ways that were designed to provide conceptual support. It contains two worlds, an ordinal one and a cardinal one. Since the video only involved the latter, I will limit the description of TC to this world. In the cardinal world, a single finger tap produces a circular disk named ‘one’ and numbered (1); the word ‘one’ is also said aloud. Two finger taps together create two distinct ‘unit chips’ beneath the fingers that are circumscribed into a single number value object, which we call a herd, numbered 2 and named ‘two’. Ten fingers at once produce the number ten (10), named aloud ‘ten’, which appears as a visually ‘larger’ number than one (1) or two (2). Numbers can be created by putting herds together. Since any existing herd can be dragged, two or more numbers can be dragged simultaneously. When two (or more) herds are dragged into each other, they are temporarily circumscribed by a new disk that encompasses both (or all) of the herds. Continuing to drag the constituent values out of overlap erases this temporary new disk, but finalising the drag in this ‘combined’ value causes the constituent herds to dissolve into the new larger herd, with their constituent number chips reassembling themselves into a single new sum. Figure 1 illustrates the evolution of two numbers being ‘pinched together’ to form a third. (I invite readers to do this themselves in TC; failing that, place your two index fingers on the 4 and 10 in the image below and drag them towards each other, as shown.)

Fig. 1 Pinching fingers together to add herds in TouchCounts

TC thus has two explicitly designed gestures relevant to this article: the first is a kind of subitizing gesture that creates a herd of given size, depending on the number of fingers touching the screen ‘all at once’. The second, the pinch gesture, is the action required to add two (or more) herds together. It can be accomplished by many different movements of the hand, including by the thumb and index finger of the same hand, by the index fingers of both hands and by two people pushing their herds together. Each expresses addition as a gathering of herds, thereby instantiating one of Lakoff and Núñez’s (2000) conceptual metaphors for addition.

Experiencing Bringing Two and Two Together

In this section, I will provide two renderings of the video. The first will be a narrative account, which is a re-telling of what happened, seeking to avoid as many artificial separations and sequences as possible. In the second, I provide a re-enactment prompt that invites the reader to re-enact some of the significant actions of the video involving arms, hands and fingers, as an additional way to experience the video. I have chosen to present these two renderings separately (and not side-by-side) because I think they are easier to read that way, leaving each modality (the narrative and the images/actions) their own, unique experience. I also let this narrative stand on its own and only engage in analysis after presenting each re-enactment. I am assuming that the validity of those analyses will depend on the presence of the narrative, so that the discrete elements of the re-enactments can be evoked alongside the more continuous, holistic narrative.

Narrative Account of the Video

It is our first time together and Ruby’s first time using TC. We are seated on the carpeted floor, with other people around. She is cross-legged and I am kneeling, folded over my legs so I can be close to the ground. The iPad is on the carpet between us. There is lots of noise around but we are both ignoring it, as are the teacher and other students seated close by. Ruby has made two herds of two, each made by placing two fingers on the screen simultaneously. I put an index finger on each herd and ask her, ‘What will happen if we put the two and the two together?’. Facing me, she has her two arms outstretched, with her right index finger ready to touch the screen. She has rocked up onto one leg and is looking at the screen closely. She tries to touch the screen, but I ask her to wait, so she rocks back and pulls her hands away.

I ask her again to tell me what she thinks will happen, looking up at her, while she continues to look at the screen. She says ‘Hmmmm’, bringing her left hand to her mouth, falling back to the ground on her bottom, and says ‘It will turn into twenty’. I laugh and repeat ‘twenty’ and invite her to try. She places her two index fingers, one on each herd on the screen—we are both looking at the screen—and drags the herds together. The iPad creates a herd of 4 and also says ‘four’ out loud, which I repeat. There is now one herd of 4 in the middle of the screen. She lifts onto her knee, now hovering over the iPad, our heads almost touching. She then sits back and says ‘yes!’, as if she had succeeded at something, while smiling and looking at me.

Holding out two fingers on my right hand, and then two on the left, I say, ‘two and two together’. She looks at my fingers, then lifts both of her hands, extending three fingers on her right hand and three on her left, but looking at her right hand. She then turns her hands palm out, now looking at both hands, and then raises the pinky on her right hand, drops her left hand, raises her right arm higher and looks down at the screen. I say, ‘two and two together yeah, it makes that’, having seen that she has four fingers lifted, and a few seconds later I say ‘four’.

She presses the reset button and I ask her if she can make another four. She eagerly obliges. First, she makes two herds of 1 by tapping the screen with her right-hand index finger. She then drags the two herds together, using the pinching gesture, to make 2. I comment on her actions, ‘you made a one and a one and you put it together and you got two’. She then makes a herd of two by putting her two index fingers on the screen simultaneously. She then uses the pinching gesture to drag the two herds of 2 together to make a herd of 4. She smiles, leans way back onto her feet and falls onto her bottom. I say ‘Oh, that’s how you did it? You put two and two together?’ She returns to being cross-legged and says ‘yeah!’ and then rocks back onto her knees, still smiling, as am I.

I ask if I can try, to which she responds ‘sure’. I use two fingers—index and middle finger—to make two herds of 2, and comment on my actions as I go, and then ask, ‘And then what do I do?’. She looks at the screen, rolls back onto her bottom, spreads her arms apart and says, ‘you just put them together’, while moving her two hands towards each other, as if making a whole-armed pinching gesture, still looking at the screen. She then looks at me, saying ‘and then it makes’, moves her hands away from each other so they are quite spread out again, pauses, drops her right arm and then says, emphatically, ‘Four!’, with her left arm raised.

Re-enactment

This sub-section contains four parts. The first (Fig. 2) goes from the point where Ruby first acts on the two herds and ends with her reaction to the outcomes of that first manual gesture. The second begins right after I have put my two hands, each with two outstretched fingers, out and ends with Ruby’s four-finger gesture (Fig. 3). The third (Fig. 4) begins after Ruby has made two herds of two, after I have asked her to make another four. The fourth (Fig. 5) begins after I ask Ruby what she thinks will happen when the two and the two come together. I have separated the re-enactments into four parts because they highlight different functionings of gestures. For each re-enactment, the reader is invited to carry out the instructions provided, which include making actions directly on the image provided. The images thus provide a space for actual re-enactment (touching the image as if touching the screen) of a posture/position—in other words, rather than just read and see the image, act it out as well. Optimally, the reader can re-enact the image using TC, thereby combining the actions with the reactions of TC, that is, what TC produces as objects on the screen, sounds and skin contact. The goal is for the reader to engage in a sensorially rich experience—actually touching the screen, moving arms, raising fingers, bending over, in addition to seeing, hearing and feeling the output—as Ruby did (Footnote 3). Readers are encouraged to go back to the narrative to recall how the actions fit within the whole event.

Fig. 2 Re-enactment 1: Ruby making the first four

Fig. 3 Re-enactment 2: Ruby making four on her fingers

Fig. 4 Re-enactment 3: Ruby’s second on-screen pinching gestures

Fig. 5 Re-enactment 4: Ruby’s remembering of the pinching gesture

After having predicted that putting together the two herds of two would make twenty, Ruby begins to interact with TC. Begin re-enactment 1 (Fig. 2).

Ruby places her two index fingers on the screen, each one touching a herd, and drags her fingers towards each other. The herds follow along until they overlap, one almost on top of the other, at which point Ruby lifts her fingers off the screen. She watches as the two herds of two transform into one herd of four (and hears TC say ‘four’). She then says ‘yes!’, as if that was what she had expected. The re-enactment will evoke the simultaneous focus on two different herds, each controlled by a different finger. The action is therefore something that involves two different objects, which must both be moved. The re-enactment will also convey the tangibility of the pinching gesture, with the fingers not only touching the screen but also moving along it, so that the act of combining occurs over time. The feedback of the movement of the fingers on the screen shows the two objects getting closer and closer together, which eventually become one object when the fingers are lifted. This manual gesture may look communicative, if one focusses only on the images, but the re-enactment should confirm not only its manipulative function (to make something happen on the screen) but also its multiple sensory feelings: the touch, the sense of two things becoming one and the temporality of combining.

After Ruby makes her herd of four, I extend my own two fingers on each hand, first keeping my hands apart but then bringing them together so that the four fingers are all side-by-side. This seems to cue Ruby to use her fingers as well, but not in a way that copies my own hand and finger movements. Begin re-enactment 2 (Fig. 3).

Ruby initially extends three fingers of each hand, palms facing her. She then turns her hands around, with palms facing away from her, and lifts the pinky finger of her right hand, still holding her left hand in the air. Then, she drops the left hand and lifts the right hand slightly higher, showing four fingers extended. Although the image at the right of Fig. 3 looks like a traditional ‘in the air’ gesture, the re-enactment should emphasise the movements of the hands and fingers over time, eventually leading to the final configuration—it is therefore not a standalone position, but part of a sequence of movements, like a choreography.

My offering of extended fingers, which I used to re-express the two herds of two on the screen, seems to have led Ruby to use her own fingers. Since I was not asking any question, it appears that Ruby had a question of her own. Perhaps she was trying to connect this new four she had encountered with prior experiences of making four with the four fingers of one hand. Rather than go from the three extended fingers on each hand to two extended fingers, she instead shifts to one hand. Since she starts with three fingers extended, it is possible that she is thinking/acting ordinally, wanting to get to four by first making three—perhaps enacting a count-to-four on her hand. The four-gesture she makes, then, is no longer the four as a sum of two that I had made. Her use of fingers is therefore not only communicative (to me and to herself) but also epistemic, since she is operating on and with her fingers—not only to count up but also to move across ordinal and cardinal expressions of four. In this re-enactment, we can therefore see epistemic and communicative functions of gestures. The communicative functions are outward facing (her hand is lifted in the air for me to see), while the epistemic gestures are inward facing—she is the audience for her finger movements.

I ask Ruby to make four again, which she does by first making two herds of two. Begin re-enactment 3 (Fig. 4).

Ruby does almost the same thing as in the first re-enactment (Fig. 2). She first puts each of her index fingers on the herds and then drags her fingers towards each other so that the two herds overlap. But this time, instead of inspecting what she has made, she sits back on her bottom and smiles. Further, she is intentionally making the four on TC, with the pinching gesture—or intentionally repeating what she has done before—rather than just dragging the two herds of 2 together. The first re-enactment gesture is therefore a gesture-to-combine, while this one is a gesture-to-make-four. The function of the pinching gesture is not just to manipulate but to create a specific outcome.

I then ask Ruby what will happen when I put my two herds of 2 together. Begin re-enactment 4 (Fig. 5).

Ruby looks at the screen, then raises her arms apart from each other. She then brings her two arms towards each other. This air-pinching gesture clearly mimics the touch-pinching gesture she has previously made twice already, though in a larger version, with arms instead of index fingers. This air-pinching gesture is made as she is trying to respond to my question, so the gesture can be seen as a remembering (Footnote 4) of what has been done before—in other words, to answer my question, she actually has to repeat her prior gestural choreography. Indeed, she does not immediately answer ‘four’ but makes the gesture, accompanied by the words ‘you just bring it together’ and then, after her hands have come together, she stretches them out again as she says, ‘and then it makes’, then pauses again, before saying, ‘four’.

Ruby’s whole-arm air-pinching gesture seems to function as a way for her to remember prior actions (and their consequences, as in the output of ‘four’). The use of a whole-arm air-pinching gesture, rather than a smaller finger-based pinching gesture, is interesting. Based on Gerofsky’s (2011) findings, in which students who made larger gestures when working with graphs of functions tended to have a better understanding of these graphs, Ruby’s larger air-pinching gesture might also suggest that she is confident about the operation on the two numbers and so expresses it boldly. In any case, what I find significant is the change in her arm/hand/finger movement, which went from having a manipulative function (in re-enactment 3) to having a remembering one (in re-enactment 4), where the remembering one, which enabled her to anticipate what would happen, also functions as communicative.

Discussion

Rather than focus on the types of gestures used and how mathematical they are (with metaphoric gestures often considered more mathematical than iconic or indexical ones, as in Arzarello et al., 2015), I am interested here in the way in which the TC-designed pinching gesture, which Ruby has to use in order to bring two herds together and which is thus a learned gesture (not spontaneous), is used not only to communicate (to show me what will happen) but also to remember an action. Indeed, Ruby’s in-the-air whole-arm action is a remembering of putting two and two together. This remembering did not rely solely on her visual memory (seeing the herd of 4 on the screen) or her auditory memory (hearing the word ‘four’ pronounced by TC, as well as by me) but also on haptic memory. I would suggest that, if she had not been allowed to move her hands, she may not have been able to say the result. In a sense, the movement and the memory triggered each other, a point I will return to shortly.

The triggering of a past event is a central point of concern in Sfard’s recent commognitive work (see Lavie et al., 2019), which focusses on the notion of precedents, which are features of a previous situation deemed relevant by the learner in a current situation. The notion of precedent is embedded within the commognitive theory, which considers thinking to be communicating, but distinct from acting. In contrast, my insistence on the manipulative and epistemic functions of gestures refuses to dualise and hierarchalise acting and thinking, therefore foregrounding the significance of precedent actions. Additionally, in contrast to the intentional selection of precedents, I see this selection as being often subconscious, triggered by situational cues that the learner may not even be aware of. Indeed, it is in the work of Barsalou (2020) and his theory of grounded cognition that I find interesting resonance, particularly with respect to his notion of simulation. Barsalou uses simulation to describe the construction of a conceptualisation on a specific occasion. This simulation draws on a simulator, which is ‘the entire body of accumulated knowledge for a category’ (p. 9). In Ruby’s case, the categories at stake are small—she is conceptualising both four and the adding of two and two. Ruby’s simulator for four would consist of all the experiences that she has had related to four. She would have heard the word at school, but also at home—and in both these situations ‘four’ would have been related to age, which would have come with some excitement. She might have played with other iPad number games before her experience with TC, which may well have involved the use of fingers. 
All of these experiences, with their physical, sensory, social, emotional, contextual specificities, make up the simulator, which is then a superposition of many, many layers of experience.Footnote 5 According to Barsalou, with all of her experiences of four, each generating an associated pattern across her neural systems, there will be some overlaps that will become more available the next time Ruby encounters four. However, how Ruby conceptualises four at any particular moment—how she answers questions about four—will depend on how the particular situation triggers aspects of past experiences.

One might ask, however, what counts as ‘an experience’. In the ongoing flow of life, where does an experience start and stop, and therefore become layered in the way Barsalou describes? In his work on broadening the scope of aesthetic experiences beyond the high-art museum context to more everyday ones, Dewey (1934) characterised an experience by its ‘internal integration and fulfillment’ (p. 42), reached through a developing organisation of meanings and energies. Further, an experience features emotional intensity together with ‘cumulation, tension, conservation, anticipation, and fulfillment’ (p. 152). The video can be read in terms of distinct experiences, each related to the four re-enactments. The first occurred when Ruby makes her first four, at the end of which her exclamation of ‘yes!’ signals some kind of fulfillment, which grew from the mystery of not knowing what would happen. The second would have been the production of the four outstretched fingers. Even though there is no verbal or visible cue of a sense of fulfillment or satisfaction, Ruby seemed to know when she had accomplished her own goal of making four on her fingers—she did not even need to check the result with me—and that making of four grew from a cumulative tension as she struggled to work her fingers into the correct configuration. The third involved Ruby’s making of four on her own, which she does quite fluently and is evidently proud of as she rocks back onto her bottom and smiles, this time with the feedback of TC confirming her making of four. The fourth ended with Ruby’s excited pronouncement of ‘four’, at the end of the clip, with her arm raised, as if in victory, marking an emotional intensity. 
This comes at the end of a sequence of events involving some cumulation (her pinching two and two together, then doing it again, then walking me through the process), some tension and anticipation (as she hesitates before saying ‘four’, literally anticipating what will happen) and fulfillment (the pride evident in her assertion).

Identifying these four experiences enables me to consider what happened in the video clip as an instance of transfer, using Nemirovsky’s (2011) expanded notion of that concept. Indeed, Nemirovsky takes transfer to describe when an experience becomes part of another experience and insists on retaining the full cognitive, emotional and embodied sense of experience rather than narrowing it down to cognitive symbols or schemes. He takes experiences to be characterised by ‘episodic feelings’, which are not just general feelings (of happiness or sorrow) but feelings associated with particular sights, sounds, places and people and which, following Dewey, I take to be indicative of an experience. Like an episodic memory (‘that time I figured out what two and two made’), an episodic feeling is situation-specific. Nemirovsky asserts that episodic feelings are anchors for episodic memories since we often only remember the feeling of an event (a book, a movie, an encounter, a trip) and forget its details (the names of the characters, the plot, the location, the train ride). Crucially, Nemirovsky draws attention to the fact that episodic feelings are re-experienced bodily—for example, in the case of Ruby, when she brings her arms together, she is doing so both in the present and as a memory of the past, thereby performing a spatio-temporal integration. This effectively changes the past in the sense that the prior experience is now entangled with the present one. In other words, the experience of Ruby in the fourth re-enactment includes and transforms the experiences of on-screen finger-pinching, gathering up a feeling of combining that Ruby can use to anticipate the sum of two and two.

The spatio-temporal enfolding resonates with Barsalou’s quantum interpretation of cognitionFootnote 6 which draws attention to the entanglement of concept and context, insisting on the radical situatedness of knowing and highlighting how the very act of simulation modifies the simulator. Barsalou draws on the quantum to explain why replicability is so difficult to achieve in grounded cognition research, since situational cues can prompt unexpected simulations.Footnote 7 I find Barsalou’s turn towards the quantum interesting also because of its re-figuring of time. In quantum mechanics, time is not so linear; the present can change the past (see Wendt, 2015 for examples of this in a broad range of fields). In classical views of transfer, there is a strict, linear order of events in which the past remains intact and then affects the present (or future). In the quantum view, Ruby does not merely re-use an existing gesture in a new situation, because the air-pinching gesture re-figures the finger-pinching gestures as being not only manipulative and epistemic but also a means to remember and communicate.

Conclusion

My goal was to investigate the function of these arm/hand/finger movements—what was Ruby accomplishing as she moved in the environment? I found that she used her arms/hands/fingers to manipulate the iPad, in order to make and operate on herds. She also used them to count on and with, thus using her fingers and hands both as tools and as means of communication. Finally, she used her arms to remember the finger-pinching on the screen, thereby translating a tangible gesture into an ‘in the air’ one.

Theories such as instrumental genesis or semiotic mediation, which are often used in technology-specific research studies, would read the ‘in the air’ gesture as the visible tip of a ‘generalization’ process iceberg or a sign suggesting that TC is becoming a ‘psychological tool’ (in the Vygotskian sense) for Ruby, who would have developed an instrumented action scheme with TC. This interpretation focuses on the student-artefact relationships, cutting others (including me, in this case) out, as well as the affective-material context, which is read as incidental. In a discursive analysis, as in Sinclair and Heyd-Metzuyanim (2014), more attention might be paid to the fact that Ruby seems to be engaging in the situation because I am there, wanting to interact with her and smiling, and also showing her something she probably finds interesting (the iPad). In this article, and through both the narrative and re-enactments, I am trying to avoid decomposing what has happened into categories (cognitive/affective or acting/thinking or abstract/concrete) in order to hold on to the complex entangled phenomenon. Nemirovsky’s notion of an episodic feeling, which can traverse from one experience to another, was used to show the socio-material-affective transfer of one experience (on-screen finger-pinching gesture to manipulate) to another (in the air arm-pinching gesture to re-member and communicate). In this sense, learning occurred, as suggested by my colleague who remarked on seeing the learning happen in the video clip.

Methodologically, I have endeavoured to enrich the virtual witnessing called upon by the reader, so as to provide a sensory, embodied account of a technologically modulated experience. The reader will have to be the judge of whether this affected their own interpretation of the video clip, as well as the trustworthiness of my interpretations. The re-enactments hopefully made more significant and relevant the manu-facturing of meanings in the video, which included touching and dragging and pointing and finger-counting, as well as ‘in the air’ gesturing. The use of narrative and re-enactments as alternatives—or additions—to transcripts will require much improvement and community buy-in if it is to be seen as having validity. Nemirovsky et al.’s (2001) attempt to introduce the use of videopapers, which included transcripts, videos and images side-by-side, was never taken up by the community. Perhaps there were technical concerns for journal editors and publishers or ethical ones related to the sharing of videos. But videopapers also challenged epistemological and aesthetic assumptions related to the importance of the body and the senses in mathematical thinking and learning. Re-enactments also challenge the forms of validity that govern much of qualitative research in mathematics education. Replicability would perhaps be impossible. The question of reliability would shift from being centred on the researcher (can I believe what she writes? Did that really happen?) to the reader, as the reader is the one being asked to participate in the re-making of experience. Other challenges emerge as well. How should one address the fact that the experience of Ruby cannot be the same as that of the reader (whose age, gender, ability, class and race will be different)? What kinds of prompts would lead to more effective re-enactments—are the static images I offered adequate? Under what conditions are re-enactments useful or valid? And must re-enactments involving digital technologies require the actual use of those digital technologies by readers? In any case, it will take much more experimentation with re-enactments to determine whether this approach to enhancing virtual witnessing is worth the new challenges re-enactments introduce.