1 A question about acting together

People walk, build, paint and otherwise act together with a purpose in myriad ways. Minimally, our acting together with a purpose requires that there be some sense in which we are acting together, and also an outcome to which our actions are directed. But this is not quite enough to capture what we want. To see why not, consider a contrast involving a long drain which is blocked:

CASE 1: Ayesha and Beatrice live at different ends of the drain. By chance, they are simultaneously attempting to unblock it with drain rods. Although neither’s efforts would have been sufficient, the common effect of their actions is sufficient vibration and disturbance to clear the blockage.

CASE 2: … is exactly like CASE 1 except that it is not by chance but by agreement that Ayesha and Beatrice are simultaneously attempting to unblock the drain; moreover, they expect that blockage to be removed as a common effect of their actions.

In both cases we can say that Ayesha and Beatrice unblocked the drain together; after all, neither of them alone was doing quite enough to move the blockage.[Footnote 1] Further, there is a single outcome, the unblocking, to which each of their actions is directed. Yet we aim to exclude the first case and include only the second. This can be done by imposing a further requirement: where we act together with a purpose, there must be an outcome to which our actions are directed and this cannot be, or cannot only be, a matter of each action being directed to it. To save words, we stipulate that in such cases our actions are collectively directed to the outcome.

These minimal requirements on acting together with a purpose are weaker than those commonly assumed in discussions of acting together for they involve neither psychological mechanisms such as shared intentions (contrast Bratman, 1992) nor normative structures (contrast Gilbert, 2013); nor do they invoke any novel kind of reasoning (Gold & Sugden, 2007) or subject (Helm, 2008).[Footnote 2] Despite this, there is at least one question which can be brought into view without relying on anything more than the minimal requirements.

To see this question, recall the contrast above. Given that actions are events, there is no special mystery about how the actions we perform can collectively have effects such as unblocking a drain: this is just a matter of events having common effects. By contrast, it is less obvious how our actions could be directed to unblocking the drain without this being merely a matter of each action being individually directed to that outcome. When we act together with a purpose, in virtue of what are the actions we perform collectively directed to outcomes?

This question is parallel to one about ordinary individual actions and the outcomes to which they are directed. Some have argued that the directedness of ordinary, individual actions to outcomes depends not only on intention or knowledge or other familiar philosophical paraphernalia but also on other kinds of representations which are involved in preparing and executing actions; in particular, on something called motor representations (see Sect. 2). Inspired by these arguments, and taking the correctness of their conclusion as a premise, we shall consider whether they can be generalised to acting together with a purpose. This will involve four steps. The first step is to examine the possibility that, when people act together with a purpose, the outcomes to which their actions are collectively directed are sometimes represented motorically. Recent discoveries suggest that this is indeed the case (see Sect. 3). The second step is to propose a conjecture: motor representations of outcomes can be components of certain interagential structures, and these structures can facilitate interpersonal coordination of actions around the represented outcomes (see Sect. 4 where we specify the interagential structure). The third step is to show that the conjecture is theoretically coherent and empirically motivated (see Sect. 5). The fourth step is to observe that the conjecture implies an answer to our overall question: sometimes when people act together with a purpose, it is a certain interagential structure of motor representations in virtue of which their actions are collectively directed to outcomes (see Sect. 6).

Our focus may seem unusual. There are rich discussions of commitment (Gilbert, 2013; Michael, 2022; Roth, 2004), of shared intention (Bratman, 2014; Ludwig, 2016; Searle, 1990), of knowledge (Blomberg, 2016; Rödl, 2018; Roessler, 2020; Satne, 2020), of norms (Bicchieri, 2016), of awareness (Schmid, 2013), of temporal coordination (Issartel et al., 2007; Oullier et al., 2008), of reasoning (Gold & Sugden, 2007; Pacherie, 2013; Sugden, 2000), and of agents (Helm, 2008). By contrast, despite a ground-breaking proposal (Pacherie & Dokic, 2006), motor representation has been widely ignored in philosophical discussions in this area. This may be because philosophers assume motor representations play at most an enabling role by coordinating actions. Our aim is to identify a more central role for them in the story of acting together.

2 Motor representation in acting alone with a purpose

The aim of this paper is to argue that when acting together with a purpose, the actions performed are sometimes collectively directed to an outcome in virtue of a certain structure of motor representations. As a starting point, let us outline, in barest detail, an established view about motor representation in acting alone with a purpose. We shall later argue that this view can be generalised to acting together.

Consider very small scale actions, such as playing a chord, dipping a brush into a can of paint, placing a book on a shelf or cracking an egg.[Footnote 3] Often enough, the early part of such an action carries information about how the action will unfold. For example, in grasping a book (or tall cylinder) you would probably hold its middle, which makes lifting it less effortful. But if you are about to place the book on a high shelf, you are more likely to grasp the book at one end, which makes lifting it more awkward now but will later make placing it easier (Cohen & Rosenbaum, 2004; Meyer et al., 2013). For another illustration, imagine you are a cook who needs to take an egg from its box, crack it and put it (except for the shell) into a bowl ready for beating into a carbonara sauce. How tightly you now need to grip the egg depends, among other things, on the forces to which you will later subject the egg in lifting it. It turns out that people reliably grip objects such as eggs just tightly enough across a range of conditions in which the optimal tightness of grip varies. How tightly you initially grip the egg indicates your anticipated future hand and arm movements (compare Kawato, 1999).

This anticipatory control of grasp, like several other features of action performance (see Rosenbaum, 2010, chapter 1 for more examples), is not plausibly a consequence of mindless physiology. It indicates that control of action involves representations concerning how actions will unfold in the future. These and other representations which characteristically play a role in coordinating very small scale actions are labelled ‘motor representations’.[Footnote 4]

What do motor representations represent? An initially tempting view would be that they represent sequences of bodily configurations and joint displacements only. However, there is a significant body of evidence for the opposing view that some motor representations represent outcomes to which purposive actions are directed, such as the placing of a book or the breaking of an egg. These are outcomes which might, on different occasions, involve very different bodily configurations and joint displacements (see Rizzolatti & Sinigaglia, 2010 for a selective review). The experiments providing such evidence typically involve a marker (such as a pattern of neuronal firings, a motor evoked potential or a behavioural performance profile) which allows sameness or difference of motor representation to be distinguished. Such markers can be exploited to show that the sameness and difference of motor representations is linked to the sameness and difference of outcomes such as the grasping of a particular object. This supports the view that some motor representations represent outcomes other than sequences of bodily configurations and joint displacements.[Footnote 5]

If some motor representations do indeed represent such outcomes, why consider them to be motoric at all? Part of the answer concerns their role in preparing and performing actions.[Footnote 6] Motor representations can trigger processes which are like planning in some respects. These processes are planning-like in that they involve starting with representations of relatively distal outcomes and gradually filling in details, resulting in motor representations whose contents can be hierarchically arranged by the means–ends relation (Grafton & Hamilton, 2007). Some processes triggered by motor representations are also planning-like in that they involve meeting constraints on the selection of means by which to bring about one outcome that arise from the need to select means by which, later, to bring about another outcome (Rosenbaum et al., 2012). So motor processes are planning-like both in that they involve computation of means–ends relations and in that they involve satisfying relational constraints on the selection of means.

Motor representations can be distinguished from intentions. Future-directed intentions set problems for practical reasoning and are subject to norms such as agglomeration. (In short, the norm of agglomeration says it is a mistake to knowingly have several intentions if it would be a mistake to knowingly have one large intention agglomerating the several intentions; see Bratman, 1987.) While there may be kinds of intention which are unlike future-directed intentions, intentions of any kind are inferentially and normatively integrated with future-directed intentions (compare Bratman, 1984, p. 379; Pacherie, 2000, p. 403). By contrast, motor representations are inferentially and normatively isolated from future-directed intentions. To illustrate, consider again the cook who is reaching for an egg to grasp and transport it. Her actions are subject to two kinds of constraint. One kind is associated with motor processes and representations, and is manifest in things such as the end-state comfort effect (which concerns avoiding extreme joint angles; Rosenbaum et al., 2012, p. 926) and Fitts’s law (which links speed and accuracy). Another kind of constraint arises from her intentions and beliefs, which in this case mean she needs to break the egg into the sauce and not the similar-sized, conveniently positioned fruit bowl. If motor representations and intentions were inferentially integrated, we would expect a single process of practical reasoning to enable agents to meet both kinds of constraint. But in fact practical reasoning rarely if ever enables agents to meet constraints associated with motor processes and representations (compare Searle, 1983, p. 151). This suggests that motor representations are interestingly distinct from intentions, at least as intentions are standardly conceived.[Footnote 7]

These reflections indicate that actions are sometimes directed to an outcome in virtue of motor representations. For we have just seen that, when you break an egg (for example), there may be a motor representation in you of this outcome, the breaking of the egg, and that this motor representation can trigger planning-like processes which coordinate your actions and do so in such a way that, normally, such coordination would facilitate the occurrence of the outcome represented. This ensures that your actions are directed to the breaking of the egg. Motor representations and processes are therefore sufficient for actions to be directed to outcomes.[Footnote 8]

To summarise, motor representations characteristically play a role in the coordination of very small scale actions. Some represent outcomes such as the placing of a book or the breaking of an egg. They trigger planning-like processes which ensure that sequences of very small scale actions are coordinated around the outcomes represented motorically. And despite resembling intentions in some ways, motor representations and the processes in which they feature are distinct from intentions and practical reasoning. All of this indicates that in some cases where someone acts alone with a purpose, it is motor representations that link her actions to outcomes. In what follows we aim to generalise this claim to acting together with a purpose.

3 Motor representation in acting together with a purpose

The first step is to examine the possibility that when people act together with a purpose, outcomes to which their actions are collectively directed are sometimes represented motorically. A variety of findings jointly indicate that this is indeed the case.

Using simultaneous EEG measurements from a pair of interacting agents performing complementary actions, Ménoret et al. (2014) provided evidence that motor representations in one agent may be affected by facts about another’s actions or the outcomes to which they are directed, even when these have no significant effect on the kinematics of the agent’s own actions. While this doesn’t quite show that an outcome to which both agents’ actions are directed is represented motorically (as Ménoret et al. themselves note on p. 95), the finding does indicate that some or other aspects of another’s actions can be represented motorically when interacting with them.

To take a step further, consider pianists who produce a chord in playing a duet. There is evidence that, sometimes, in a pianist playing a duet, monitoring and control of action involves representations not only of the pianist’s and her partner’s individual contributions but also of the chord that she and her partner are supposed to produce together (Loehr et al., 2013). This indicates that, in some cases, where some agents’ actions are collectively directed to an outcome, this outcome is represented motorically.

Further evidence for this view comes from a study exploiting the fact that imitation facilitates motor responses (Tsai et al., 2011). Subjects were required to imitate single key-presses performed by a counterpart. Each subject sat next to a confederate who also had a counterpart and who, on critical trials, either imitated her counterpart or, in another condition, performed an action complementary to her counterpart’s. The experimenters asked whether subjects represented motorically only the outcomes to which their own actions were individually directed, or whether they also represented motorically outcomes to which their actions and those of the confederate were, or could have been, collectively directed. Where the former occurs, facts about the confederate’s task should not directly affect the subjects’ actions: imitation should facilitate performance in every case. By contrast, where the latter occurs, subjects will only be imitating in conditions in which the confederate’s task is to do what her counterpart does. So performance should vary systematically depending on which outcomes the subject’s and confederate’s actions could be collectively directed to. And this is what the findings revealed (related evidence is provided by Ramenzoni et al., 2014).

While any of these studies can be interpreted in various ways, taken together they provide at least enough evidence to justify treating as a working hypothesis the view that there are motor representations of outcomes to which the actions people perform in acting together with a purpose could be collectively directed.[Footnote 9] But taking this hypothesis seriously invites a question. What are those motor representations doing there?

4 Coordination in acting together: a conjecture

We conjecture that motor representations of outcomes to which actions are collectively directed can enable interpersonal coordination. More specifically, we conjecture that there are certain interagential structures (specified below) which include these motor representations and that, where some actions are collectively directed to these outcomes, the structures can facilitate interpersonal coordination of the actions around the outcomes.[Footnote 10] The next step in our overall argument is to explain how this conjecture could be true.

Consider what is involved when, in acting alone, you move a mug from one place to another, passing it between your hands half-way. In this action there is a need to coordinate the exchange between your two hands. If your action is fluid, you may proactively prepare to release the mug from your left hand moments in advance of the mug’s being secured by your right hand (compare Diedrichsen et al., 2003). How is such tight coordination achieved? A full answer cannot be given by appeal to physiology alone (Jackson et al., 2002; Piedimonte et al., 2015). Instead, part of the answer involves the fact that there is a motor representation for the whole action which triggers planning-like motor processes, so that the motor representations and processes concerning the actions involving each hand are not entirely independent of each other (compare Kelso et al., 1979 and Rosenbaum, 2010, pp. 244–248). As we have seen (in Sect. 2), such planning-like processes result in motor representations concerning different parts of the action which can be hierarchically arranged by the means–ends relation and ensure that relational constraints on components of the action are satisfied. So when you move a mug from one place to another, passing it between your hands half-way, and when this action and its components are represented motorically in a plan-like hierarchy, it is this plan-like hierarchy which ensures the movements of one hand constrain and are constrained by the movements of the other hand.

Compare this individual action with a similar moving of the mug performed by two agents acting together. One agent takes the mug and passes it to the other, who then places it. This event is like the individual action in two respects. There is a similar coordination problem—the agents’ two hands have to meet; and the outcome to which their actions are collectively directed is the same, namely to move the mug from here to there. Our working hypothesis is that, sometimes, when acting together, this outcome, the movement of the mug from one place to another, is represented in each agent motorically. Such motor representations can trigger planning-like processes, so that in each agent there will be motor representations somewhat like those that would occur were she performing the whole action alone (compare Kourtis et al., 2013, 2014; Meyer et al., 2011, 2013).

But why should this support, rather than hinder, coordination? Suppose that the agents’ planning-like motor processes are sufficiently similar that, in this context at least, they will non-accidentally produce matching plan-like hierarchies of motor representations in each agent. For a plan-like hierarchy in an agent, let the self part be those motor representations concerning the agent’s own actions and let the other part be the other motor representations.[Footnote 11] Then, just as in the case where one agent moves the mug all by herself, the self part of each agent’s plan-like hierarchy (grasping and giving the mug with the left hand, say) will be directly constrained by the other part of her plan-like hierarchy (taking and placing the mug with the right hand, say). But matching implies that the other part of her plan-like hierarchy matches the self part of the other’s plan-like hierarchy. This ensures that the self part of her plan-like hierarchy will be indirectly constrained by the self part of the other agent’s plan-like hierarchy. Thus, much as motor representations of outcomes can enable intrapersonal coordination in acting alone, so also can matching structures of motor representations enable interpersonal coordination when acting together. Or so we conjecture.

The case just offered for this conjecture relies on an as yet unspecified notion of matching. In the simplest case, plan-like hierarchies of motor representations match if they are identical. But matching does not require identity. It is sufficient that the differences between two (or more) plan-like hierarchies of motor representations don’t matter in the following sense. First consider what would happen if, for a particular agent, the other part of her plan-like hierarchy were as nearly identical to the self part (or parts) of the other’s plan-like hierarchy (or others’ plan-like hierarchies) as psychologically possible. Would the agent’s self part be different? If not, let us say that any differences between her plan-like hierarchy and the other’s (or others’) are not relevant for her. Finally, if for some agents’ plan-like hierarchies of motor representations the differences between them are not relevant for any of the agents, then let us say that the differences don’t matter. Consider the condition that the differences between plan-like hierarchies of motor representations in two agents don’t matter. Meeting this condition is sufficient to support our proposed explanation of how motor representations could enable interpersonal coordination. So even without fully specifying how plan-like hierarchies in two (or more) agents must be related if motor representations are to explain interpersonal coordination, we can be sure that there are sufficient conditions for such matching.
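The counterfactual test just described can be illustrated with a toy model. The following is only a sketch under strong simplifying assumptions, not a claim about how motor processes are implemented: each part of a plan-like hierarchy is reduced to a small dictionary of labels, each agent's planning-like process is reduced to a hypothetical function (`giver_planner`, `taker_planner`) that derives her self part from her other part, and "as nearly identical as psychologically possible" is idealised as exact identity. All names and values are invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict

Part = Dict[str, str]             # toy stand-in for a set of motor representations
Planner = Callable[[Part], Part]  # toy stand-in for a planning-like motor process


@dataclass
class Hierarchy:
    self_part: Part   # representations concerning the agent's own actions
    other_part: Part  # representations concerning the other agent's actions


def giver_planner(other_part: Part) -> Part:
    # Assumption: the giver's release depends only on which hand the taker uses.
    return {"action": "give", "release_toward": other_part["hand"]}


def taker_planner(other_part: Part) -> Part:
    # Assumption: the taker's reach depends only on where the giver releases.
    return {"action": "take", "hand": other_part["release_toward"], "grip": "handle"}


def build(planner: Planner, other_part: Part) -> Hierarchy:
    # The self part is directly constrained by the other part.
    return Hierarchy(self_part=planner(other_part), other_part=other_part)


def not_relevant_for(agent: Hierarchy, other: Hierarchy, planner: Planner) -> bool:
    # Counterfactual: swap in the other's self part as the agent's other part
    # and ask whether the agent's self part would be any different.
    return planner(other.self_part) == agent.self_part


def differences_dont_matter(a: Hierarchy, b: Hierarchy,
                            planner_a: Planner, planner_b: Planner) -> bool:
    # The differences don't matter if they are not relevant for any agent.
    return not_relevant_for(a, b, planner_a) and not_relevant_for(b, a, planner_b)


# The giver misrepresents the taker's grip, but nothing in her own part hinges on it.
giver = build(giver_planner, {"action": "take", "hand": "right", "grip": "rim"})
taker = build(taker_planner, {"action": "give", "release_toward": "right"})

# Here the giver misrepresents which hand the taker will use, which does matter.
confused_giver = build(giver_planner, {"action": "take", "hand": "left", "grip": "rim"})
```

In this toy case `giver` and `taker` have non-identical hierarchies (the giver represents a rim grip, the taker plans a handle grip), yet `differences_dont_matter(giver, taker, giver_planner, taker_planner)` holds, whereas it fails for `confused_giver`. The sketch leaves out the requirement that matching be non-accidental, which concerns how the hierarchies are produced rather than how they compare.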

We can now describe an interagential structure of motor representations capable of facilitating interpersonal coordination. This involves four conditions. First, there must be an outcome to which the actions are, or could be, collectively directed, and in each agent there must be a motor representation of this outcome. Second, these motor representations must trigger planning-like processes which result in plan-like hierarchies of motor representations in each agent. Third, the plan-like hierarchy in each agent must involve motor representations concerning not only actions she will eventually perform but also actions another will eventually perform. Fourth, the plan-like hierarchies of motor representations in the agents must non-accidentally match. When all four conditions are met, the result is an interagential structure of motor representations.

As we have just seen, instances of this interagential structure could coordinate actions performed by people acting together with a purpose in something like the way that ordinary, individual hierarchies of motor representations can coordinate actions performed by a person acting alone with a purpose.

It is no part of our view that the need for interpersonal coordination can always be met by this interagential structure of motor representations. People acting together with a purpose may successfully coordinate their actions around outcomes which are not represented motorically. Suppose that Hannah puts slugs in Sara’s shoes and Lucas puts worms in her hat, where their actions are collectively directed to freaking Sara out. (As they each know, Sara is so robust that finding only slugs or only worms in her clothing would barely perturb her.) Now let us suppose, for the sake of argument, that freaking Sara out isn’t an outcome that is represented motorically on this occasion. So Hannah’s and Lucas’ actions could not be coordinated around this outcome by virtue of motor representations of it. However, their acting together may well involve them passing objects between them, reaching in a coordinated way, and one holding the hat while the other drops worms into it. Such very small scale actions are the sort most plausibly coordinated by the interagential structure of motor representations we have identified.

Is there any empirical motivation for our conjecture that certain interagential structures of motor representations can enable interpersonal coordination? Contrast acting together with a purpose and acting in parallel but merely alone. Even where these two require the same joint displacements and bodily configurations, the conjecture we have just introduced implies that they can differ motorically: acting together with a purpose, unlike acting in parallel, provides reason to expect that there will be motor representations concerning another’s action in each agent. della Gatta et al. (2017) set out to test just this prediction and found some evidence for it (as we discuss in Sinigaglia & Butterfill, 2020, p. 189). A further test of the prediction is provided by Clarke et al. (2019), who use an entirely different paradigm. And although they do not frame it in our terms, the results of Sacheli et al. (2018) can also be interpreted as supporting the conjecture. Further, less direct evidence for the conjecture is provided by some earlier studies, including Kourtis et al. (2013, 2014) and Meyer et al. (2011).

While more evidence would be needed for us to know that the conjecture is true, it is clearly precise enough to generate readily testable (and actually tested) predictions and there is already enough evidence to motivate considering it as a candidate for truth. But there are also theoretical objections to the conjecture.

5 Two objections

The conjecture we have just provided evidence for is that motor representations can enable interpersonal coordination. Or, more accurately: motor representations of outcomes can be components of certain interagential structures (specified in Sect. 4), and, where some actions are collectively directed to these outcomes, the structures can facilitate interpersonal coordination of the actions around the outcomes.

In this section we reply to two objections to our conjecture’s theoretical coherence. The first objection concerns what is represented motorically, whereas the second hinges on considerations about the direction of fit of motor representation.

There are limits on which outcomes can be represented motorically. Some such limits are linked to peculiarities of the agent in which the motor representation occurs, and, in particular, to the range of actions she can perform. For example, whether the grasping of a particular object can be represented motorically may depend in part on whether the agent can reach it (Costantini et al., 2010, 2011). Doesn’t this make it incoherent for us to conjecture that, without any relevant ignorance or irrationality, there are motor representations concerning actions another will perform? After all, no agent can perform another’s actions, at least not in the cases we are primarily concerned with.[Footnote 12]

In replying to this first objection it is helpful to invoke the notion of agent-neutrality. By saying that a representation is agent-neutral, we mean that its content does not specify any particular agent or agents. It is plausible that some motor processes involve agent-neutral representations of outcomes (Ramsey & Hamilton, 2010). Indeed, that some motor representations are agent-neutral is implied by Jeannerod’s argument for the view that motor representations concerning your own actions are ‘entangled’ with motor representations concerning others’ actions in such a way that knowing which actions are yours requires the two to be disentangled.[Footnote 13]

The agent-neutrality of some motor representations allows us to clarify what is involved in one agent having motor representations concerning another’s actions. The first agent does represent outcomes to which the other’s actions are, or will be, directed; and, normally, these representations occur in part because the other’s actions are, or will be, directed to these outcomes. But the contents of these representations, which are agent-neutral, do not specify the other (or the self) as the agent. It follows that an action’s having someone other than you as its agent is not necessarily a barrier to the occurrence in you of motor representations concerning that action.

The first objection concerned content. A second objection concerns direction of fit. It may seem that our conjecture entails that there is a single kind of representation, motor representation, different instances of which have different directions of fit. Apparently, some are world-to-mind insofar as they are supposed to lead to performing actions, while others are mind-to-world insofar as they are supposed to enable predictions of others’ actions.

This objection arises because of a point of contrast between intention and motor representation. On some accounts (that of Bratman, 2014, for instance), two agents’ having a shared intention involves each of them having knowledge of the other’s intentions. They do not, of course, intend actions the other will perform (at least not on many accounts; but see Roth, 2004). By contrast, where motor representations are involved in enabling interpersonal coordination, the conjecture under consideration is that in each agent there are motor representations concerning actions that she will eventually perform and also motor representations (not knowledge of motor representations but actual motor representations) concerning actions that another will eventually perform. So because our conjecture involves saying that there is just one kind of representation here, it appears to entail that different instances of a single attitude can have different directions of fit: some are world-to-mind, others are mind-to-world. But this is arguably incoherent.[Footnote 14] So runs the second objection.[Footnote 15]

As a preliminary to replying to the second objection, consider an analogy. There is a rotary dial on your oven which enables you to initiate and control the oven’s activity. We might think of the dial as having an oven-to-instrument direction of fit: the oven temperature is supposed to adjust to the setting on the dial. But now suppose, further, that there is an indicator light on your oven which is illuminated unless the oven has reached the temperature specified by the dial. This enables you to use the dial to discover the temperature of the oven: if the light is on, you turn the dial down until just the point where the light goes off. Now the setting on the dial tells you the temperature of the oven. So we might think of the dial as having an instrument-to-oven direction of fit.

This analogy will guide our response to the second objection. The key point can be put like this. There is a core system featuring the dial, thermostat, heating element and oven. Relative to this core system, the dial always has an oven-to-instrument direction of fit. However, there is a larger system which embeds the core system and exploits it for novel ends. This larger system includes you and your capacity to temporarily prevent significant changes in the temperature of the oven (perhaps by moving the dial between settings too quickly for the heating element to respond). Relative to this larger system the dial has an instrument-to-oven direction of fit. So to understand the dial’s functions, we do need two directions of fit, oven-to-instrument and instrument-to-oven. But this is not quite to say that the dial has both directions of fit. For something has a direction of fit only relative to a particular system. Which direction of fit we see depends on which system we are considering. Understanding the dial does not require supposing that anything has two directions of fit relative to a single system.
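For readers who find it helpful, the analogy can be made concrete with a small simulation. This is a minimal sketch under simplifying assumptions of our own: temperatures are plain numbers, the oven only heats (cooling is ignored), and the user's capacity to prevent significant temperature change is modelled simply by not running the heating step while the dial is turned. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Oven:
    temperature: float  # current oven temperature
    dial: float         # temperature specified by the dial

    def light_on(self) -> bool:
        # The indicator is lit unless the oven has reached the dial's setting.
        return self.temperature < self.dial

    def heat_step(self, rate: float = 5.0) -> None:
        # Core system: the heating element moves the oven toward the dial.
        if self.light_on():
            self.temperature = min(self.dial, self.temperature + rate)


def run_core_system(oven: Oven, max_steps: int = 1000) -> float:
    """Oven-to-instrument: the oven temperature adjusts to fit the dial."""
    for _ in range(max_steps):
        if not oven.light_on():
            break
        oven.heat_step()
    return oven.temperature


def read_temperature_with_dial(oven: Oven, step: float = 1.0) -> float:
    """Instrument-to-oven: the dial is adjusted to fit the oven temperature.

    The user turns the dial down until the light goes off. No heat_step is
    run in this loop, which models the oven temperature being held fixed.
    """
    while oven.light_on():
        oven.dial -= step
    return oven.dial


cooking = Oven(temperature=20.0, dial=180.0)
reached = run_core_system(cooking)            # the oven changes, the dial does not

probing = Oven(temperature=150.0, dial=220.0)
reading = read_temperature_with_dial(probing)  # the dial changes, the oven does not
```

The same `dial` attribute figures in both functions. What differs is the system in which it is embedded: in `run_core_system` the temperature is brought into line with the dial, while in `read_temperature_with_dial` the dial is brought into line with the temperature.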

Our response to the second objection is similar. If we consider planning-like motor processes only (the core system), then each motor representation’s function is linked to initiating and controlling action. From this perspective, only a world-to-mind direction of fit is in view. But these planning-like motor processes can occur in the context of a larger system, one which involves something that somehow prevents performance of action. The functions of this larger system concern predicting which outcomes actions will be directed to. If we consider this larger system, it is natural to describe the motor representations as having a mind-to-world direction of fit. So, as in our analogy, which direction of fit we see depends on which system we are considering. Our reply to the second objection, then, is that our conjecture involves no incoherence when properly understood.

6 How are actions linked to outcomes when acting together?

Our discussion so far has concerned the coordination of actions people perform when acting together with a purpose. How is this relevant to our opening question, which asks in virtue of what actions performed in acting together with a purpose can be collectively directed to outcomes?

The interagential structure of motor representations we identified can be used in linking actions to outcomes. For some (but not all) cases in which people act together with a purpose, we can explain in virtue of what their actions are collectively directed to an outcome by invoking this interagential structure of motor representations. To see how this works, consider first how accounts of shared intention suggest a way of linking actions to outcomes. On at least some accounts of shared intention,Footnote 16 shared intentions relate things we do together to actual or possible outcomes. Suppose we have a shared intention that we move the piano. Then, on these accounts, our having a shared intention consists in part in each of us intending that we, you and I, move the piano, or in each of us being in some other state which specifies this outcome. The shared intention also provides for the coordination of our actions so that, for example, you don’t attempt to take one route while I pursue another, where coordination of this type would normally facilitate occurrences of the type of outcome intended. When people act together with a purpose, this is one way of explicating what it is for their actions to be collectively directed to an outcome. But the interagential structure of motor representations we identified can be used in giving an alternative, structurally parallel explication. For, as we specified in Sect. 4, this interagential structure involves there being a single outcome such that there is a representation in each agent of this outcome, and these representations provide for coordination of the agents’ actions where coordination of this type would normally facilitate occurrences of outcomes of this type. It is therefore possible to link actions to outcomes by appeal to this interagential structure of motor representations.

The result of our investigation implies that this is not merely possible: sometimes it is actually in virtue of a certain interagential structure of motor representation that two or more agents’ actions are collectively directed to an outcome.

Some may attempt to resist this conclusion on the grounds that the interagential structure of motor representations identified in Sect. 4 is not something other than shared intention, but simply one kind of shared intention. To see that this is false, consider that shared intentions, whatever exactly they turn out to be, are inferentially and normatively integrated with ordinary, individual intentions. To illustrate, tonight there is a party and a ceremony. It is impossible for anyone to attend both, and this is common knowledge between Lily and Isabel. They have a shared intention that they attend the ceremony together. But while having this shared intention, Isabel also intends to go to the party. Given their common knowledge, this combination of shared and individual intentions is irrational. Its irrationality is related to that which might be involved in Isabel individually intending to attend the ceremony while also individually intending to go to the party. By contrast, as already mentioned (see Sect. 2), motor representations are not inferentially or normatively integrated with intentions: they do not feature in practical reasoning, nor in any inferences in which intentions also feature; and if there are any normative requirements linking the contents of intentions with the contents of motor representations at all, these are distinct from those governing intentions. So even where the actions of agents who are acting together are collectively directed to outcomes in virtue of an interagential structure of motor representations, the occurrence of this structure in the agents does not amount to their having a shared intention.

A basic requirement on any account of acting together with a purpose is that it explain in virtue of what actions performed in acting together are collectively directed to outcomes. In at least some cases, the interagential structure of motor representation we have identified is needed to explain this. Motor representation must therefore feature in any adequate account of acting together with a purpose.Footnote 17

7 Conclusion

We started by asking, When people act together with a purpose, in virtue of what are their actions collectively directed to an outcome? Our answer is that this is sometimes a matter of a certain interagential structure of motor representations being realised and playing a role in coordinating the actions (see Sect. 6 for the conjecture; the interagential structure is specified in Sect. 4). How did we arrive at this answer? We first identified evidence supporting the hypothesis that when people act together with a purpose, outcomes to which their actions are collectively directed are sometimes represented motorically (see Sect. 3). This allowed us to make a conjecture about how motor representations of such outcomes facilitate interpersonal coordination (see Sect. 4). Sometimes these motor representations trigger processes in each agent which result in matching plan-like hierarchies concerning not only actions to be performed by the agent herself but also actions that another will eventually perform. These matching hierarchies realise an interagential structure that could facilitate coordination of the actions performed by people acting together with a purpose. This conjecture is theoretically coherent and empirically motivated (see Sect. 5). It suggests a way of generalising a view about acting alone with a purpose to cases of acting together with a purpose. When an agent acts alone with a purpose, sometimes it is motor representations that ground the directedness of her actions to an outcome. Similarly, we argued (in Sect. 6), when agents act together with a purpose, their actions are sometimes collectively directed to an outcome in virtue of an interagential structure of motor representations.