1 Introduction

This paper explores the role of Joint Attention (JA) in sport. JA is a psychological phenomenon whereby two or more agents focus their attention on the same object while in a state of mutual awareness that the content of their experience is thus shared – and where, importantly, this mutual awareness itself plays a role in determining the content of their shared experience. JA plays a crucial role in joint action and is widespread in collaborative interactions in sport: from a baton handover in athletics to a synchronised stroke in team rowing or a ‘set-and-spike’ manoeuvre in volleyball. At times JA may even hold the key to victory. Consider (in football) Liverpool’s late winner to overturn a 3-goal deficit against Barcelona in their 2019 Champions League semi-final. For the decisive goal, as Trent Alexander-Arnold places the ball for a corner kick and starts walking away to delegate the task to a teammate, Barcelona’s defenders lapse in vigilance. Still walking away, Alexander-Arnold glances towards the box and, catching the eye of his teammate Divock Origi, changes tack. He swivels without pause to drill a low ball across the face of goal for the now-unmarked striker to sweep into the net. Given the abrupt improvisational brilliance of the decisive move and its context, capping the most unlikely of comebacks in such an important match, it was an astonishing moment. What is less often noticed is how it was sparked by JA in the pivotal instant when the protagonists exchanged a glance and mutually recognised a new opportunity for a devastatingly effective joint action.

Sport is founded on constraining rules which afford a limited range of physical challenges: a normative space ritually separated from everyday life where instrumentally ‘useless’ physical skills (like a topspin backhand in tennis) are problematised, tested, perfected and celebrated. Team sports often demand a high degree of synchronization and mutual awareness to perform intricate tasks which test the limits of our capacities for joint action: think of the synchronised movements of a figure skating pair. The sporting environment is also associated with a specialised vocabulary which scholars can draw upon to illustrate JA and understand how the phenomenon is experienced by experts. Sport represents a valuable and relatively underexplored environment for illuminating JA and assessing rival theories of its workings through application to a range of real-world complexities.Footnote 1

While a wealth of existing research from various quarters bears on this discussion, no previous study links the literature on JA to the dedicated philosophy of sport literature.Footnote 2 JA has yet to be addressed in journals and collections dedicated to the philosophy of sport. According to a theoretical approach to games and sport known as ‘formalism’, “[w]hat it means to engage in a game, to count as a legitimate instance of a game, to qualify as a bona fide action of a game, and to win a game is to act in accordance with the appropriate rules of the game” (Morgan 1987, 1). Critics argue that sport depends, in a way not strictly determined by explicit rule-formulations, on an informal ‘ethos’ whereby players and officials embody the relevant normative standards through a shared practical grasp of what the activity demands.Footnote 3 Endorsing such criticisms, I argue that an account of sporting action must not only address “higher-order cognitive states such as commitments, goals, and intentions” having formal rules as their content, but also explain how sport is jointly realised, in dynamic interactions between skilled agents, via “lower-level phenomena, including joint attention and various alignment mechanisms” (Tollefsen and Dale 2012). I am sympathetic here to accounts in the philosophy of skill where sporting examples are frequently used to illustrate a spectrum of tightly integrated or ‘meshed’ cognitive levels, from (i) “strategic control” (Christensen et al. 2016, 49), as in explicit knowledge of rules and gameplans; to (ii) “situation control” (ibid.), responsiveness to properties and objects in our immediate surroundings salient to our goals; and (iii) “implementation control” (ibid.), required to execute the intended bodily movements down to fine-grained corrections which keep performance on track in light of unexpected information.Footnote 4 I also draw on philosophical analysis by Pacherie (2011) and, especially, Wilby (2020) which specifically addresses the role of JA in ‘interfacing’ between shared plans and lower-level cognitive processes in joint action. Relevant existing research also includes a rich body of psychologically focused work on joint action in sport, including Strachan et al. (2020), Williamson and Sutton (2014), Montero and Toner (2021), Araújo et al. (2023), Bourbousson et al. (2015), Muntanyola-Saura and Sánchez-García (2018), Bicknell (2021) and Protevi (2023). While this literature provides valuable insight into the role of lower-level cognitive processes in sport, it does not focus on JA in particular. Yet sport provides a promising context for studying JA and testing various theoretical explanations thereof. In addressing this lacuna, I adopt David Papineau’s methodological recommendation to shine “the spotlight of illumination” (Papineau 2017, 4) in two directions for mutual advantage: with a view to understanding sport as an independently fascinating phenomenon and using it as a lens to examine wider philosophical issues.

This paper pursues a twofold aim, combining one broad and one more narrow thread of discussion. The former provides an exploratory overview of the role of JA in (especially team) sport, a relatively uncharted topic until now. The latter defends a particular philosophical approach as providing the most promising theoretical resources for understanding JA in this context. Section 2 introduces the phenomenon of JA and begins to illustrate its role in sporting action. Subsequent sections add further detail to this picture while developing arguments aimed at testing philosophical theories of JA as applied to sport. Section 3 argues, via a sporting ‘Coordinated Attack’ counterexample, that ‘knowledge-based’ theories of JA fall foul of a vicious infinite regress while ‘relational’ accounts better explain the role of JA in joint sporting action. Section 4 discusses the nature of collective affordances and the objects of JA in joint actions like ‘pressing’ in football. This example is used to drive a wedge between ‘lean’ and ‘rich’ versions of relationalism. Section 5 explores the key issue in this dispute, whether the objects of JA can be individuated ‘extensionally’, in terms of causal properties not sensitive to the ways in which they are perceived. Siding with ‘rich’ relationalism, I reject extensionalism because collective sporting achievements depend on agreement in the perceived ‘aspectual shape’ of objects of JA. Section 6 suggests a new context of application for this rich relationalist account to illuminate the nature of rule-constituted kinds like baseball’s ‘strike zone’. Section 7 summarises the findings of this paper.

2 Joint Attention and Sport

JA enables human beings to grasp how the same objects feature in the perceptual experiences of different agents.Footnote 5 JA is thus essential for referential acts which enable others to recognise which “currently perceived objects” we are communicating about (e.g. “that mountain” or “that book”) (Campbell 2002, 157) and an essential precursor of “collaborative engagements” (Tomasello et al. 2005, 681) in which two or more agents act jointly on the same objects, where this depends on “the uniquely human cognitive adaptation for understanding others as intentional beings like the self” (Tomasello 1999, 40). JA is also necessary for understanding objects and actions in terms of socially constructed symbolic and normative statuses, so is among the capacities which make games and sports possible (Tomasello and Rakoczy 2003). The capacity for JA develops in humans from around nine months of age when infants become capable of following a caregiver’s gaze in (prelinguistic) communicative interactions involving, e.g., pointing, showing and pantomiming. Significantly, JA arises before the development of more sophisticated capacities to form discursive beliefs about the contents of other people’s minds. JA depends on the prior development of the capacity for ‘primary intersubjectivity’ as found in the synchronisation of expressions between parent and child (Trevarthen 1979). This developmental trajectory suggests that capacities for relating to other people are at least as deep-rooted in human nature as those for making sense of the ‘external world’ and raises difficulties for a ‘Theory of Mind’ perspective which holds that our understanding of other minds must be grounded in a theoretical stance and mediated by “propositional representations” (Campbell 2011, 416) which enable us to explain and predict the behaviour of others.
Cognitive scientists recognise a range of mechanisms whereby “individuals have their interpersonal understanding enhanced through a ‘meeting’ of minds rather than an endless ascription of high-order mental states” (Gallotti and Frith 2013, 164). Coordination mechanisms like ‘entrainment’ and ‘mirroring’ occur below the level of propositional thought: think of the way members of a rowing team synchronise their movements without consciously framing beliefs or intentions about what the others plan to do.Footnote 6 JA may be a similarly ‘low-level’ cognitive ability which would be “better conceptualized as a motor and skill-like phenomenon, than as a perception- and belief-like phenomenon” (León 2021, 567).

JA is crucial to coordinated action in sport. It enables us to tell which currently perceived “objects, locations, processes and actions” (Wilby 2020, 139) the beliefs and intentions of others are directed at – an essential precondition of coordinated action directed at the relevant particulars. To illustrate how JA is experienced by top sportspersons, consider the reflections of Dutch footballer Dennis Bergkamp, a forward of outstanding technical skill whose genius lay in his quasi-“clairvoyant” ability to “understand … spatial opportunities amid a complex flow of movement” (Winner 2011, 24).Footnote 7 Perhaps this talent for conceiving actions not anticipated by others explains not only Bergkamp’s effectiveness in bamboozling defenders but also the aesthetic impact of unforgettable moments like his winner in Holland’s 1998 World Cup quarter-final against Argentina where, in three immaculate touches of his right foot, he pulled Frank de Boer’s long pass out of the sky, turned the defender inside-out and flicked the ball into the top corner. What a great player does in such moments outstrips the stock of available possibilities that, just a moment earlier, had seemed (to everyone but him) to exist. Bergkamp recognises, however, that his individual successes would have been impossible were he not operating ‘on the same wavelength’ with others to co-create scoring opportunities. Bergkamp recalls experiencing a seamless accord with teammates: “[t]hat’s the thing which in my opinion is the beauty of the game”, he says, “[y]ou create a certain relationship with players. On the pitch they know what I want to do with the ball, and I know exactly what they are going to do” (Winner 2011, 26). Bergkamp repeatedly highlights this sense of mutual awareness of opportunities. Describing the build-up to that famous goal, Bergkamp recalls: “[y]ou’ve had the eye contact … Frank knows exactly what he’s going to do […] You’re watching him. He’s looking at you.
You know his body language. He’s going to give the ball” (Winner 2011, 29). Of course, this meeting of minds also can, and often does, elude us. Contrast the absence of chemistry between midfielder Kevin De Bruyne and striker Michy Batshuayi during their 2022 World Cup campaign with the underperforming Belgian national team: “…[Batshuayi had] absolutely no relationship with De Bruyne […] When De Bruyne played the ball in behind, Batshuayi was coming short. When Batshuayi came short, De Bruyne wanted to play a killer ball” (Cox 2022). What is the nature of the mysterious ‘meeting of minds’ that enables players to dovetail in perfect synchrony – but whose absence can leave them looking hopelessly disjointed?

Teammates who coordinate successfully experience what theorists sometimes call a “joint attention triangle” (Carpenter and Call 2013, 5; see Fig. 2 ad loc. for a visual illustration): a situation whereby two or more agents perceive the same ‘object’ – broadly construed to include “objects, processes, features, locations or events” (Wilby 2020, 138) – knowing the other perceives it too. With the schematic image of the triangle, we imagine two lines reaching out to the object from each agent’s perspective, representing their respective lines of vision, plus a third ‘horizontal’ line representing the connection between the agents themselves.Footnote 8 The feature represented here by the horizontal line, the openness or transparency of JA, separates it from several adjacent but distinct phenomena. (i) In parallel attention, two agents may be perceiving the same thing at the same time without being affected by the presence of the other. E.g., in a swimming relay, the returning swimmer touches the pool wall before her teammate enters the water; the former attends to this location to touch it as quickly as possible, while the latter attends to the same location in expectation of her cue to start, but their attention is not coordinated and need not make reference to what is going on in the other’s mind. (ii) In gaze-following, one agent follows another’s gaze to see what they are attending to, but the latter need not recognise that they are being thus observed, as when Alexander-Arnold noticed that “[all] the Barcelona players were not concentrating and weren’t looking at the ball” (Banks 2021). (iii) In social referencing, similarly, one party tracks the other’s affective reactions but the latter may be unaware of this. E.g., a young substitute observes her coach’s emotional reactions to events on the field.
In none of these cases are agents participating in a genuinely shared experience in the transparently reciprocal sense represented by the horizontal line in the JA triangle. It is essential to JA not just that agents are aware of the same object, but that they are mutually aware of one another’s awareness of that object. This crucially involves processes aimed at monitoring and controlling one another’s attention: e.g., in a 4 × 100 m relay race, the incoming runner holds out the baton and uses gestures and verbal calls such as ‘hand!’ or ‘stick!’ to indicate preparedness to execute the handover. Both runners attend to the baton while monitoring the other’s speed to ensure a smooth transition, only passing it over when they are ready and well-positioned. While scholars widely agree that mutual transparency or openness is essential to JA, this has proven to be the most puzzling and contentious feature to explain. The next section introduces the theoretical debate between ‘knowledge-based’ and ‘relational’ approaches in the philosophy of JA. Looking to the sporting context helps illustrate how JA works in the real world, providing evidence against which to assess the theoretical claims of these competing approaches.

3 Joint Action and the Regress of Common Knowledge

The main division between rival camps in the philosophy of JA lies between relational and knowledge-based accounts (KB). On the latter, the crucial transparency of JA can be explained through a reductive form of analysis which supposes that the states of individual parties to an episode of JA can be described “without this already implying that there is joint attention involving [each individual] and another” (Campbell 2002, 161). On this view, the states of each agent are constitutively independent of one another and JA arises by aggregation across a series of individuals whose mental states have matching content because they make reference to the same object or prospective joint activity. In terms of methodology, this approach involves an underlying commitment to ‘constructivism’: a form of analysis which aims to reduce complex phenomena to a series of simpler elements or building blocks (Wilby 2020, 130–131). Relationalists instead favour a ‘clarificatory’ approach according to which certain constellations of important and philosophically puzzling concepts may be irreducible but amenable instead to a method which aims “to explore the conceptual, causal and normative links … between them” (Wilby 2016, 105). Accordingly, relationalists hold that we cannot describe the states of each individual in an episode of JA without “already implying that there is someone with whom [that individual] is jointly attending” (Campbell 2002, 161). Thus, the methodologically-motivated expectation that “the kind of relation that can hold between the psychological states of different people” (Campbell 2002, 175) must be reducible to the psychological states of individuals may be a distorting factor, arising from the imposition by philosophers of a reductive form of analysis upon phenomena whose structure resists it. 
Since, in JA, each ‘element’ of the joint attentional constellation presupposes the whole, Campbell argues that the states of each agent are constitutively interdependent with those of her ‘co-attender(s)’. Thus, the complex triadic relation which links those agents and a third object is understood to be “a primitive phenomenon of consciousness” (Campbell 2002, 161). If processes occurring at the individual level are derivative of this primitive triadic relation, and only explicable in terms of it, constructivist analysis in terms of constituent ‘building blocks’ (the overlapping contents of individuals’ mental states) is misguided. This section explores this theoretical dispute through the sport context, questioning whether, as KB supposes, the difference between the individual and the plural case is of a merely quantitative nature. I conclude, siding with relationalism, that the states and actions of individual co-actors cannot be adequately described without prior reference to others in the full-blown joint attentional scenario. If so, we must begin with this complex social arrangement to understand the contributions of various individuals, and not the other way around.

According to KB, JA is best explained as a kind of “recursive mind reading” (Tomasello 2009, 72) having the following structure: S knows X, A knows X (level 1); S knows that A knows X, A knows that S knows X (level 2); S knows that A knows that S knows X, A knows that S knows that A knows X (level 3) … and so on ad infinitum (Schiffer 1972, 32–33). Thus, for any given level ‘k’ of common knowledge, there exists (at least potentially) a higher level ‘k + 1’ consisting of a situation whereby the relevant agents are mutually aware that the previous level of common knowledge obtains (Wilby 2023). Given the finite processing power of the human brain, this raises worries about empirical plausibility – but since proponents accept that KB produces potentially infinite iterations of common knowledge, but only in the sense that these can be generated by abstraction, the dispute is not whether an infinite regress arises but whether it is indeed vicious. The strongest philosophical arguments against reductionism do not depend on whether recursive mindreading involves the actual performance of infinite mental acts by real human agents. Instead, they rely on the observation that whatever grounds common knowledge must be part of our conscious experience if it is to be “psychologically expedient” (Wilby 2010, 86) in rationalising further thoughts and actions made on its basis. As Campbell (2002) argues, there are two aspects to understanding demonstrative thoughts. We understand the causal role of a demonstrative belief or intention when we are able to identify (i) which particular object or property is the cause of the relevant state and (ii) grasp which other states or actions it may cause. Crucially, JA also has a normative function which concerns whether the relevant states are rationally justified in the sense that the agent has given something a role in thought and action which reflects the properties actually possessed by that object. 
Campbell considers the regress vicious because it undermines the normative role of JA with respect to the rationality of joint action. Given that it is only rational for each agent to perform their part of a joint action if they are aware that the other intends to do likewise, implicit mental states are unsuited for this role: “the reciprocity of the perceptual scenario has to feature in the agent’s conscious mental life if she is to act jointly with the other subject on what both mutually know” (Seemann 2019, 28).
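As a rough formal gloss on the hierarchy of levels described above, the iterations can be written recursively. The knowledge operators $K_S$ and $K_A$ and the labels $L^{S}_{k}$, $L^{A}_{k}$ are notation introduced here purely for illustration; they are not drawn from Schiffer, Campbell or Wilby, and the sketch assumes only the two agents S and A and the content X named in the text.

```latex
\begin{align*}
L^{S}_{1} &= K_S X, & L^{A}_{1} &= K_A X \\
L^{S}_{k+1} &= K_S L^{A}_{k}, & L^{A}_{k+1} &= K_A L^{S}_{k} \\
\text{level } k \text{ obtains} &\iff L^{S}_{k} \wedge L^{A}_{k}
\end{align*}
```

On this notation, level 2 unpacks to $K_S K_A X$ and $K_A K_S X$, and level 3 to $K_S K_A K_S X$ and $K_A K_S K_A X$, matching the verbal statement of the hierarchy. Common knowledge in KB’s sense would then require that level $k$ obtains for every $k \geq 1$, which is where the question of viciousness arises.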

Campbell brings this point home via what he calls “the puzzle of Coordinated Attack” (Campbell 2002, 167). In ‘coordinated attack’ counterexamples to KB, two or more agents must act jointly on a single target for an optimal payoff, but it is difficult to see how they can ever achieve this given a lack of perfect knowledge about their prospective partner’s intentions. Cases are described such that joint action is obviously rational, because cooperation leads to a big payoff while defection spells disaster, but how such cooperation is possible remains puzzling. Consider the following coordination problem faced by footballers executing a pass, simplified so that only two options are available for each player.Footnote 9 De Bruyne can either play the pass (1) short or (2) long, and Batshuayi can either run (1) short or (2) long. There are thus three possible outcomes: (a) coordination failure: De Bruyne passes long while Batshuayi runs short, or vice versa, and they lose possession (“if he stops, it’s a silly pass for me. Like ‘what did he see?’” (Winner 2011, 26)); (b) intermediate outcome: both go short and they retain possession, albeit in a non-threatening area; (c) optimal outcome: they successfully complete the longer pass and Batshuayi is through on goal. Wilby argues that the role of JA in joint action comprises three distinct but complementary functions which any theory of JA must accommodate. I introduce these next, before returning to our example to explain how KB’s infinite regress of mental states is vicious because it undermines the crucial ‘transparency function’ of JA in joint action.
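The simplified coordination problem can be laid out as a two-by-two outcome matrix. This is only a schematic restatement: the labels (a), (b) and (c) are those used above, the ordering symbol $\prec$ is introduced here to mean ‘is a worse outcome for the pair than’, and no numerical payoffs are assumed.

```latex
\[
\begin{array}{l|cc}
 & \text{Batshuayi: short} & \text{Batshuayi: long} \\
\hline
\text{De Bruyne: short} & (b) & (a) \\
\text{De Bruyne: long} & (a) & (c)
\end{array}
\qquad \text{with } (a) \prec (b) \prec (c)
\]
```

Read this way, each player’s best choice depends entirely on what the other does: going long is optimal only if the partner also goes long, which is why neither has a reason to act that is independent of the other’s readiness.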

First, (i) the ‘plan execution function’ links the moment of performance to a shared plan or intention framed in advance. The footballers in our example are part of a team that aims to win and they share a tactical plan to this end. Their training enables them to perceive their surroundings, as it were, ‘through the team’s eyes’: in terms of “a cultural pillow or frame of shared and public meanings” (Muntanyola-Saura and Sánchez‐García 2018, 434) without which players might simply remain “unmoved” (Wilby 2023, 141) when the moment for joint action arrives. A clarification here may help alleviate the worry that the plan execution function seems to squeeze out any role for the kind of spontaneity that is surely central to skilled performance in sport.Footnote 10 Alexander-Arnold explains that his quick corner actually ran counter to the team’s gameplan, based on his and Origi’s reaction to an opportunity that cropped up unexpectedly: “[i]t just happened … We never trained it. I wasn’t really even meant to take the corner. It wasn’t a routine. I wasn’t playing a trick. I was actually walking away because I was meant to stand somewhere else […] it just fell into place” (Banks 2021). Indeed, the “sweet tension” (Kretchmar 1975, 26) of uncertainty which drives the dramatic tension of sport depends on the resistance of the present moment to the control coaches strive for in their best laid plans. Alexander-Arnold’s manager, Jürgen KloppFootnote 11, recognises this too: “[y]ou cannot tell the players: stand here, and if this happens, run there. Instead, you have to train the impulse…” (Connolly 2021).Footnote 12 This does not mean, however, that there is no role in such joint actions for the plan execution function in Wilby’s sense. The relevant kind of improvisation depends on alignment in terms of which features of a scene stand out as salient for the co-actors. 
The players could not have anticipated a situation with these precise features, but they are constantly on the lookout for situations of certain types, e.g. situations where players on their own team are unmarked. Footballers are constantly engaged in a process of visual search or ‘scanning’ whereby they monitor their surroundings for opportunities and threats.Footnote 13 This, on Wilby’s account, involves “two factors … a top-down cognitive procedure that is looking to identify”, e.g. unmarked strikers, “and a bottom-up perceptual procedure that presents the subject(s) with an object that looks like” an unmarked striker (Wilby 2023, 151). While agents use scanning to continually update their information about token particulars, and new information often necessitates spontaneous adjustment or a change of plan during implementation, this is best understood as improvisation on a previously established ‘theme’: thus, the plan execution function does not exclude but facilitates intelligent spontaneity in joint action due to its role in bridging between higher and lower levels of coordination.

This interdependency across cognitive levels works both ways. In the absence of JA at the opportune moment, no prior plan could adequately specify the “temporal-spatial range” (Wilby 2023, 8) for the required token acts, because prior plans are necessarily general but action pertains to token particulars perceptually present in our immediate surroundings. Plans are thus dependent for their execution on (ii) the ‘referential function’ of JA, which provides a link between the agents and the relevant token particulars: “demonstratively identifying the specific features of the environment that can act as particular instantiations of the elements that figure in prior plans in only a general way” (Wilby 2023, 138–139). It is vital that the co-actors both recognise the relevant opportunity when it arises. They are able to secure this mutual recognition because of the way in which each agent’s perception of the scene is “embedded within a conceptualized shared plan that is structured with the purpose of identifying (and consequently acting upon)” situations of this kind (Wilby 2023, 151). While the referential function concerns the link between co-actors and things in the world that are the objects of their attention, (iii) the ‘transparency function’ concerns the link between action partners as represented by the horizontal line in the JA triangle. Mutual awareness is necessary here because “the agents are justified in acting in a coordinated way only on condition that the other is … neither agent has a unilaterally decisive reason to do their part” (Wilby 2023, 138). This horizontal link may be confirmed via a ‘checking look’ whereby agents register one another’s readiness to act: “we made brief eye contact and I could see that he was going to do something instinctively” (Origi quoted in Dutton 2020). 
This spark of mutual recognition completes a chain linking the two agents, their shared plan and the salient perceptually present particulars like an electrical circuit which powers the ensuing joint action.

All three functions are essential to the successful performance of joint actions in sport but the regress chiefly concerns transparency. Let us try analysing the situation of the two footballers in KB’s terms. Obviously De Bruyne and Batshuayi must be attending to the same opportunity to appreciate where and when to act. Now, suppose both players do recognise the salient gap between defenders. De Bruyne intends to play the long ball and Batshuayi intends to make the corresponding run. This is still not enough, however, to make joint action rational because each must additionally know that the other is in the requisite state of readiness. If De Bruyne intends to play the long ball but thinks Batshuayi does not recognise this, it no longer makes sense for him to play that pass. De Bruyne’s reason for action is conditional on Batshuayi’s readiness and vice versa. Thus, joint action depends on a higher level of awareness with the previous ‘level one’ as its content – but that still does not suffice. Suppose De Bruyne and Batshuayi are both attentive and ready to execute the optimal pass. De Bruyne still needs to know that Batshuayi knows that he knows that Batshuayi is ready. If De Bruyne thinks that Batshuayi does not realise he will play the long ball, he will not expect Batshuayi to make the run in behind – and if he does not expect Batshuayi to make that run, it makes no sense to play the pass. So they need a higher state of mutual awareness, ‘level three’, with ‘level two’ as its content. In fact, for any imaginable level “k” of mutual awareness, there is always a higher level “k + 1” which must be secured to give them reason to act jointly (Wilby 2023, 140).
Since these iterations are potentially infinite, the meeting of minds needed to motivate and rationalise joint action is never finally secured and, “epistemically speaking, the agents are in no better a position” (Wilby 2023, 140) than if they had failed to recognise the relevant opportunity in the first place. Common knowledge fails to explain how two or more agents ever reach the point of initiating a joint action. ‘Coordinated Attack’ counterexamples show that KB’s infinite regress of conditional mental states undermines our reasons for acting when the moment comes. The regress is vicious after all.

4 Collective Affordances in Team Sport

The ‘objects’ of JA should be understood broadly. According to influential Italian coach Arrigo Sacchi, a footballer’s actions should be a function of “four reference points: the ball, the space, the opponent and his teammates” (Wilson 2018, 366). Clearly, these four elements are tightly interrelated; e.g. the position of the ball, insofar as it bears on prospective action, just is a matter of its location in relation to teammates, opponents and space. Thus, JA in sport need not be a matter of ‘focused attention’ on a single object or property, as in the simplified model of the triangle, but frequently involves ‘distributed attention’ across several objects or properties at once (Muntanyola-Saura and Sánchez‐García 2018). The relevant properties are not merely physical; they are relational, in the sense that they are individuated in terms of players’ capabilities to act upon them. In ecological psychology, such perceived possibilities for action, which objects and situations in the environment present to an agent, are called ‘affordances’ (Gibson 2015 [1979]).Footnote 14 In joint action, co-actors perceive their environment in terms of properties individuated relative to the action capabilities of the group. JA plays a crucial role in securing perceptual access to such ‘collective affordances’ (Weichold and Thonhauser 2020). According to Seemann’s helpful characterisation: “[t]o be attending to an object or state of affairs is to understand the causal properties of the thing, through a perceptual event, in a way that puts you in a position to act upon it” (Seemann 2011, 191). To be jointly attending additionally requires that each agent involved “be causally sensitive in this way to the other’s focus of attention and behaviour” (Seemann 2011, 199). ‘Co-attenders’ reciprocally ‘monitor’ and ‘control’ one another’s attention: “you and I each keep track of what the other is attending to, so that we both work to ensure we attend to the same thing” (Campbell 2002, 162).
Players exert such influence on each other’s attention via gaze cues, postural cues, gestures, pointing and verbal cues, which “often function less as direct instructions than as context-sensitive nudges to adjust action” (Sutton and Bicknell 2020, 197).

To act effectively, footballers must be alert to affordances of several kinds (Fajen et al. 2008): for themselves: “[t]he ball is under my feet so I can’t really have a good, full swing at it. The only way is to chip” (Winner 2011, 28); for opponents: “… you know where the defender will be and that his knees will be bent a little, and that he will be standing a little wide, so he can’t turn” (Winner 2011, 23); and for teammates: “… you have to pass them the ball and do it in a way that they don’t have to do a lot to score” (Winner 2011, 25). It is important also to note the inevitable perspectival asymmetry between agents in JA: no individual can see all sides of the object and yet they each understand that they are perceiving the same thing from different angles. This asymmetry is essential to sporting challenges where players act intelligently based on awareness of what is perceptually available to others, in contexts where it is frequently expedient to disguise one’s intentions. This gives rise to the “cognitive juggling act” (Strachan et al. 2020, 374) of simultaneously striving to make one’s actions more predictable for one’s teammates and less predictable to the opposition – e.g., consider the disparity between a player’s proprioceptive knowledge of her own bodily movements versus her opponent’s observational knowledge of the same: “[w]e are both going one way but … I’m the only one who knows I want to go somewhere else” (Winner 2011, 27). Interplay between what lies open to view and what is concealed underlies many core sporting skills: e.g. a boxer may strategically ‘telegraph’ certain punches to misdirect her opponent’s attention, eliciting a reaction which makes room for the subtler unexpected blow. 
If Campbell is right that “the role of conscious joint attention will be to secure … conscious access to the objectives of our interactive tasks, whether they are collaborative or competitive” (Campbell 2002, 174), the role of JA in sport will not be limited to cooperative interactions among teammates.Footnote 15

The set of properties causally relevant for joint action in teams differs from the set of properties causally relevant for individual action because the causal powers of the team are different from those of the individuals that compose the team. Affordances for the team will be individuated accordingly and the objects of JA will be a function of opportunities for joint action available to the team. Members of joint action partnerships, moreover, monitor and control one another’s attention so that all are ‘tuned in’ to appropriate features of the environment and can collaborate smoothly. Such ‘collective affordances’ include those implicated in what the great Soviet coach Valeriy Lobanovskyi called “coalition actions” (Wilson 2018, 266). Consider a coordinated ‘pressing’ action: a defensive movement in which players try to close off space and “win the ball back quickly, high up the pitch, when the opposition is disoriented” (Wilson 2018, 450), a tactic so effective that Klopp calls it “the best playmaker in the world” (quoted from Wilson 2018, 450). A press must be initiated in a synchronised manner and in significant numbers because a partial or disjointed press leaves gaps for the opposition to play through: “[a]t a certain moment the entire team needs to decide to press NOW and we all move accordingly”.Footnote 16 Players recognise opportunities to launch “the hunt” by jointly attending to tactically salient collective affordances called ‘pressing triggers’: cues that indicate a good chance of “trapping” opponents deep in their own territory.Footnote 17 Pressing triggers may include an opposition player receiving the ball with back to goal or without scanning for danger; receiving a backwards or slow pass; receiving the ball in a certain targeted area of the pitch; being less skilled or confident in possession; taking a poor first touch or hesitating on the ball (Desmond 2022).
These cues also depend on the condition of the protagonists themselves: “actors must consider the affordances and limitations of their co-actors, which allows for sophisticated coordination and distribution of tasks” (Strachan et al. 2020, 370). This reflexive understanding of the group’s joint action capabilities is continually recalibrated during implementation as teammates jointly respond to “shifting situational, physiological, and psychological features within the performance context” (Bicknell 2021, 611), engaging a collective sense of “[p]rospective awareness [which allows them to] anticipate the efficacy of actions in [their] immediate future” (Bicknell 2021, 596). Like pilots in formation, co-pressers must stay connected and be prepared to adjust speed, positioning and angle of attack to ensure the most effective ‘swarming’ of opponents. Players must monitor and control the direction of their teammates’ attention and allow the direction of their own attention to be reciprocally influenced thereby.Footnote 18 Co-actors also continually adjust to a range of factors which can render their “sense of prospective agency” (Bicknell 2021, 611) variable and vulnerable. These may include physical factors like fatigue;Footnote 19 psychological factors like lack of confidence;Footnote 20 social factors like strained relationships (King and Rond 2011); or situational factors like the current score.Footnote 21 Teammates’ mutual awareness of one another’s attention to a suitable pressing trigger thus plays a causal role in the aetiology of a ‘coalition action’ like pressing. This includes a reflexive sense of the readiness of one’s co-actors without which they would lack reason and motivation to launch a coordinated attack. Each recognising when the others have ‘cottoned on’, the members of a well-drilled team exhibit extraordinary fluency in the simultaneous and reciprocally controlled performance of tactically optimal ‘coalition actions’.

5 Rich Relationalism and the Objects of Joint Attention

The relationalist account defended above requires further refinement for application to these complexities of the sport context. Campbell’s version of relationalism faces difficulties regarding its construal of the ‘objects’ of JA. Campbell is committed to what Wilby (2023) calls a ‘lean’ version of relationalism, the key differentiating feature of which is its ‘extensionalism’. Extensionalism about the objects of perception says that two agents perceiving the same object “are bound to have experiences with the same phenomenal character” because “the phenomenal character of the experience is constituted by the layout and characteristics of the very same external objects” (Campbell 2002, 116). Campbell argues that the joint attentional relation is extensional in the sense that it is not sensitive to “the ways in which” (Campbell 2011, 424) each agent experiences the object in question. Wilby defends a contrasting ‘rich’ version of relationalism which holds that “the way in which each of the agents is experiencing the object does matter to the individuation of the shared experience” (Wilby 2023, 147). On this view, two agents may experience the same object while their respective experiences of it differ. Here, I argue that rich relationalism is the most promising account for understanding the role of JA in the sport context.

Imagine an experiment where participants must launch a coordinated attack to shoot any rabbits that appear on a screen but not any ducks, and are presented with ambiguous targets in the form of Jastrow’s duck-rabbit illusion.Footnote 22 Although the agents both see the same object (in the sense that there is no difference in the ‘layout and characteristics’ of ‘external objects’ before them) it is in these circumstances only rational to launch an attack if they each additionally see the figure as a rabbit and can be sure that their partner also sees the figure as a rabbit. If there is divergence in terms of the perceived “aspectual shape” (Wilby 2023, 147) under which the object appears to each agent, it is not rational for either of them to launch their part of the attack. Here, the properties that constitute the object as a collective affordance can only be individuated by reference to the protagonists and their shared goals; they are not perceptually identifiable under just any aspect or description. Players could not coordinate successfully by attending to “action-neutral physical properties of the environment” (Fajen et al. 2008, 100); they must jointly focus on aspects that, in light of their training and common purpose, “are meaningful, and provide information about how to control activity so as to achieve behavioural goals” (ibid.). On this view, the relevant properties are ‘intensional’ to the context of this shared activity or “joint engagement” (Seemann 2011, 183). The availability of such ‘aspects’ is keyed to the perspective of skilled agents equipped through experience “with the capacity to see similarities, to make discriminations, and find saliences in things” (Luntley 2003, 84) as required for excellent team performance in the relevant sport. A pressing trigger cannot become the target of a joint action unless grasped as such by the relevant co-actors. 
Coordination failure results when two players perceive the same event which one recognises as a good opportunity to press (sees under the aspectual shape of a ‘pressing trigger’) but another (say, a new recruit to the team) sees under a different aspectual shape. Not all situations having the same ‘lean’ characteristics are equal with respect to their status as pressing triggers. Note that matters here are further blurred because defenders may use tactics – such as deliberately hesitating on the ball – to ‘bait’ opponents into pressing, exposing gaps in their defensive structure. That is to say, these triggers are really rules of thumb which are defeasible and highly context-sensitive. Determining whether enough of the relevant features are in play at a given moment to warrant pressing as a team requires a joint exercise of skilled practical judgement resulting in convergence on the ‘ways in which’ the relevant features of the environment are experienced. As Wilby argues, such examples show that something more than a merely extensional relation to the object of JA underpins its normative role in making joint action possible. An affordance for joint action in sport is “not a feature of the world with which we are presented, but something we establish, something we make happen” (Eilan in press, 15) when we successfully coordinate our responses through JA in light of a skilled and context-sensitive grasp of salient features of the sporting environment.

6 Joint Attention and Sport-Specific Kinds

This section explores a possible application of the foregoing account of the role of JA in sport to explain how token instances of sport-specific kinds, like baseball’s strike zone, are mutually recognised by players and officials. Playing ‘with’ rather than ‘alongside’ others involves a single token action jointly produced by various agents: “one token action, many participants” (Schmid 2018, 232). The kind of ‘agreement’ needed for playing sport together must pertain not just to the general level at which formal rules define sport-specific action-types but must influence lower levels of alignment determining how token objects and events are experienced and taken up in action (Tollefsen 2002; Wilby 2020). JA plays a crucial role here because of its referential function in securing perceptual access to the “objectives of our interactive tasks, whether they are collaborative or competitive” (Campbell 2002, 174). I argue next, focusing on the example of the ‘strike zone’ in baseball, that rich relationalism can explain how players and officials achieve alignment concerning the ‘ways in which’ they experience token objects and events instantiating the types mentioned in the rulebook.

Baseball’s strike zone is the volume of space through which the ball must pass to be eligible to be called a ‘strike’. The strike zone is officially defined as “the area over home plate from the midpoint between a batter’s shoulders and the top of the uniform pants – when the batter is in his stance and prepared to swing at a pitched ball – and a point just below the kneecap” (MLB 2023). (I take it that this is intended as an ‘extensional’ description of the strike zone.) The content of this “paper rule” (Berman and Friedman 2021, 377) notwithstanding, the applied strike zone rarely matches its official formulation: it is “in fact a relatively free-floating area subject to each plate umpire’s authoritative interpretation” (Lewandowski 2015, 46)Footnote 23 – typically smaller than indicated in the rulebook (favouring batters), and varying across different umpires and circumstances.Footnote 24 The umpire’s task is to determine “whether it is to the pitcher’s credit that the batter couldn’t hit the pitch, or whether it is the pitcher’s fault that he threw a pitch the batter could not reasonably be expected even to try to hit” (Noë 2019, 29). Thus, the strike zone is not an extensionally defined physical location but “a zone of responsibility” (ibid.), and players must “continuously reflexively monitor and adjust their actions to a particular umpire’s practical interpretation of that zone” (Lewandowski 2015, 46).

The strike zone presents distinct but interrelated affordances to the various agents involved in this complex interaction. The umpire’s task is to deliver a verdict as to whether the ball is in or out of the zone. The batter’s ‘strike zone awareness’, however, must be expressed through motor action as it informs whether and how she swings. Similarly, the pitcher’s actions are determined by her perception of this space, as she typically aims the ball just within the legal boundary to present the toughest legal challenge to the batter. JA facilitates this complex interaction by enabling the protagonists to orient themselves towards the same sport-specific object, where some degree of perceptual alignment is essential for the contest to function. Players do not just strive to conform to the extensional description in the rulebook but must jointly attend to the zone enforced by the umpire on that particular occasion. Similarly, determining the zone’s boundaries requires the umpire to attend to the players’ perspectives on the scene. In particular, she must assess, with respect to the batter’s perspective, whether the pitch presents an affordance to swing (with a reasonable chance of success) or not. This task is complicated as players often try to influence the umpire’s perception: e.g. on borderline pitches, catchers may ‘frame the pitch’, “choreograph[ing] their catching movements … to make it seem as if they took the ball in the strike zone” (Papineau 2017, 83–84). In such cases, widespread in sport, a player demonstratively identifies something as an instance of ‘x’ for the benefit of officials. If mutual recognition is secured, the official goes on to “close the triangle” of JA in a “communicative act” which makes her judgement public (Moll 2023). 
The umpire’s rulings are the most obvious way to get players to recognise ‘her’ strike zone but she may also use other means like verbal cues and gestures to help players recognise, and calibrate their actions in accordance with, the spatial boundaries she intends to establish. Conversely, players and coaches engage in advocacy aimed at ‘priming’ the umpire to favour a certain interpretation. While the umpire has ultimate authority over the location of the strike zone, the game could not function were such judgements not comprehended and to some extent shared by players. In the absence of such “perceptual common ground” (Sebanz et al. 2006, 70) between players and officials, rule-enforcement would lose all legitimacy and the game would collapse. JA is essential to establishing this common ground, enabling each agent to direct their role-specific actions towards the same publicly recognised objects. As Seemann argues, agents in the joint attentional constellation are together responsible for constituting the (intensional) ‘social space’ within which the object of their attention is mutually recognised: “what each of us knows perceptually about the scene is to be explained in terms of a spatial arrangement in which the location of things is determined relative to each of our respective standpoints” (Seemann 2019, 75).

7 Conclusion

In this paper, I have argued that the phenomenon of JA plays a central but often-overlooked role in (especially team) sport. Philosophical work on JA informs our understanding of sport, while sport helps illustrate the workings of JA and provides a valuable testing ground for rival theories of the phenomenon. I rejected knowledge-based accounts because they are subject to a vicious regress and ‘lean’ relationalist accounts because joint sporting actions depend on a shared grasp of particular ‘aspects’ of sporting objects, where this cannot be explained in ‘extensional’ terms. I defended instead a ‘rich’ relationalist account as best suited to elucidate the workings of JA in this context. JA plays a crucial role at the ‘interface’ between higher and lower-level cognitive processes, allowing us to pick out token features of the environment salient to the shared goal of playing sport together. In addition to providing a useful lens for understanding ‘coalition actions’ like pressing in football, this approach provides resources for understanding the constitution of the sort of rule-governed ‘social space’ within which competition unfolds. Future research might deepen and extend these findings by investigating the role of JA in the teaching and learning of sport-specific concepts, rule-following and officiating, deceptive actions, fans’ experience and competitive or antagonistic contexts. This article has introduced an array of fascinating yet underexplored connections between JA and sport which deserve further dedicated study for the reciprocal illumination of both topics.