Introduction

The sense of agency refers to the feeling we have of being in control of our deliberate actions. We have it when we perform deliberate actions, and we do not have it when we generate movements that are not deliberate, such as reflexes or twitches. The sense of agency is surprisingly fragile; it is not always a reliable guide to whether our actions harmonize with our intentions. There is a range of conditions in which the sense of agency arises when it should not, as well as conditions in which it does not arise when it should. The main claim of this article is that the operating systems, apps, and input hardware on our electronic devices create conditions in which the sense of agency is likely to accompany actions that are not genuinely intentional. In other words, there are times when we feel as if we are in control of our clicks and our swipes, when in fact we are not. Rather than being in control, we are automatically reacting to stimuli in more or less predictable ways. Considering the time increasingly spent interacting with our devices (10 h a day, not including work, on one estimate; see Footnote 1) along with the range of real-world actions that we can perform using them, my thesis may have implications for the future of human autonomy.

Before beginning, I’d like to situate this article in the larger context. In her recent book, Shoshana Zuboff [1] demonstrates that the overall goal of those with power in Silicon Valley is to predict human behavior on a large scale by manipulating individual human behavior. She calls for mass social change in order to prevent a scenario in which the general public is being controlled by a few corporations wielding tremendous power.

But there is at least one understandable reason why someone might not worry about Zuboff’s thesis and instead complacently go along with the status quo. The reason is as follows: it does not usually feel to us as if we are being manipulated through our devices. The best reply to this kind of complacency is a main message of this article, the message that subjective feeling is not always a reliable guide to the causes of our actions. It is possible to feel as if we are in control despite the fact that we are not.

Zuboff makes the case that the theoretical foundation of Silicon Valley’s behavior modification project is the set of behaviorist techniques of B. F. Skinner, with superficial alterations ([1]: chapter 12; [2]). Her message is powerful because those techniques, such as conditioning and nudging, are effective. Contemporary work in psychology has moved beyond the behaviorist paradigm and offers additional insight into the causal factors at play during intentional action. This recent work, presented below, identifies the conditions under which we are most likely to have an illusion of agency, to feel control over an action that is automatically triggered by the environment. Screens on mobile devices create precisely those conditions.

Here is an outline of the article. In the following section, I make the case that the sense of agency is fragile by using examples that show illusions of agency occurring under both pathological and non-pathological conditions. These examples raise two questions. The first question is: what are the causal mechanisms at play during intentional action? I answer this question in the section below titled "The Supervisory-Inhibition Model of Action." The second question is: how is the sense of agency generated? I answer this question in "Two Mechanisms for Generating the Sense of Agency" by presenting the two cues involved in generating the sense of agency. The first cue is predictability and the second cue is fluency. In the fifth section of the article I turn to existing guidelines for user interface (UI) design. Those guidelines explicitly encourage designers to create apps that are predictable and fluent, apps that cue for the sense of agency. In the sixth section of the article, I present evidence suggesting that many users engage their smartphones without the sort of intentional supervision required for genuine agency. Such patterns of engagement are likely accompanied by an illusory sense of agency. There I introduce the new concept of Digital Environmental Dependency Syndrome (DEDS) as a possible way of characterizing extended use of the smartphone without genuine (but with illusory) agency. In the seventh and final section of the article, I offer reasons and strategies for addressing illusions of agency in human–computer interaction. There I suggest ways that future research might inform the design of apps that reduce the likelihood of illusions of agency.

Illusions of Agency

In the preceding section, I made the claim that the sense of agency is fragile. Here are some examples of scenarios in which the sense of agency does not arise as it should. I will begin with the pathological cases and then present some of the scenarios that induce non-pathological illusions of agency.

Pathological Disorders of Agency

a) Schizophrenia

A common symptom of schizophrenia is to have delusions of control. Patients perform actions, but do not have a sense of agency for those actions. As a result, they form the false belief that someone or something is controlling them [3, 4].

b) Depersonalization Disorder

One symptom of this disorder is a loss of the sense of agency for self-generated actions. Unlike schizophrenics, individuals suffering from depersonalization do not form false beliefs about the causes of their actions. That is, they maintain the true belief that they are controlling their own bodily movements, but report losing the feeling of being in control. They might report feeling as if they are a robot or an automaton [5–7].

c) Anarchic Hand Syndrome

This condition involves complex goal-directed actions of an upper limb that are not intentional and may even conflict with the intentions of the patient. Patients cannot inhibit these actions and feel no sense of agency for them [8, 9].

d) Utilization Behavior

When objects are presented to patients with this disorder, the patients grasp and use the objects, even when it is socially inappropriate to do so and even when they are explicitly instructed not to do so [10, 11]. For example, one patient was presented with one pair of eyeglasses after another and put all three pairs on his face, one on top of the next. The patients seem to have a sense of agency for these actions. When asked why they perform the actions that are triggered by objects in the environment, they give vague responses, claiming that they believed the examiner wanted them to perform those actions: “ʻYou held them out to me, I thought I had to use them’” ([11]: 251).

e) Imitation Behavior

This disorder is similar to Utilization Behavior in that patients seem to respond automatically to a feature of the environment. Instead of responding to artifacts, patients with Imitation Behavior will imitate the gestures of the examiner. Again, patients seem to have a sense of agency. They report thinking that they were supposed to imitate [12–14].

f) Environmental Dependency Syndrome

This syndrome describes behavior that occurs when Utilization and Imitation Behavior combine such that the patients spontaneously play a role solicited by their environmental context. This syndrome refers to complex behavior, while the previous two disorders refer to simple actions [13, 14]. For example, when presented with medical equipment, one patient started playing the role of a physician, taking blood pressure and so on. Two patients were brought into a room with a buffet and about 20 other people. The patient from the higher social background immediately started behaving as a guest, while the patient from the more modest social background started behaving as a hostess. As above, the patients seem to have a sense of agency, reporting that they felt a duty to respond as they did to the environment. Overall, the behavior of these patients seems to be entirely driven by the environment, as they exhibit “mental inertia and apathy” when not stimulated by environmental affordances ([13, 14]: 342). In a more recent case study, a patient was taken to the hospital bar and began to take orders for drinks, claiming that he was on a “two-week trial” for the job of bartender. The same patient claimed (falsely) to be a chef in charge of preparing special dishes for patients when taken to the hospital kitchen [73].

The final three disorders listed above are all associated with frontal lobe damage. Among the pathological cases, they are perhaps the most relevant for the topic of this article. One main goal here is to raise the serious possibility that our electronic devices can cause a kind of Digital Environmental Dependency Syndrome (DEDS). I will return to this possibility in "No Supervision" and "Consequences and Solutions" below.

Non-pathological Illusions of Agency

a) Ideomotor Actions

In the nineteenth century, there was great interest in a variety of purportedly supernatural phenomena such as table turning, divination using a rod (also known as dowsing), and planchette writing (as in a Ouija board). William Carpenter [15, 16] introduced the ideomotor theory of action as a naturalistic explanation for these phenomena. According to this theory, merely thinking about an action can cause one to perform it. If the conditions are right (such as during a Ouija board séance), we can perform actions without having a sense of agency for those actions. Ideomotor actions have been observed under experimental conditions [17, 18]; for references and discussion, see ([19]: chapter 4).

b) Developmental Illusions of Agency

A number of different experiments have found that children under the age of 5 can have difficulty distinguishing whether or not the actions that they perform are intentional. Children of 3 and 4 years claim that they have acted intentionally for reflex movements and for passive movements in which the arm is moved by the experimenter [20–22]. Another study gave 4-year-old children the task of distinguishing voluntary from involuntary action in a video. The children responded incorrectly that all of the actions were voluntary [23].

c) The “I Spy” Scenario

This classic experiment involves a square wooden board attached to the top of a computer mouse. The subject and a confederate place their hands on the board, as one does with the planchette of a Ouija board, in order to control the cursor on a screen visible next to them. The screen has images of lots of small objects and both participants are instructed to move the cursor and then stop it on an object after a short interval. Both the subject and the confederate wear headphones through which the subject hears music and some words. The subject thinks that the confederate also hears music and words, but in fact the confederate only hears instructions from the experimenter. These conditions generate an illusion of agency in the subject by playing words through the headphones of the subject shortly before the confederate stops the cursor on the object whose name the subject hears. For example, the subject may hear the word “swan” right before the confederate stops the cursor on the swan. Subjects falsely report that stopping the cursor on the image of a swan was what they intended even though the location of the stop was determined by the confederate [24].

d) Human Error

In his fascinating study on the varieties of human error, James Reason identifies a kind of error that he refers to as “double-capture slips” ([25]: 68–71). These errors involve “double” capture because attentional resources are captured as well as automatic motor responses. Attentional resources are captured by an internal thought or external distraction while motor responses are captured by environmental affordances. Attentional supervision fails to inhibit automatic motor response. Here are some examples that Reason takes from diary studies:

“We now have two fridges in our kitchen, and yesterday we moved our food from one to the other. This morning, I repeatedly opened the fridge that we used to have our food in.”

“I intended to stop on the way to work to buy some shoes, but ‘woke up’ to find that I had driven right past.”

“I meant to take off my shoes, but took off my socks as well.”

“I was putting cutlery away in the drawer when my wife asked me to leave it out, as she wanted to use it. I heard her, but continued to put the cutlery away.” ([25]: 70)

These examples demonstrate that non-pathological (though erroneous) behavior can sometimes be driven entirely by environmental affordances, just as in pathological cases of utilization behavior. Since no one reported feelings of being externally controlled during the error, the sense of agency seems to be present in each of these cases even while the agent is not doing what he or she intends to do. Thus, these everyday cases of error provide additional examples of the illusion of agency. Since many readers can relate to these examples, we might note that the illusion of agency is not a rare occurrence. Also note that these examples fit especially well with the supervisory-inhibition model of action covered in the following section.

The examples listed above all suggest that the sense of agency is not always a reliable guide to the causes behind an action. We can perform actions for which we feel no sense of agency (Schizophrenia, Depersonalization, Anarchic Hand Syndrome, ideomotor actions). Also, we can feel a sense of agency for actions that we do not perform deliberately (developmental illusions), for actions that we do not perform at all (“I Spy”), and for actions that are automatically triggered by the environment (Utilization and Imitation Behavior, and human error). The discrepancy between the sense of agency and the causes of action leads to two distinct questions in the empirical psychology of action. First, what are the causal mechanisms at play during intentional action? This question arises because we can no longer naively assume that actions are simply caused by the agent’s intentions. Second, how is the sense of agency generated? I will address the first question in "The Supervisory-Inhibition Model of Action" and then turn to the second question in "Two Mechanisms for Generating the Sense of Agency."

The Supervisory-Inhibition Model of Action

Acting in the world requires a delicate balance between responding to the affordances of the environment, on one hand, and striving towards goals that are not immediately available, on the other. For example, in the middle of a fast-paced basketball game, the skilled player must respond to the dynamics on the court, to changing environmental affordances. In contrast, when reflecting in solitude on how to resolve a complex social conflict among one’s peers, it is ideal to turn one’s attention away from the immediate environment. This distinction between the temporally immediate versus distant objects of intentional action is well-known in the philosophical literature on the topic [26, 27, 68]. Note that the distinction need not rely on highly skillful activity. For example, mundane activities such as dusting one’s house or navigating a sidewalk can be described as more or less automatic responses to the affordances of one’s environment.

In empirical psychology, some of the most influential models of the causal dynamics of intentional action are based upon this distinction. According to these models, which I will present below, we balance immediate environmental affordances against long-term goals through the inhibition of action by some supervisory mechanism. Perceiving environmental affordances activates the motor routines that would enable us to act upon those affordances. Seeing a teacup activates the motor routine of grasping the cup in the normal way. When things are going as they should, we are able to inhibit the execution of that motor routine if it would be inappropriate or otherwise undesirable to pick up the teacup.

An early version of the supervisory-inhibition model can be found in William James, who cites Hermann Lotze as an influence. Above, I introduced the explanation of purported supernatural phenomena by appeal to ideomotor actions. In his treatment of the will, James suggested that ideomotor actions are merely the “normal process [of acting] stripped of disguise” ([70]: 522). What he means here is that the flow of thoughts in our mental lives always naturally leads to the corresponding action. When thoughts do not lead to the corresponding action, it is because they are inhibited, or, in James’ words, there is a “conflicting notion in the mind” (523).

More recent work has followed James’ general theme while making adjustments to the model and incorporating additional empirical evidence. Donald Norman and Tim Shallice [29] have developed a supervisory model involving motor schemata (also see [28]). Motor schemata are neural representations that can be selected to control action. The basic idea is that perceptual processing can “trigger” motor schemata in order to initiate actions in a more or less automatic fashion. Thus, our automatic actions are driven by what Norman and Shallice call the horizontal thread of processing, which runs (roughly) from perception, to triggering motor schemata, to action. As already mentioned above, not all of our actions are automatic in this way. Sometimes we have to resist the urge to act upon environmental affordances. This fact motivates Norman and Shallice to posit a supervisory mechanism based in conscious attention. The role of supervisory attention, on their model, is to increase or decrease the activation values of competing motor schemata. Conscious attention can be modeled as a vertical thread that serves as a sort of gatekeeper for the horizontal thread, enabling the appropriate motor schemata to initiate action while inhibiting the inappropriate schemata from doing so (see Fig. 1).

Fig. 1 Norman and Shallice’s model of action involving vertical and horizontal processing threads (from [29])

The model by Norman and Shallice can account for ideomotor actions in a straightforward manner by appropriating the main idea from James. Recall that James’ suggestion was that a “conflicting notion in the mind” inhibits ideomotor actions from being conducted. On the model by Norman and Shallice, those conflicting notions are represented by vertical conscious supervision of the horizontal processing thread. Both models make use of inhibitory supervision, but a difference is that the more recent models regard the mechanism of supervision to be supported by activity in the frontal lobes. Norman and Shallice’s model is designed to account for an impressive range of empirical results, especially behavior associated with frontal lobe damage.

Readers are referred to their work for the details, but here are two examples. First, recall utilization and imitation behavior and the related Environmental Dependency Syndrome from above. These types of disorder are associated with frontal lobe damage and seem to involve a deficit in the ability to supervise and inhibit motor schemata. A patient sees a tool, or a gesture, or a social context, and these percepts trigger the relevant motor schemata. Due to the brain damage, which compromises the vertical supervisory thread, the patient is unable to inhibit the triggered actions and thereby behaves in the socially inappropriate ways described above.

Another example given above that can be addressed by the Norman and Shallice model is certain types of human error. Recall the examples of double-capture slips given above, such as the person who reported intending to take off his shoes but took off his socks as well ([25]: 70). On the model under consideration, these errors occur when conscious supervision (the vertical thread) fails to supervise adequately the actions that are triggered along the horizontal thread. Norman and Shallice explain as follows: “a schema that controls an incorrect action could become more strongly activated... than the correct schema and capture the effector systems. The supervisory system, being directed elsewhere, would not immediately monitor this, and a capture error would result” ([29]: 12). This explanation fits nicely with the evidence that capture errors tend to occur when individuals are distracted or preoccupied [25]. The supervisory control mechanism is otherwise engaged and thereby unable to inhibit the undesirable action. As one might expect, frontal lobe damage is strongly associated with deficiencies in error correction ([30–32], all cited in [29]).
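The capture-error explanation can be illustrated with a toy competition among schemata. The following is only a sketch under stated assumptions, not Norman and Shallice's own formalism: the `Schema` class, the `select_schema` function, the threshold, and all numerical activation values are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Schema:
    name: str
    activation: float  # bottom-up activation from perceptual triggering (horizontal thread)


def select_schema(schemata, supervisory_bias=None, threshold=0.5):
    """Return the name of the schema that captures action, or None.

    supervisory_bias maps schema names to an amount added to (positive) or
    subtracted from (negative) the bottom-up activation (vertical thread).
    Passing None models supervision that is absent or directed elsewhere.
    """
    bias = supervisory_bias or {}
    best_name, best_value = None, threshold
    for schema in schemata:
        value = schema.activation + bias.get(schema.name, 0.0)
        if value > best_value:
            best_name, best_value = schema.name, value
    return best_name


# Hypothetical activation values for the shoes-and-socks slip.
triggered = [
    Schema("remove_shoes", 0.7),            # the intended action
    Schema("remove_shoes_and_socks", 0.8),  # assumed stronger habitual schema
]

# Supervision directed elsewhere: the stronger habitual schema captures action.
unsupervised = select_schema(triggered)

# Supervision engaged: the inappropriate schema is inhibited.
supervised = select_schema(triggered, {"remove_shoes_and_socks": -0.5})

print(unsupervised, supervised)
```

On these assumed values, the unsupervised call selects the habitual schema and the supervised call selects the intended one, which mirrors the contrast between a capture error and a successfully inhibited response.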

In addition to James and Norman and Shallice, various iterations of the supervisory-inhibition model receive approval from other influential contributions to the literature. The idea of a supervisory system based in the frontal lobes with the function of monitoring motor schemata, for example, is adopted in Marc Jeannerod’s treatment of the fine-grained neurophysiology of action ([71]: Sect. 5.5). Chris Frith et al. [33] have developed a comparator model of action (see below) that appropriates key elements from Norman and Shallice. On their view, the supervisory inhibitory mechanism is associated with intention formation: “Responses to objects in the environment are normally inhibited until an intention has been developed. The system that develops intentions also inhibits inappropriate responses” ([33]: 1783). Along with utilization behavior, the model by Frith et al. aims to account for optic ataxia, anarchic hand, phantom limb, anosognosia, and delusions of control (see Footnote 2).

Two Mechanisms for Generating the Sense of Agency

Here is a review of the claims that I have introduced so far. In the second section of the paper, I made the case that the sense of agency is fragile. This fact immediately raises two questions about the nature of intentional action. First, what are the causal mechanisms at play during intentional action? I have answered this question in the previous section by sketching the received view of action generation in psychology and cognitive neuroscience. According to this view, which I have called the supervisory-inhibition model, perception of the environment automatically triggers motor schemata that control action responses. Those responses are supervised and can be inhibited by a conscious attentional mechanism that seems to rely on functionality in the frontal lobes. Now in this section we turn to the second question regarding intentional action: how is the sense of agency generated? This question arises because, as demonstrated above, it is not the case that the sense of agency arises if and only if the action is genuinely intentional. We can have a sense of agency for actions that are not intentional and we can perform intentional actions without a sense of agency. There must be some factor other than genuine intention giving rise to the sense of agency.

In fact, several decades of research on the sense of agency suggests that there are two different types of factors or, more precisely, cues at play in the generation of the sense of agency. The first type of cue is based on a comparison between the predicted sensory outcome of an action, on one hand, and the actual outcome, on the other. It is known as the comparator model. When there is a sufficient mismatch in the comparison between the predicted outcome and the actual outcome, there is no accompanying sense of agency. The second type of cue has to do with the mental states leading up to the action. When action selection is fluent, or cognitively effortless, we seem to have a greater sense of agency compared to cases in which there is disfluency between the preceding mental states and the selected action. Here is some of the evidence in support of each type of cue.

The comparator model was not initially formulated as an account of the sense of agency. Instead, it was developed as an account of motor control [36] with roots in cybernetics and control theory in engineering. There are different versions of the model with variations in complexity, but here is the basic idea. Every time a motor command is issued in order to execute an action, there is also at the same time an “efference copy” (also called a “corollary discharge”) of the motor command generated and sent as input to a forward model. The forward model predicts the sensory consequences of the action command. This prediction brings a number of advantages in motor control. One of the most important advantages is that the prediction enables the system to make corrective adjustments more quickly due to the fact that the forward model generates predictions (thereby detecting the need for correction) faster than the sensorimotor feedback from the actual movements of the limb.

A classic bit of evidence dates back to Hermann von Helmholtz [37] who observed that gently moving the position of the eye by using one’s fingers causes a visual experience as if the entire visual scene shifts. When we move our eyes using ocular muscles, the forward model predicts the movement and the visual world does not shift. When we move the eye with the fingers, there is no such prediction.

A second line of evidence in support of the existence of a forward model is the attenuation of self-generated tactile sensations. In other words, touching oneself tends to generate a weaker subjective sensation than being touched by someone else with the same amount of force [38, 39]. This phenomenon can be explained by appeal to the forward model. The motor command, say, to touch the back of one’s left hand with the fingers of one’s right hand sends an efference copy of this command to the forward model. The forward model predicts the experience of a tactile sensation on the back of the left hand and this prediction attenuates the sensation itself because the sensation is expected. When someone else touches the back of one’s left hand, there is no such prediction by a forward model and the sensation is surprising; it is not attenuated. This account can be used to explain why we cannot tickle ourselves [66]. Interestingly, the attenuation of self-generated touch does not occur in individuals with symptoms associated with schizophrenia, a disorder widely thought to involve malfunction of the forward model [33, 40]. As the theory would predict, schizophrenics are able to tickle themselves [67].

The forward model makes sense as an obvious cue in the generation of the sense of agency. A match between internally predicted movement and actual movement is a strong indicator that the movement is self-generated. There has been a great deal of empirical research into the role of the forward model as a cue for the sense of agency, with a standard paradigm making use of a joystick [41] or finger motion controlling the movement of a symbol [42] or a virtual hand on a computer screen [43]. The motion on the screen can correspond to the actual motor movement, or it can deviate from the subject’s movement in various ways. The motion on the screen can be temporally and spatially congruent with the action of the subject, it can be systematically spatially distorted by, for instance, an angular bias, or there can be some temporal delay. Spatial and temporal distortion both reduce the sense of agency for actions [44, 45].

The forward model accounts for the sense of agency in these cases as follows. The motor command sent to the muscles to move the joystick (or one’s finger) is accompanied by the efference copy sent to the forward model which predicts the sensory outcome of the motor movement. The sensory outcome is perceived as motion on the screen. When the motion on the screen matches the anticipated outcome, subjects experience a sense of agency. When there is incongruency between the anticipated outcome and the actual outcome on the screen, the sense of agency is attenuated. Sufficient incongruency can annihilate the sense of agency altogether as indicated by the subject’s attributing the cause of the movement to another agent [45]. The main conclusion that we can draw from the comparator model is that the predictability of self-generated movements is a strong cue for the sense of agency. When actions are predictable, there is a match between prediction and action and this match underlies the sense of agency.
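The comparison between predicted and actual outcomes can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a model drawn from the cited studies: the function names, the identity forward model, the exponential mapping from prediction error to cue strength, and the `tolerance` parameter are all hypothetical. Only spatial (angular) distortion is modeled here; temporal delay is left out.

```python
import math


def forward_model(motor_command):
    """Predict cursor displacement from an efference copy of the motor command.

    Assumes, for illustration, an undistorted mapping: the cursor is expected
    to move exactly as the hand does.
    """
    dx, dy = motor_command
    return (dx, dy)


def rotate(vector, degrees):
    """Apply an angular bias to a displacement, as in spatial-distortion paradigms."""
    dx, dy = vector
    theta = math.radians(degrees)
    return (dx * math.cos(theta) - dy * math.sin(theta),
            dx * math.sin(theta) + dy * math.cos(theta))


def agency_cue(predicted, actual, tolerance=1.0):
    """Map prediction error to a cue strength in (0, 1].

    The exponential form and the tolerance scale are assumptions; the only
    property that matters here is that the cue weakens as mismatch grows.
    """
    error = math.dist(predicted, actual)
    return math.exp(-error / tolerance)


command = (1.0, 0.0)                 # move the joystick one unit to the right
prediction = forward_model(command)  # predicted motion on the screen

congruent = agency_cue(prediction, rotate(command, 0))   # screen matches movement
distorted = agency_cue(prediction, rotate(command, 60))  # 60-degree angular bias

print(congruent, distorted)
```

With no distortion the prediction error is zero and the cue is at its maximum; with the angular bias the error grows and the cue weakens, in line with the finding that spatial distortion reduces the sense of agency.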

The comparator model is the best known account of the sense of agency, but there is also evidence for another type of cue involved. The comparator model provides a cue for the sense of agency through proprioceptive feedback, which, because it is feedback, must occur retrospectively after the motor movement is executed. The other type of cue for the sense of agency occurs prospectively, prior to the motor movement itself. This other type of model of agency is known as the action selection model [46, 47]. Early support of prospective action selection cues can be found in Daniel Wegner’s interpretation of his “I Spy” experiments [19, 24], mentioned in "Illusions of Agency" above. According to Wegner, the illusion of agency is generated in the “I Spy” scenario due to the occurrence of a thought (due to auditory priming) prior to the perception of the action effect. In order for the thought to cue the sense of agency for the action, the thought must have priority, consistency, and exclusivity (see Footnote 3). Priority means that the thought must occur prior to the perceived action. Consistency means that the thought must be consistent with the perceived action – the content of the thought must correspond with the object of the action. Exclusivity means that there should be no other apparent causes of the action. By altering experimental conditions so that, for example, the priority condition is not met due to the timing of the cue relative to the action, the illusion of agency is lost ([24]: 489).

More recent empirical studies have prompted a refinement of these initial ideas about prospective action selection (see [49], for example). Valérian Chambon, Patrick Haggard, and colleagues have developed an action selection model according to which fluency or effortlessness of action is a cue for the sense of agency (see [50] for a review). The concept of fluency in cognition is a relatively new and promising area of research in cognitive neuroscience. Examples of factors determining fluency might include the font and contrast of the written word, phonetic and grammatical complexity, or the number of factors involved in making a decision [51].

Fluency of action selection has been incorporated into a number of studies with the use of unconscious priming. Here is an example from Wenke et al. [47], discussed in Chambon et al. [50]. The subject has the task of pressing a left or a right button as instructed by the display of an arrow pointing to the left or to the right. After the button press, there is a random delay and then a color appears on the display. The subject is then asked to evaluate the degree of control that they feel over the outcome of the color display. The fluency or disfluency is generated by an unconscious prime displayed prior to the consciously perceived arrow. On some trials, the subject is shown an unconscious prime that is compatible: an arrow that points in the same direction as the consciously perceived arrow. On other trials the unconscious prime is incompatible: an arrow pointing in the opposite direction from the consciously perceived arrow. Compatible primes are intended to generate fluency in action selection, while incompatible primes are intended to generate disfluency. As one might suspect, the feeling of control is higher with compatible primes and lower for incompatible primes. Chambon et al. conclude: “Consistently, our findings suggest that people may use the fluency (or ease) with which an action is selected as a good advance predictor of actual statistical control over the external environment” ([50]: 7).

In summarizing some of the main empirical results on the sense of agency, Chambon et al. make two general points relevant to our purposes here. First, they claim that the sense of agency is likely generated according to various cues, and that “Bayesian models of cue integration might be able to encompass these dynamic changes in cue weight” (ibid.). Second, they propose that the “results overall support the idea that agency is the ‘default’ assumption, which is only falsified, or reduced, when there is ‘sufficient’ evidence against it” (ibid., also see [23] for an early expression of this idea). I mention these two points here in order to illustrate the gap between our intuitive, pre-scientific conception of agency, on one hand, and the way in which agency is understood in cognitive neuroscience, on the other. While our intuitive conception naturally treats the sense of agency as a trustworthy guide to intentional action, the picture we receive here is quite different. On the picture here, the mind, or brain, must continuously “decide” whether to generate the sense of agency for bodily movements based on cues that are weighted probabilistically. Importantly, if Chambon et al. are correct that generating the sense of agency is the default assumption, then it is reasonable to think that it may not be difficult to create conditions that generate a false sense of agency: one need only maximize the likelihood that the brain makes the default assumption. Now I will demonstrate that our electronic devices are designed to create those conditions.
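The idea of probabilistic cue weighting under a default assumption of agency can be made concrete with a toy calculation. The sketch below is not the model proposed by Chambon et al.; it assumes, for illustration only, that the cues are independent and that each can be summarized as a likelihood ratio. The function name `agency_posterior` and all numerical values are hypothetical.

```python
import math


def agency_posterior(prior_agency, cue_likelihood_ratios):
    """Posterior probability of "I caused this" after combining cues.

    Each ratio is P(cue | self-caused) / P(cue | not self-caused).
    Cues are assumed independent, so their log-ratios simply add.
    """
    log_odds = math.log(prior_agency / (1.0 - prior_agency))
    for ratio in cue_likelihood_ratios:
        log_odds += math.log(ratio)
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical values, chosen only to illustrate the argument.
DEFAULT_PRIOR = 0.9  # agency treated as the "default" assumption

# Fluent, predictable interaction: both cues favor agency.
fluent_and_predictable = agency_posterior(DEFAULT_PRIOR, [2.0, 2.0])

# One mildly disfluent cue: the default still wins.
mildly_disfluent = agency_posterior(DEFAULT_PRIOR, [0.5])

# Only strong counter-evidence overturns the default.
strong_counterevidence = agency_posterior(DEFAULT_PRIOR, [0.1, 0.1])
```

With a high prior, weak evidence against agency leaves the posterior above one half, and only strong counter-evidence pushes it below. This is the sense in which a default assumption makes a false sense of agency easy to produce.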

Agency by Design

In over two decades of empirical studies, researchers have identified two kinds of cues that generate the sense of agency: predictability and fluency. In the guidelines from both Apple and Microsoft for the design of the user interface (UI) for apps, presented below, both companies emphasize that the UI should feature two properties: predictability and fluency. These two companies explicitly advise creators to design apps according to the principles that are known to cue the sense of agency.

Here are some of the relevant passages taken from the UI design guidelines from the two technology giants. Begin with Apple:

An app can make people feel like they’re in control by keeping interactive elements familiar and predictable... (emphasis added)

A consistent app implements familiar standards and paradigms by using system-provided interface elements, well-known icons, standard text styles, and uniform terminology. The app incorporates features and behaviors in ways people expect.Footnote 4

In Microsoft’s “Windows Dev Center,” there is an article titled “The Fluent Design System for Windows app creators.” Here are some pointers from this article:

An experience feels intuitive when it behaves the way the user expects it to. By using established controls and patterns and taking advantage of platform support for accessibility and globalization, you create an effortless experience…

Fluent experiences use controls and patterns consistently, so they behave in ways the user has learned to expect.Footnote 5

Throughout the guidelines for both companies, the recurring terms describing a well-designed UI include: predictable, expected, familiar, fluent, effortless, and intuitive. These are all different ways of describing the features that cue the sense of agency. It is no secret: our devices are designed to make us feel as if we are in control.

To be clear, I am not suggesting that these strategies are formulated by malicious managers with the deliberate intention of creating an illusion of agency. In defense of these two corporations, there are other justifications for these design features apart from cueing the sense of agency, such as, most obviously, user satisfaction. Nobody likes to use an annoying app. But even without a mens rea in Silicon Valley, the outcome remains the same: the design features on apps today maximize the likelihood that users will feel a sense of agency while engaged with the device.

These design features are embraced by Apple and Microsoft but they are also well-established in the educational literature on human–computer interaction, where the empowerment of the human user is more of an explicit commitment. For example, one college textbook on UI design lists eight golden rules. The seventh rule is that the UI should “support an internal locus of control” because users “strongly desire the sense that they are in charge of the system” [52]. In a recent article on the sense of agency in the human–computer interface, this rule is cited with approval as the authors suggest that the incorporation of techniques from cognitive neuroscience “will encourage the HCI researcher to consider the sense of agency as a quantifiable experience in future research” ([53]: 1). These authors focus on the important question of maintaining feelings of user responsibility by the use of cues that generate the sense of agency. Along with user responsibility, future research might also address closely related questions about the illusion of agency in the human–computer interface, as I indicate below in "Consequences and Solutions".

No Supervision

Let us now bring together the various points made so far. Awareness of the device in one’s personal space automatically triggers various schemata along the horizontal stream that are involved with using the device (see "The Supervisory-Inhibition Model of Action"). When those schemata activate the motor routine, we engage with our device and are presented with affordances in the digital environment that continue to trigger motor responses. The device can deliver pleasing stimuli from cyberspace without any particular intention from the user, which means that we can have sustained engagement without any need for intentional supervision from the vertical thread. Since the sense of agency is fragile (see "Illusions of Agency") and the device is designed to generate cues for the sense of agency (see "Two Mechanisms for Generating the Sense of Agency" and "Agency by Design"), this sort of engagement is accompanied by an illusory sense of agency.

Importantly, the scenario that I have just sketched is one in which there is no intentional supervision. Thus, my suggestion is that our sense of agency is likely to be illusory when we engage our devices without a particular intention or goal, or, in terms of the Norman and Shallice model, without supervision from the vertical thread. To be clear, there are surely many instances of engagement with our devices that do involve particular intentions. One might engage with one’s phone only in order to access some particular bit of information, such as the weather forecast, and then disengage once the information is obtained. There is no reason to suspect an illusion of agency in those cases. But the risk of illusory agency does arise in the times that we engage without specific intentions. In those times, our actions may be driven largely by the device yet still accompanied by a sense of agency. In this section of the article, I will present some of the research suggesting that users engage their devices quite frequently without particular intentions or goals, without vertical supervision. This research indicates that the conditions for the occurrence of an illusory sense of agency are not uncommon.

Most of the evidence that we have about the motivations for engaging with mobile devices has come from self-report using questionnaires. These questionnaires tend to focus on problematic or addictive use of the smartphone, but there is currently no consensus definition for such use in the literature – a recent review found 78 different scales that have been used to identify problematic use over the past 13 years [54]. Despite methodological differences, a clear result that emerges across the studies is that users most often do not engage their devices with particular goals in mind.

The most common reason given for heavy use of smartphones is to engage the device in hope of some sort of emotional gain [55]. The emotional gain typically takes the form of alleviating states with negative valence such as fear of missing out or FoMO [56–58], boredom [59], or loneliness [56]. One study found escapism to be a main motivation for problematic use [60]. Another common way of using the smartphone without any particular intention is found in the practice of “phubbing,” which is “the act of snubbing others in social interactions and instead focusing on one’s smartphone” [69]. In all of these cases, the reason given for using the device is not to achieve some particular goal in the particular way that one uses the device. Instead, the reason is to change or alleviate or avoid some undesirable state in real life, as it were. Since there is no specific goal in using the device, just the general goals listed above that can be met in any number of ways, there seems to be little requirement for intentional supervision.

One shortcoming of standard methodologies such as the self-report questionnaire is that they lack insight into the user’s context when engaging the device, into what the user may have been doing prior to and during engagement. Heitmayer and Lahlou [61] have deployed a method of Subjective Evidence-Based Ethnography (SEBE) in order to overcome this shortcoming as well as to gain additional detail from the participants about their intentions. This approach involves collecting first-person video of user engagement with the device in their daily lives through a small camera mounted on eyeglasses. The participants are interviewed about their engagement based on the video and the interview is followed by qualitative analysis of the data. In their study, Heitmayer and Lahlou found participants to be surprised at the frequency with which they pick up their phones. Importantly for our purposes here, they also found that users engaged with their phones out of habit and not with specific intentions in mind.

Heitmayer and Lahlou summarize a main finding as follows:

Overall, picking up the phone seems to be widely automatic and habitualized, with participants often ending up with their phone in hand without intending to do so, or longer than they had originally intended. In this context, all but two of our [37] participants mentioned that they felt they spent too much time on their phones. ([61]: 5)

This summary is based upon three findings. First, participants demonstrate habitual engagement with their devices, with one participant reporting that grabbing the smartphone feels as automatic as covering one’s mouth when coughing (ibid.). Second, even when participants do have a specific intention in engaging with the phone, most of the time they end up disregarding or even forgetting this original intention. Instead, the participants find themselves caught “in a loop” in which they spend much more time engaged than originally intended. As one participant reports:

Probably wanted to check the weather or something like this and I usually go on Instagram or Facebook. I pick it up for something, then I forget what I wanted to do and check all the things, my routine, and then I remember, ah yeah, I wanted to check the weather. (ibid.)

This report suggests a lack of intentional supervision by the participant. The third finding is that nearly all of the participants exhibited “fidgeting” behavior with their phones such as opening and closing apps without any reason at all. Some reported “that fidgeting with apps on the touchscreen felt relaxing or therapeutic” (ibid.). Taken together, these three results all reveal patterns of engagement with smartphones that lack specific intentions, and that therefore lack inhibitory supervision from the vertical thread.Footnote 6 Such patterns of behavior fit well with the claim that our devices have been intentionally designed to reinforce continuous repeated engagement [63]. The long-term repeated engagement found by Heitmayer and Lahlou is especially concerning for illusions of agency in light of evidence that features of the sense of agency can be transferred from voluntary actions to involuntary movements through associative learning [64].

The suggestion that there is a lack of inhibitory control in excessive smartphone users also finds support from evidence at the neurophysiological level. Chen et al. [72] recorded event-related potentials (ERPs) in subjects during a Go/NoGo task. The task requires the inhibition of actions (pressing a button) in response to visual cues. They found that subjects who used smartphones excessively showed a neural response during this task that suggests “general deficits in the early stage of inhibitory control” ([72]: 6).

Consider the sorts of behavior described here in comparison with Environmental Dependency Syndrome, encountered above in "Illusions of Agency." Patients with Environmental Dependency Syndrome are unable to form intentions due to frontal lobe damage and they act without attentional supervision from the vertical thread. The patients lack genuine agency because they lack the ability to form intentions that depart from the affordances of their immediate environment. Perhaps we might conceptualize the patterns of habitual unsupervised smartphone use that have been presented here as a form of Digital Environmental Dependency Syndrome (DEDS). When we engage devices without the sorts of intentions that can inhibit acting upon affordances, then our swipes and clicks are merely reactions to the digital environment. Without the right sort of intentions, such behavior is not genuine agency, although it is nonetheless accompanied by the sense of agency.

Consequences and Solutions

I have made the case that illusory agency is not an uncommon occurrence for at least some subset of the billions of smartphone users on the planet today. Now I will conclude the article with two questions that are raised by my claim. I will offer initial answers to these questions but must leave a full discussion for future work.

The first question is: so what? Why should anyone care about illusory agency during smartphone use? Someone inclined not to care about this result might point out that users often have the general intention of engaging with their devices with no particular intention, of “mindlessly scrolling” deliberately as a form of recreation. Perhaps we simply enjoy the illusion of agency that smartphones offer. What is the problem with that?

While acknowledging the value of personal liberty to engage the device “mindlessly,” there are strong reasons to care about this phenomenon, reasons that arise out of a consideration of the possible consequences of widespread illusions of agency as well as the threat to human dignity raised by the illusion itself. Begin by considering the consequences. Recall the case made by Zuboff [1] that we are being deliberately manipulated through our devices. With illusions of agency, that manipulation is even more dangerous because the illusion masks the extent of external control. If we combine Zuboff’s claims with the argument of this article, then we have a situation in which a large subset of the billions of smartphone users on the planet are being actively manipulated with precision while retaining the feeling that they are in complete control. A first step in resisting manipulation is to realize that one is the target of manipulation. Illusions of agency can prevent users from having that realization. The consequence of such a scenario is a decreased ability for users to resist the various behavioral modification projects targeting them on a massive scale – projects that might range from nudges towards purchasing a sandwich to nudges towards taking up arms.

In addition to the behavioral consequences of widespread illusions of agency, one might also raise a concern by appealing to the dignity and well-being of the user. Sustained illusions of agency prevent personal growth and self-knowledge. A consequence of behavioral manipulation might be that one is being used as a puppet, but illusions of agency add the further trouble that one is incapable of realizing that one is being used as a puppet. The illusion itself undermines the human project of self-understanding, a project that ought to be facilitated, not impaired.

The second question that arises from my main thesis is as follows: what should we do about it? There are at least three different domains in which we might respond to the likelihood of illusions of agency in human–computer interaction: research, education, and regulation. Let us start with research. The case that I have made here raises three general groups of questions that can and should be investigated through empirical research. Such investigation may open a path to developing software applications that mitigate illusions of agency. A first group of questions has to do with detecting occurrences of the illusion. For instance, apart from a first-person report about a lack of intention, are there measures that reveal the user to be experiencing an illusion of agency? That is, are there measures that correlate strongly with self-reports of lack of supervisory inhibition while experiencing a sense of agency? Candidates here might be the types of apps being used and the duration of their use, the length of the session during which the user is engaged with the device, or perhaps even the fine details of the finger motions that serve as input. Biomarkers such as eye-tracking, electroencephalogram (EEG), and functional near-infrared spectroscopy (fNIRS) may be useful as well. This line of research might offer a way to estimate the prevalence of illusions of agency among users. The research may also support design of an app to detect that the user is likely to be suffering an illusion of agency. Such an app could notify users of this likelihood when it occurs and perhaps nudge them to consider their intention while engaging the device. Parents could set this app to suspend the device when children are experiencing illusions of agency.
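To indicate what the detection app proposed above might look like in its simplest form, here is a minimal sketch. It is speculative: whether any of these measures actually tracks a lack of supervisory inhibition is exactly the open empirical question raised above. The `Session` fields, the two thresholds, and the functions `proxy_flags` and `should_nudge` are all hypothetical and have not been validated.

```python
from dataclasses import dataclass

# Hypothetical thresholds; real values would have to come from the
# empirical research called for above.
LONG_SESSION_MIN = 20.0   # minutes of continuous engagement
HIGH_SWITCH_RATE = 1.0    # app switches per minute


@dataclass
class Session:
    duration_min: float   # length of the engagement session
    app_switches: int     # how often the user changed apps
    stated_goal: bool     # did the user report a specific goal?


def proxy_flags(session: Session) -> list:
    """Return one boolean per candidate proxy for unsupervised use."""
    if session.duration_min > 0:
        switch_rate = session.app_switches / session.duration_min
    else:
        switch_rate = 0.0
    return [
        session.duration_min >= LONG_SESSION_MIN,
        switch_rate >= HIGH_SWITCH_RATE,
        not session.stated_goal,
    ]


def should_nudge(session: Session, min_flags: int = 2) -> bool:
    """Suggest a prompt about intentions when enough proxies are raised."""
    return sum(proxy_flags(session)) >= min_flags
```

For example, `should_nudge(Session(45.0, 60, False))` raises all three flags, while a two-minute weather check with a stated goal raises none. A deployed system would presumably replace these hand-set rules with measures validated against self-report.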

A second group of questions for research has to do with the actions that users perform while under the illusion of agency. Without intentional supervision, do users tend to engage mostly in more “passive” activities such as consuming news or information from social media? Or, alternatively, are there instances of more active engagement such as making a significant purchase or (re-)posting some message with social impact? In connection with the article by Limerick et al. [53] cited above, do we find a lower sense of responsibility for actions performed while experiencing the illusion compared against actions performed under clear and conscious supervisory inhibition? This line of research will be important in order to gauge the extent to which illusions of agency pose a threat to human autonomy.

A third group of questions surrounds the newly proposed Digital Environmental Dependency Syndrome (DEDS). Is it fruitful to characterize the use habits of some individuals as exhibiting this syndrome? If so, is it a chronic condition for specific individuals or does DEDS perhaps manifest transiently across the general population? Are there factors (emotional, environmental, choice of hardware or software) that place individuals at higher risk for DEDS? If we can identify chronic cases of DEDS, what are some initial strategies for treatment?

In conjunction with responses that involve further research, the second domain in which one might respond to the illusions of agency in human–computer interaction is education, both for the general public and in educational settings. One important message for the general public is that illusions of agency are possible and are likely to be generated by smartphones in virtue of the ways in which apps are designed. In the context of formal education, it will be important to teach students about the possibility of illusions of agency in technology and to equip them with the skills to avoid them. Students should learn that engaging with their devices without clear intentions may give the control of their actions over to the device.

The third domain in which one might respond to the claim of this article is through state regulation. Of course, the tension between regulation and technological innovation is currently a large, controversial, and important topic of discussion [65]. While I will not engage with this discussion here, I do suggest that illusions of agency should be included for consideration in the regulatory context of existing technology, such as smartphones, as well as emerging technology, such as immersive social virtual worlds. Technology regulators, leaders, and activists – indeed, all of us – should reflect upon the likelihood that a large swath of the human population today may be suffering systematic illusions about the locus of control for their own actions during much of their waking lives.