From the 1940s onward, Harold Garfinkel developed a sociological attitude known as ethnomethodology that is suited to the study of interaction. Today the ethnomethodological analysis of interaction is largely associated with ethnomethodological ethnography (Dingwall 1981; Duck 2015), conversation analysis (Have 1998; Psathas 1995; Schegloff 2007) and video-based studies of interaction that explore the organization of vocal, visual and bodily action (see Deppermann 2013; Goodwin 2017; Heath 2012; Meyer and Streeck 2017). The ethnomethodological analysis of interaction reveals how participants display their orientation to the situation they are in, and how each action is related to each prior and each next action.

The ethnomethodological analysis of interaction therefore is concerned with unpacking ‘interaction’ by revealing the organization of action. Ethnomethodologists characterize this organization as “sequential”. “Sequentiality” thereby is not considered a social-scientific concept but a principle underlying the ways in which actors themselves organize their actions. Ethnomethodologists analyze action by inspecting how actors orient to an immediately prior action and how their own action provides the context for each next action (see Heritage 1984; Schegloff 1968). They thus work to reveal how the actors themselves produce their actions in a particular way and at a particular moment in light of their analysis of each other’s actions.

This article begins with Garfinkel’s (2006[1948]) examination of contemporary sociology and philosophy that led him to conceive practical action as an “experiment in miniature”. It then shows how this concept of practical action can be seen as one of the starting-points for the development of the concept of “sequentiality” and the emergence of the ethnomethodological analysis of interaction in the 1960s. After having provided the intellectual background to the ethnomethodological analysis of interaction, I will examine two fragments of interaction audio-/video-recorded in a museum and an optometric consultation to exemplify the key concerns of this kind of analysis. The article will conclude with a discussion of current debates in ethnomethodology and conversation analysis.

Action as Social Practice

In Garfinkel’s Studies in Ethnomethodology (1967) the term ‘interaction’ features 31 times. Yet, Garfinkel neither develops a theoretical concept of ‘interaction’ nor undertakes detailed studies of ‘interaction’. Instead, the book comprises a variety of studies in which Garfinkel develops some of the key principles of the sociological attitude that has become known as ‘ethnomethodology’. He arrived at these principles, which also underlie the ethnomethodological analysis of interaction, from his critical reading of contemporary scholars in sociology, pragmatist philosophy, phenomenology, the information sciences and others (cf. Rawls 2002, 2006, 2008). This engagement with contemporary philosophy and sociology becomes apparent in Garfinkel’s early writings, such as in Seeing Sociologically (Garfinkel 2006/1948), a manuscript he produced in the 1940s.

In this early manuscript Garfinkel examines contemporary sociological discussions as well as Alfred Schütz’s social phenomenology. His analysis is particularly concerned with concepts of the relationship between actor and situation, with the nature and origin of action and with methodological questions. This critical examination of the intellectual debates of his time provides Garfinkel with the basis for the later development of ethnomethodology. It shows, for example, his interpretations of pragmatism, Parsons’ voluntaristic theory of action and Schütz’s social phenomenology.


In the 1940s Garfinkel explored the relationship between actor and situation. At the time, contemporary social science was pre-occupied with behaviorist concepts that use stimulus-response models to explain behavior by considering it as a response to events in the environment; each action generates a response in the environment that engenders another action, and so forth. The resulting stimulus-response chains register with the actor, a process that behavioral scientists sometimes call ‘learning’.

The pragmatist philosopher Dewey (1896) criticized the behaviorist stimulus-response model in his article ‘The Reflex Arc Concept in Psychology’. He argued that experience does not distinguish between stimulus and response; rather, experience is a continuous process that arises in and through the actor’s practical action. Through their analysis the pragmatists correct the behaviorist position by considering the relationship between actor and situation as a reflexive process. They consider the actor as an active being who explores the world (Joas and Knöbl 2009). Through their actions in the world actors develop habits and routines that are effective until they meet ‘resistance’ (Mead 1932a). Resistance creates doubt in the habitual world and engenders reflection and thinking for the development of creative solutions that allow them to continue with the action (Emirbayer and Maynard 2011).

In this view, actor and situation are not separated but interwoven with each other through practical action. An object is constituted moment-by-moment, in, and through, practical action. Thus, object and action become one “collapsed act” (Mead 1938). Mead (1932b) illustrates the constitution of objects by describing how a book is constituted moment-by-moment as the actor notices it and then walks toward and grasps it. In this process of practical action the object poses resistance for the actor. The actor in turn needs to respond to this resistance through actions that again elicit further resistance from the object, and so forth. Thus, as soon as the actor notices an object it becomes intertwined with the action as the object’s resistance shapes each next action, and in turn the object becomes constituted in a particular way through each action.

Resistance is not limited to physical things but also applies to other actors. When an actor meets another they pose resistance to each other and challenge each other’s habitualized trajectory of action. This resistance is progressively overcome by virtue of practical action through which actors engage with each other in interaction and communication (Mead 1926). In interaction actors take the perspective of the other and see themselves from the other’s point of view. Thus, through their actions actors not only constitute the other in a particular way but they also develop a sense of self that is often described as ‘identity’ (see Emirbayer and Maynard 2011).

Some of the arguments made by pragmatists closely relate to Garfinkel’s (2006[1948]) development of a novel sociological attitude in the 1940s. Like the pragmatists he considers action and experience as reflexively interrelated and considers actor and situation as intertwined with each other. It is the task of sociologists to analyze how actors constitute situations through practical action. For such sociological research it is necessary to conduct empirical studies. Unlike William James, who promoted a radical empiricism with a focus on the individual as experiencing subject, Garfinkel argues for empirical research that examines the intertwining of actor and situation.

Parsons’ Voluntaristic Theory of Action

When Garfinkel joined the Department of Social Relations at Harvard, Parsons had begun to assemble around him an impressive number of scholars from a range of social-scientific disciplines who, he hoped, would contribute to the development of a general social theory (Vidich 2000). A central aim of Parsons’ program of research was to provide sociology as a discipline with an answer to one of its foundational questions: how is ‘social order’ possible? To address this question Parsons created a novel concept of the relationship between actor and situation. At the center of this concept is what Parsons (1968/1937) calls the “unit act”. It consists of the actor, her/his goal in the situation and the actor’s orientation to the situation. The actor needs to select means that will help her/him to achieve her/his goal in the situation (Parsons 1952; Parsons 2010).

Parsons’ “voluntaristic theory of action” is designed to explain the selection of goals and means as well as the possibility of social order in light of diverging individual goals and means. The purpose of the theory therefore is to address the implicit tension between subjective orientations to situations and society. Parsons (1951) introduces the assumption that actors’ orientation to situations is shaped by a system of norms and values that they have acquired through socialization, education and communication.

The socially shared system of norms and values is important for the organization and coordination of action. Based on socially shared values and norms actors can expect others to conduct themselves in certain ways. And they can assume that when they meet others those actors will have very similar expectations towards them and their actions. For Parsons therefore social order results from mutually shared expectations grounded in a system of norms and values actors have internalized in the course of their life.

According to Parsons sociological descriptions of the relationship between actor and situation are produced from the perspective of a (social) scientist. Social scientists therefore need analytic tools and techniques that allow them to produce scientific propositions that allow for historic and intercultural comparisons. The ‘pattern variables’ and the ‘AGIL’ are analytic schemata that Parsons developed for sociologists to use when observing and describing the social world (Parsons and Shils 1951). These schemata are designed to help sociologists to produce generic propositions from their research that can be compared across time and cultures.

For Parsons, therefore, social order is a theoretical problem. His theory of social action includes two perspectives on social order. First, he provides a methodological perspective that prioritizes the social-scientist’s methods of making social order observable over the social practices through which participants in the social world achieve order. And second, Parsons argues that social order is possible because participants internalize and orient to society’s system of norms and values. Garfinkel, who was a keen admirer of Parsons’ work, disagreed with his doctoral advisor’s focus on social-scientific theories and methods. He argued that the perspective put forward by Parsons does not help sociologists to understand (Verstehen) actors’ orientation to the social world. For Garfinkel social order is a practical problem for actors, and it arises from participants’ ongoing production of actions. Sociologists therefore need to explore the practical organization of action to reveal how social order is accomplished in concrete situations, moment-by-moment. In Garfinkel’s view participants’ actions are not prefigured or even determined by norms and values they have internalized; instead, norms and values are resources they orient to and use to organize their actions. The kind of research proposed by Garfinkel requires sociologists to adopt the actor’s perspective. He therefore turns to the social phenomenology developed by Alfred Schütz, as this provides him with theoretical and methodological concepts to produce an understanding (Verstehen) of actors’ perspective on the social world.

Schütz’s Social Phenomenology

While working toward his PhD at Harvard Garfinkel attended Alfred Schütz’s seminars in New York and exchanged letters with him (Barber 2004; Psathas 2009). At the time, Schütz was known for his development of a social phenomenology based on his analysis of Edmund Husserl’s phenomenology. In his writings Schütz introduced the idea of “cognitive style” as an attitude or orientation people take toward the world and argued that (social) scientists apply a theoretical attitude to the world they observe while everyday actors adopt a pragmatic attitude to deal with practical problems at hand. From this distinction between the scientific attitude and the everyday attitude Schütz arrived at a fundamental methodological question, i.e., how can sociologists investigate the social world from the perspective of the actors? He argued for the need of a change of perspectives to allow the sociologist to comprehend the cognitive style of the actor in the world (Schütz 1953, 1967).

Schütz’s argument implies a critique of Parsons’ approach to the methodology of the social sciences that considers the scientific perspective to be superior to the perspective of the everyday actor (Grathoff 1978). Schütz suggests that sociologists’ task is to see the world in the way in which actors themselves experience it. Rather than developing analytic schemata for the scientific observation of action Schütz (1970) suggests exploring the schemata or typologies that actors themselves use to interpret the world. As Schütz argues these typologies are not subjective and idiosyncratic schemata lodged in actors’ brains but socially shared structures (Schütz and Luckmann 1985). For the sharing and distribution of these structures communication is critical.

Communication and interaction arise whenever actors meet. In such situations actors mutually assume that in principle they experience the world in the same way as the other. According to Schütz this assumption is based on two “idealizations”: (1.) the idealization that in principle actors’ geographical standpoints are interchangeable, and (2.) the idealization that in the situation at hand idiosyncratic personal differences in terms of the biography and interests do not impact the relevance the situation has for the actors (Schütz 1967). Schütz denotes these two “idealizations” as “reciprocity of perspectives” or “intersubjectivity”.

Garfinkel used the critical analysis of Schütz’s discussion of intersubjectivity and of his discussion of the relationship between (social-)science and the everyday to articulate his concerns about Parsons’ social-scientific approach to studying the social world (cf. Turowetz et al. 2016). From this critical examination of contemporary sociology and philosophy Garfinkel developed his sociological attitude that forms the basis for a sociology that conceives social order as a practical concern for participants. It involves a “radicalization” (Eberle 1984) of Schütz’s position that in Garfinkel’s (1952, 2006) view implied a cognitive bias and maintained the distinction between a sociological and an everyday perspective. Ethnomethodology as a sociological attitude in its own right involves the sociologist in adopting the actor’s perspective and making sense of the social world as it is produced and experienced by the actor (Garfinkel 1952).

Respecifying the Relationship Between ‘Actor’ and Situation

As Garfinkel developed his own sociological attitude he considered how contemporary sociological theories and theories in other disciplines conceive the actor. He (2008a), for example, examined theories of information and criticized game theory’s construct of the “rational actor” for being distant from actors’ experience of the everyday. The construct of the “rational actor” also is of great importance in sociology informed by Max Weber’s (1978) sociology and theory of action. Parsons (1968/1937) considers “rational action” as the most important of Weber’s ideal types and likens an actor’s rational orientation to the world to the attitude that (social) scientists adopt to it. In his book The Structure of Social Action Parsons (1968[1937]) illustrates the notion of “rational action” by describing the situation of an actor who asks someone for the quickest way from Harvard Square in Cambridge to South Station in Boston; the advice is to use the underground because that is the quickest way to traverse the distance between the two stations. The foundations of this advice can be checked and verified according to people’s mundane experiences of travelling between Cambridge and Boston. These mundane experiences underpin everyday rules that people consider to be the basis for ‘rational actions’. They “are strictly comparable to scientific laws, are indeed themselves entirely adequate scientific laws for the purposes for which they are used” (Parsons 1968: 625). Thus, Parsons implies that the rational actor will follow this law and take the underground.

Garfinkel (1952) critically examines Parsons’ notion of the rational actor and asks how it can be explained that some people do not take the underground but walk the distance between the two stations. Parsons would explain such a decision by suggesting that the actor probably either did not have all the information available to her/him or s/he has decided to act this way for other irrational reasons. Parsons argued that the scientist’s view is superior to that of the actor who has made an “irrational” decision (vom Lehn 2014). The notion of rationality underlying Parsons’ theory has been criticized by Alfred Schütz who in his article on “Common-Sense and Scientific Interpretation of Human Action” argued that on the common-sense level “actions are at best partially rational and that rationality has many degrees” (1953: 26). Garfinkel draws on Schütz’s arguments and challenges models of the “rational actor” in a book chapter that has become known as the ‘Trust-Paper’ (Garfinkel 1963). In this chapter, he describes tutorial exercises that, for example, required students to visit shops and haggle with sales assistants over the price given on price tags. As it turns out many students managed to purchase goods for prices lower than advertised. Later, in his Studies Garfinkel (1967; see also Lynch 2012) introduces the term “cultural dope” for actors who blindly accept formal rules given by organizations.

The tutorial exercises provide Garfinkel with evidence to suggest that the social order is a practical and local achievement rather than a theoretical framework defined by norms, rules and values that people acquire through upbringing, socialization and education. The observations from the exercises resonate well with Garfinkel’s (1967, 2006) argument that situations do not have particular characteristics that prefigure how participants act within them. Instead he suggests that situations obtain their characteristics by virtue of the social practices through which participants orient to the situation in a particular way. It therefore is not the material and visual environment in which a person acts and the uniform he wears that make him a security guard; he becomes a security guard by virtue of his actions and by virtue of the ways in which others orient to his actions (see Garfinkel 2006[1948]). Unlike in contemporary “role theory” (see Linton 1936; Mead 1934), role is not a property of an actor but a practical achievement. It changes in light of her/his practices and in relationship to how other participants orient to them. Similarly, a situation becomes observable and is treated as the meeting of a jury when participants produce practices that others orient and respond to as practices of a jury (Garfinkel 1967). These observations suggest that the characteristics of situations and therewith the “social order” of a situation are not predetermined and stable but are practical accomplishments. The “social order” always is the local order that is produced in and through participants’ actions. Ethnomethodologists therefore are concerned with the organization of these actions or practices.

The Organization of Social Practice

When examining the social world and producing sociological descriptions Garfinkel’s contemporaries largely adopted a (social-)scientific perspective that requires sociologists to produce historically and interculturally comparable descriptions of the social world. They therefore followed Parsons’ lead in using scientific schemata to make visible order in the social world. This model for social-scientific research that ethnomethodologists at times describe as “formal-analytic sociology” still pervades current sociology. Garfinkel as well as his students and colleagues (see Garfinkel 2002; Garfinkel and Sacks 1970; Lynch 1998; Watson 2008) have criticized sociology for relying on this kind of methodological approach that leads to descriptions that have nothing in common with the actors’ experience of the social world. Garfinkel therefore highlights the need for descriptions that are “uniquely adequate” in the way in which they capture the actor’s experience. For ethnomethodologists, the adequate description of the social world requires sociologists to practically adopt the perspective of the actor and immerse themselves within the actor’s social world.

Each action thereby is produced in a particular moment and designed in a particular way. With reference to contemporary developments in linguistics (Bar-Hillel 1954) Garfinkel (1967) called this property of action ‘indexicality’. When arguing that each action is ‘indexical’ Garfinkel included within these actions the propositions that (social-)scientists produce when describing the social world. Unsurprisingly contemporary sociologists received this argument about their work with disdain as they saw the integrity of their own work undermined. They ridiculed Garfinkel and ethnomethodology and argued it was not producing any observations of relevance to sociology (see Coser 1975; Gellner 1975).

This rejection of ethnomethodology by sociologists originates in their pursuit of objective, historically and culturally comparable descriptions of society. Garfinkel and ethnomethodological researchers, however, argue that indexical actions embody a local order, and with their studies they seek to reveal the characteristics of this local order. Through their research they produce detailed descriptions of the organization of actions and thus reveal the “ethnomethods” through which indexical actions exhibit orderliness.

The idea of actions exhibiting orderliness is grounded in Garfinkel’s (1967) argument that practical actions are “observable-and-reportable”, i.e., accountable, and therefore intelligible in their relationship to the moment in which they are produced. Accountability ‘runs’ in the background as an ordering principle of action. It is ongoingly produced although mostly taken for granted and only referred to when actors are asked to provide an account for their actions. “Garfinkel concluded that shared methods of reasoning generate continuously updated implicit understandings of what is happening in social contexts – a ‘running index,’ as it were, of what is happening in a social event” (Heritage 1988: 128).

Garfinkel describes the relationship between actions and the moment or context in which they are produced as “reflexive”. It is through reflexivity that actors are able to make sense of indexical actions. In this sense reflexivity is the solution to the problem posed by the indexicality of action; indexical actions are bestowed with meaning through their reflexive relationship with the context in which they are produced (Heap 1980). Meaning is not encapsulated within actions, but it arises from the relationship of actions to the context in which they are produced. Heritage (1984: 242) therefore describes action as “doubly contextual in being both context-shaped and context-renewing”. The ethnomethodological analysis of interaction is concerned with revealing the organization of action to explore how participants constitute meaning moment-by-moment.

Sequential Organization of Action: Interaction

‘Interaction’ [Footnote 1] thereby is conceived as the retrospective and prospective orientation of action. For Garfinkel (2006/1948) the question regarding interaction is how actors align their actions and produce a sense of a co-orientation to the situation. With the focus on the relationship between particular actions and their production, Garfinkel radicalizes Schütz’s (1967) suggestion that intersubjectivity is based on participants mutually making assumptions about each other’s orientation to the situation. Garfinkel instead argues that participants’ orientation to the situation is observable for each other by virtue of the production of their actions. Therefore, intersubjectivity is not lodged in people’s heads but it is a practical accomplishment.

Already in the 1940s Garfinkel developed a basic concept of the organization of practical actions through which intersubjectivity is achieved. Since the 1960s, this organization has been described as “sequentiality” (see Schegloff 1968). In Seeing Sociologically Garfinkel (2006[1948]) characterizes actions as “working acts” and argues that each action is an “experiment in miniature” that tests the hypothesis a participant has about a co-participant’s response to her/his action (Garfinkel 2006/1948: 180). [Footnote 2] His concept of the working act implies that the situation is progressively created as actors mutually and continually test each other’s orientation to it. Each action therefore is at the same time a display of one’s expectation toward the other’s response and an embodiment of the response to the co-participant’s working act. From this recursive relationship of actions results a concerted experience of a situation that is constituted through the participants’ actions. As they act in each other’s presence and mutually monitor each other’s actions they share the same “vivid presence,” and can later say: “[W]e experienced this occurrence together” (Garfinkel 2006: 181).

Garfinkel’s (2006) analysis suggests that sequentiality is a characteristic of the organization of action, and not a sociological concept. Actors themselves organize their actions sequentially, and ethnomethodologists strive to uncover the sequential organization through their analysis. Thereby, each action is relevant for the production of activities as long as the actors themselves display an orientation to the actions.

Local Order and Social Practice

With sequentiality Garfinkel provides a way of capturing the organization of actions from participants’ point of view. It markedly differs from Goffman’s (1983) notion of the “interaction order” (cf. Rawls 1987, 1989, 2015). Throughout his work Goffman elaborated on the techniques that actors use to reconfigure their relationship to the situation. He however glosses the details of the deployment of these techniques by describing them in a generic fashion. For example, he (1961) argues that an actor shields her/his ‘self’ from the constraints placed on her/him by the situation. And he (1961) shows how actors maintain some scope for individuality despite their confinement in a total institution. He also explains some of the techniques actors deploy to manage the impression they give of themselves (Goffman 1971a/1959) and to organize their relationships with others in public places (Goffman 1971b).

Goffman’s analyses reveal that the interaction order is a social order in itself that arises from the organization of social practices. The properties of the interaction order are not defined by institutional rules and regulations, but the interaction order has its own principles that remain unchallenged and unquestioned unless actors’ assumptions about others’ attitudes toward the situation are put in doubt. Goffman (1974) discusses such challenges in theoretical terms in Frame Analysis. His arguments stand in contrast to Garfinkel’s (1967) observations of the practices through which the transsexual Agnes deals with the uncertainties about others’ orientation to her. For example, when Agnes meets others, she often cannot be sure how they orient to her, which can lead to the emergence of a “problematic community of understandings” (Garfinkel 1967: 126). Unlike Goffman, Garfinkel does not use his study to produce concepts to describe the generic features of the interaction order, but he is concerned with the concrete and practical deployment of techniques and methods through which actions are locally organized.

Examples of the production of a local order can be found in his book on the ‘Ethnomethodological Program’ (Garfinkel 2002) where he describes tutorial exercises that encourage his students and readers to consider the practical production of order. Beyond the tutorial exercises Garfinkel rarely elaborates on the production of a local order in interaction between participants. [Footnote 3] Such a study has recently been published by Duck (2015) who conducted an ethnography of a local community whose everyday life is characterized by drug trade and gun violence. Whilst people who do not live in this community avoid entering it because they see it as a chaotic and dangerous place, Duck explicates the local order that its inhabitants produce and experience through their social practices. As an ethnographer Duck had to learn the principles of the local order to survive:

“I tried to walk around the neighborhood in such a way that I could make observations without being noticed by the dealers. I was especially careful not to do anything that would draw the attention of the more powerful dealers and suppliers. … On some streets I slowed my pace; on others I hurried. In accord with the code of the street, I never made direct eye contact with dealers, even when they were being helpful, and on the few occasions when we did cross paths I spoke only if spoken to”.

(Duck 2015: 43)

Through his ethnography that reminds us of Bittner’s (2013[1965]) “Larimer Tours” Duck has shown that social order is not a theoretical concept but an observable and recognizable feature of the social world. It is produced moment-by-moment in an intelligible way through social practices. Whilst in Goffman’s work we find a large number of concepts glossing features of social order ethnomethodological research points to the need for detailed analyses of concrete social practices through which local order is produced.

Ethnographies, such as Duck’s (2015) No Way Out, use ethnomethodology to explore the production of a local order. They however do not make use of Garfinkel’s concept of sequentiality to examine the organization of action in any detail. Such research was developed by Harvey Sacks (1992) and his students and colleagues (see Sacks et al. 1974; Schegloff 1968) who created the field of conversation analysis. More recently this field has been augmented by research that uses video-recording as principal data for the examination of the organization of social practices (see Heath et al. 2010; Mondada 2009).

Ethnomethodological Analysis of Interaction

For about 60 years sociologists who draw on Garfinkel’s ethnomethodology have explored “structures of social action” (Atkinson and Heritage 1985). These sociologists began their research with detailed inspections of talk. At the forefront of this research was Harvey Sacks who, initially together with Garfinkel, developed and applied conversation analysis to reveal the organization of action (Sacks 1966, 1992; Sacks et al. 1974). Conversation analysts consider utterances in talk, however small they might be, as action, and with Garfinkel they assume that the meaning of an action is constituted in, and through, the sequential organization of action. Thus, they further develop Garfinkel’s notion of the recursive relationship between actions by arguing that such an organization is the basis for meaning to arise moment-by-moment. In this view, meaning is not intrinsic to action, but it arises retrospectively in the context that is provided by the previous action and prospectively in that it creates the context for the next action (see Cicourel 1973; Heritage 1984).

The detailed examination of talk relies on audio-recording as principal data. Whilst the recording of actions creates distance between the ethnomethodologist and the field, the possibility to review short fragments of data repeatedly and to examine still frames provides ethnomethodologists with unprecedented closeness to the action. Recordings cannot replace researchers’ “existential engagement” (Honer and Hitzler 2015) with the field but they allow ethnomethodologists to reconstruct the prospective and retrospective orientation of action. Thus, the data enable the recovery of the participants’ attitude to the situation at hand.

The analysis necessitates a focus on short fragments of talk that are transcribed to aid the uncovering of the organization of utterances. The transcription helps the researcher to make intelligible the sequential organization of utterances as participants produced them (Hepburn and Bolden 2017; Jefferson 1984). Ethnomethodologists pursuing conversation analysis consider the participants themselves as conversation analysts who inspect each other’s actions as a basis for the production and design of their own actions. They do not observe the action as distant scientific observers but take the perspective of the participants and ask why an action has been produced in a particular moment, and why it has been designed in a particular way (Have 1998; Heath et al. 2010; vom Lehn 2018a, b).

Already in the 1970s Harvey Sacks realized that “[b]ody behavior in interaction also seems to be, in many respects, sequentially organized” (Sacks and Schegloff 2002: 136) and began to develop a system for the transcription of non-vocal action. It took, however, until the 1980s for the analysis of video-recorded interaction to become widely used by ethnomethodologists (see Footnote 4). The analysis of talk and bodily action was pioneered by Charles Goodwin (1981) and Christian Heath (1986). Since then, a burgeoning body of research on the organization of vocal, visual and bodily action has emerged, including studies of interaction in workplaces (Heath and Luff 2000; Szymanski and Whalen 2011) and public places like coffee shops and museums (Heath and vom Lehn 2004; Laurier and Philo 2007) as well as the analysis of interaction of mobile participants (see Haddington et al. 2013).

The studies examine interaction by exploiting the opportunities offered by video-recordings, including the possibility to repeatedly view fragments of interaction, the slow-motion function and the inspection of still frames (Heath et al. 2010; Knoblauch et al. 2015; vom Lehn 2018a, b). They are grounded in ethnomethodology and use the methodological tools developed by conversation analysts, in particular the use of transcripts as an aid to uncover the sequential organization of action. The analysis mostly begins with the transcription of talk and then maps participants’ bodily, visual and material action onto the talk in order to reveal how utterances are interwoven with non-vocal action and with aspects of the material and visual environment. The analysis results in detailed descriptions of how participants accomplish a sense of intersubjectivity in, and through, the organization of their action. In the following section, I will briefly discuss video-recorded fragments of interaction to reveal the interactional production of intersubjectivity when participants are concerned with what each other is seeing.

The Interactional Achievement of Visual Phenomena

People often meet in situations where the concerted seeing of events is important for them. Examples are all kinds of situations where multiple people witness the same situation or object, for example as an audience. When two people stand or sit next to each other they view the same object or event assuming they are seeing it in the same way. Or if they assume they have not seen it in the same way they often begin to engage in conversation. In this section I will discuss two fragments of interaction, one video-recorded in an art museum and the other in an optometric consultation. The analysis of both fragments is concerned with how the participants create a sense of intersubjectivity in and through their interaction.

Aligning Perspectives: Achieving Intersubjectivity

The following fragment was recorded in an exhibition that shows, amongst others, Rubens’ painting of the family of Jan Brueghel the Younger (Fragment 1; see Footnote 5). We join the interaction after the older lady on the right, Eva, who has previously inspected the painting and voiced her admiration for it to a companion, has turned away from the painting only to return a short moment later. At this moment, another of her companions, Maggie, who has read the label to the left of the piece and briefly glanced at the painting, begins to leave the exhibit (Image 1.1). As Maggie turns to her right, Eva arrives near her and encourages her companion to return to the painting by saying, “But I like that that’s Rubens with the Brueghel family” while gesturing with her stretched out right arm and index finger to the piece (Transcript 1, line 1, Image 1.2).

Fragment 1 (Figure 1): Two ladies, Eva and Maggie

A moment later, Eva and Maggie stand next to each other, both looking at Rubens’ painting. Eva continues her description of the exhibit without obtaining an audible response from her companion. By virtue of a brief pause after highlighting the quality of the painting, “before photographs were there” (line 2), Eva offers Maggie an opportunity to respond. Yet, when a response is not forthcoming, Eva expands on her description further by saying, “to bring them to life another painting you see painted” and puts additional emphasis on it by pointing to it (lines 2–3; Image 1.3). Only then, after another short pause, Maggie displays a response, “>yah<” (line 5), which brings the joint looking at the painting to a close.

Although both participants have looked at the same painting, their responses to it are very different. Maggie has read the label and only briefly glanced at the piece. Her companion appears to notice Maggie’s lack of captivation or excitement about the quality of the piece. She produces an expanded description that highlights the painting’s quality but still fails to elicit a response to the piece from Maggie. Even when the two participants stand next to each other and look at the painting, Maggie does not display a response that reflects an experience of similar quality to Eva’s. She stands still and looks at the painting but remains silent throughout her companion’s description, which is interspersed with pauses that provide her with opportunities to interject and voice her response to the piece. Only at the end of the encounter at the painting does Maggie say “yah” and thus display agreement with Eva’s evaluation of the piece. This is a moment in which the two participants achieve intersubjectivity; their perspectives on the work of art are, or appear to be, in alignment. However, because we do not know how the artist or the curator imagined an audience to see and experience the piece, it remains unclear if the participants also achieved “imagined intersubjectivity,” i.e., if their perspective on the work of art is in alignment with the perspective the artist or curator imagined viewers to adopt (see vom Lehn 2018a, b).

Although Maggie eventually produces a response that displays agreement with her companion’s description, Eva cannot be sure that Maggie has seen Rubens’ painting in the same way as she did. For the participants the uncertainty about an actual alignment of perspectives is unproblematic, and they rarely scrutinize each other about their experience. Instead, a relatively blunt display of agreement is sufficient to bring the joint examination of the exhibit to a close. If we have an interest in how people manage to see the world together in the same way, it might be worthwhile to look for a profession with an expertise in assessing what and how clearly other people can see (see Footnote 6).

Co-producing Optometric Intersubjectivity

In the United Kingdom, optometrists are a profession that specializes in uncovering the quality of their clients’ ability to see. They undertake a series of tests that allow them to progressively determine a quantitative score or metric that describes a client’s quality of seeing and, once the series of tests has been completed, to prescribe, if necessary, particular lenses to correct the client’s vision. One of the tests undertaken as part of optometric examinations is the Distance Vision Test. Many readers will know this test, which involves a standard chart showing rows of letters that become smaller from top to bottom. To the right of each line a figure like 3/6 or 6/6 is printed that indicates the visual acuity score, i.e., the metric that describes how clearly a client can see in the distance. A client who is able to read letters up to the line marked 6/6 has the same clarity of vision at the distance of six meters as a standard client; a visual acuity score of 3/6, by contrast, suggests that the client’s distance vision is half as clear as that of a standard client.

In Fragment 2 (see Footnote 7) we join a consultation when the optometrist has completed the interview and now moves to begin the Distance Vision Test. From the start of the test the optometrist uses formulations that allow the client to make mistakes or to say he is unable to read any letters, “if you can read anything on the middle li:ne” (Fragment 2, line 11).

Fragment 2 (Figure 2): Distance vision test, optometric consultation

As the optometrist encourages the client to read out letters from the middle line she leans slightly toward the client and holds an occluder over his left eye. While she completes her utterance the optometrist turns her eyes from the client to the monitor attached to the wall overhead and pulls up the letter chart she wants the client to read (Fragment 2, Image 1). The client, who sits upright with his face oriented to the screen where the letters have appeared, immediately responds to the optometrist’s request and produces a token “eh::” that prefaces the reading of the letters in an even rhythm, “eFf eNn Peeh Dee yoUh↓”. The optometrist acknowledges the client’s reading by turning from the screen to him, lowering her head to a nod and by saying, “thats great” (Image 2, line 13). She then does not bring the test to a close but continues it by encouraging the client to read with another formulation that allows him to make mistakes or to say that he is unable to see any of the letters clearly enough to read them out, “the bottom line at all?” (line 13). After a short hesitation (line 14) the client reads, “Peeh >eitcH or eNn< Deeh whY Zed”. The reading is accomplished quickly with a change in rhythm after the first letter. Through this change in rhythm and the particular vocalization the client displays uncertainty about one letter in the row, “eitcH or eNn” (line 16), before bringing the reading of the line to a close. Save for the display of uncertainty and mistaking a ‘V’ for a ‘Y’, the client displays confidence in reading out this smaller line of letters, encouraging the optometrist to acknowledge the reading as “great”. She then says she will pose the client “a bit of a challenge” (line 17) and changes the chart to show a set of smaller letters for the client to read. She asks him if he can read any letters from the smallest line in this chart, “so anything at all on that bottom line?” (line 18).
Knowing that the client might have difficulty reading this line of letters, the optometrist further qualifies the request by saying that, “you might not got (.) might not get very many” (line 18). Already in the middle of the production of the optometrist’s request the client begins a vocalization, “n::::::” that might be treated as the beginning of an attempt to read the line. He then firmly states that he is unable to read any letters on the line, “n:::::::::n::::::::::::::NO” (line 19). Subsequently the optometrist brings the test of the right eye to a close (line 20).

The fragment reveals the organization of actions through which the Distance Vision Test is undertaken. It involves the optometrist working to encourage the client to read rows of letters from the chart even if they have displayed difficulties in reading out a previous row. In formulating their request optometrists do take clients’ prior reading performance into account and allow them to make mistakes. Their interest is in identifying the line of letters that clients are unable to read or where they can read only a few letters. This allows them to transfer the visual acuity score from the chart to the client record form and add how many letters the client was unable to read from this line (vom Lehn et al. 2013). The score written in the record form enables optometrists to compare the client’s ability to see in the distance with a standard client defined in textbooks and by the creators of the vision chart.

Discussion: Two Kinds of Intersubjectivity

The analysis of these two fragments suggests how participants organize their actions in two institutional settings. In both settings the participants’ actions are oriented to a particular visual object. Participants standing at a painting in a museum display through their bodily positions and visual orientation that they are looking at a particular work of art together. For the Distance Vision Test in optometric consultations, optometrist and client focus their actions on the letter chart. What the participants in both settings actually look at emerges in interaction between them. At the painting, talk and gesture are used to reference aspects of the piece. Similarly, the optometrist uses talk and referential practice to highlight particular lines of letters for the client to read out. In the situation at the painting it was sufficient for the co-participant to voice a confirmation or agreement with her companion. The procedure of the Distance Vision Test puts certain demands on the client, who is requested to read out lines of letters to display that he is actually able to see what the optometrist asks him to look at. Unlike the voicing of a confirmatory “>yah<” in a museum, the reading out of letters makes seeing accountable. The optometrist can use the reading performance to produce a score that reflects the ability of the client to see in the distance.

Coupled with these differences in the local order of the interaction in the two settings is the observation that in both cases the participants align their visual orientations and establish a sense of intersubjectivity that serves the purposes at hand. As part of the Distance Vision Test, clients’ reading performance is assigned a visual acuity score that embodies a theoretical standard of clients’ clarity of vision. Resulting from the process of the test therefore is an optometric intersubjectivity (vom Lehn 2018a, b) that defines the distance from where the client can see the letters as clearly as the standard client. The score indicates how close or far the client has to come to the letter chart to be able to maintain the assumption that her/his standpoint in principle is interchangeable with that of the standard client (and that s/he adopts the same system of relevances to the situation as the standard client; see Footnote 8).

The detailed scrutiny of fragments suggests that the local order of the organization of the participants’ actions underpins the possibility of the emergence of intersubjectivity. If we return to Garfinkel’s (2006[1948]) metaphor of “experiment in miniature” for a moment, we can argue that in museums through each vocal or bodily action participants “test” each other’s seeing of an object. The participants mutually entertain the expectation that others will respond to each other’s action in a particular way. By virtue of vocal and/or bodily action participants display how they orient to and experience the work of art in light of the co-participant’s action. Thus, they generate a sense of a mutually aligned orientation to the painting.

In the Distance Vision Test we can take Garfinkel’s metaphor literally. The optometrist literally tests if the client sees the letters on the chart. The purpose of the sight test, however, is not to establish intersubjectivity between the client and the optometrist, but between the client and the standard client. By turning to the chart and reading out the requested lines of letters the client aligns with the optometrist’s orientation, and practical intersubjectivity is achieved. As the reading from the chart is brought to a close the optometrist uses the information gauged from the client’s actions and is able to establish the visual acuity score and therewith optometric intersubjectivity between the client and a standard client. Optometric intersubjectivity implies that the geographic locations of client and standard client are interchangeable and that when in an optometric consultation they both could approach the sight test with the same system of relevances.


This article has argued that over the course of his career Garfinkel developed a distinctive sociological attitude that provides the starting-point for the emergence of ethnomethodological analyses of interaction. At the heart of his œuvre is the question of the possibility of social order. With the pursuit of this question Garfinkel investigates an issue that has been one of sociology’s foundational problems. Here, I have briefly explored how Garfinkel developed a solution to this problem that focuses on the social practices through which order is intelligibly produced moment-by-moment. For ethnomethodologists, order always is local order produced in a particular moment.

Drawing on Garfinkel’s sociological attitude, studies have emerged that use ethnographic observations and participation in the field as well as audio-/video-recordings to reveal the local order of interaction. These studies have begun to reveal some of the constitutive practices through which actions are momentarily organized (see Korbut 2014). In this article, I have examined interaction in two different institutional settings. In both settings the participants achieve intersubjectivity, i.e., the participants are able to align their perspectives on a visual object and accomplish a moment in which they both display how they see it. Only in the optometric consultation, however, does the analysis also suggest that a different kind of intersubjectivity is achieved that is related to the institutional context where the interaction takes place. The participants achieve optometric intersubjectivity when the optometrist uses the reading out of letters to align the client’s perspective with that of the standard client by assigning the reading performance a visual acuity score. In the museum, by contrast, the achieving of intersubjectivity with the artist and/or designer is not observably produced.


Over the past couple of decades, conversation analysis and the analysis of interaction based on video-recordings, often called “multimodal analysis” (Deppermann 2013; Mondada 2016), have been expanding. This expansion is partly due to the uptake of conversation analytic methods and techniques by disciplines other than sociology, including linguistics. Ethnomethodology has not seen a similar development but instead has influenced applied areas, such as the computer sciences. At the same time, ethnomethodologists occasionally criticize some developments in conversation analysis for ignoring its ethnomethodological heritage, arguing that they rely all too much on abstract analytic constructions (see Footnote 9).

Controversy about ethnomethodology from within and without the field is nothing new (see Hammersley 2018; Maynard and Clayman 1991; Ruggerone 1996). Rather than problematizing these debates we can also interpret them as a further normalization of the field that after its revolutionary beginnings now increasingly meshes with sociology more widely. With regard to conversation analysis we currently observe the emergence of a variety of conversation analytic approaches, some more closely related to ethnomethodology, others moving away from it and becoming more scientific in design. At the same time, new varieties of ethnomethodology are emerging as well. Examples of these developments are novel research in science and technology studies (Alač 2011; Sormani 2014), organizational analysis (see Llewellyn and Hindmarsh 2010), practical mathematical reasoning (Greiffenhagen and Sharrock 2011; Greiffenhagen 2014) and many others that deploy a range of methods of data collection and analysis to reveal the interactive achieving of social phenomena. Over the coming years it will be the task of ethnomethodologists and conversation analysts to jointly further develop their field and enhance its relevance for studies of interaction in sociology and other disciplines.