1 Introduction

The concept of culture and what constitutes it can be interpreted in many different ways. Some immediately think of languages or countries; others may use it to refer to books or films. Many academics have attempted to formulate the concept of culture. The book Redefining Culture (Baldwin 2006) lists 313 definitions from different disciplines including psychology, sociology, linguistics, anthropology, political science and philosophy, to name only a few. Smith (2016) argues that the plethora of definitions causes conceptual incoherence. Definitions range from those that define location (Smith 2000), to shared symbol and meaning systems (Berger 1995; Morawska and Sphon 1995; Matthews 1996; Spillman 2002), to genetic or social heritage (Greenfeld and Malczewski 2010; Hofstede 2001). However, when it comes to introducing culture into social robotics, this concept is commonly reduced to only one interpretation: nationality-based macroculture.

Our argument is that defining culture this way is problematic for two reasons. First, relying on a definition of culture based on national traditions and macrostructures creates a methodological roadblock for cultural robotics. This requires culture to be interpreted as a set of rules to be encoded into a robot rather than emphasising cultural learning. Examples of such rule-based cultural encoding can be seen in (Wang et al. 2010; Eresha et al. 2013; Khaliq et al. 2018; Trovato et al. 2015), to name only a few. Second, the focus on national culture obfuscates the nuances and complexity of culture, for example subcultures (Martin et al. 2002). Again, this hinders the development of culturally competent robots as small groups with their own subcultures can differ significantly from national cultures. Additionally, there is an ethical concern, namely that groups that do not fit into traditional national culture definitions are excluded. These individuals are often marginalised groups such as refugees and immigrants. Excluding these individuals can further ostracise these communities from cultural robotics research. Inclusion of this type is critical to ensure that such robots can be adequately deployed for use in places such as transit hubs, immigration bureaus, visitor centres, and city government buildings. In these places, robots could be used to either provide basic information or perform certain administrative tasks.

Given that the assumptions made about culture in the existing literature are inadequate, we propose to widen the understanding of culture within robotics to encompass the emergent nature of culture. This paper argues that the concept of culture in robotics should be revised based on current research in other social and cognitive sciences. We propose a definition that borrows foundational principles from Embodied Cognition, Ecological Psychology, and Dynamical Systems research (Calvo and Gomila 2008; Lobo et al. 2018; Shapiro 2014). Our proposal is that culture should be thought of as an emergent phenomenon that arises from interactions, and that subcultures belonging to localised groups, rather than national culture, should be the subject of cultural robotics. As a result, we contend that what matters for cultural robotics is, at least initially, not a ‘culture’, but the social dynamics and social learning that lead an agent to mutually participate in and co-create culture.

This paper will argue for and justify the need for a new definition of culture, describe our proposal for an emergent understanding of culture, and explain how our proposal addresses the problems we identify with current definitions of culture. First, we will discuss current research in cultural robotics and demonstrate that ‘culture’ is generally interpreted as ‘national culture’. We will then critique the current definition and call for a new one. We argue that, when we consider definitions of culture that go beyond national culture, it becomes clear that culture is not simply a collection of facts in a knowledge base or a set of norms that guide behaviour. Instead, culture is a phenomenon that emerges from interactions between agents in their environment. In addition, we will discuss how viewing culture as nationality is problematic and inherently exclusionary. We argue that a nationality-based definition excludes refugees and stateless persons, and that culture is often simplified to nationality, which fails to isolate culture from politics and economics. These problems can lead to the marginalisation of minority cultural groups and pave the way for future social robots that serve an already privileged few.

Once we have demonstrated the need for a new definition, we will review a variety of definitions in the social sciences literature. Research demonstrates that there are six general categories of definitions for ‘culture’ that are used throughout the social sciences (Kroeber and Kluckhohn 1952). There is a lack of consensus as to what ‘culture’ is, indicating that researchers often define culture via their investigative domain. While some may see this as scientifically suspect, that issue is beyond the scope of this paper, as many definitions are commonsensical; for example, defining language and slang terms as cultural seems appropriate. While each of these definitions has a pragmatic use for other areas of research, we wish to refocus the question to: ‘what matters about culture for cultural robotics?’ We argue that none of the definitions is sufficient and that culture should be thought of as an emergent phenomenon rather than employing one or a combination of the six major categories of definitions. The need for a new definition is motivated not only by our earlier objections, but also by the fact that current definitions of culture are not responsive to contemporary theories of cognition, namely embodied cognition, ecological psychology, and dynamical systems theory.

Our positive proposal begins by generally defining emergence and the consequences of such a definition for cultural robotics. Our proposed definition relies on key principles from each of the three aforementioned research programmes. We do not claim to endorse each of these theories in their entirety, but only that they each yield important insights on how to continue research into cultural robotics. We leave open the possibility that each of these research programmes may yield different projects in cultural robotics; however, we maintain that insights from each are needed to advance cultural robotics.

Given the emergent nature of culture, it is not adequate for a robot to merely possess knowledge of the rules and norms of a certain culture to engage in culturally correct interactions. Instead, robots must become participants in the process that leads to the emergence of culture. Consequently, it is important to establish not which knowledge a robot should have, but which capacities a robot must possess to successfully engage in such interactions. In our paper, we provide such an analysis. The position we put forward, along with our analysis of which capacities a robot should have to participate in the emergence of culture, lays the foundation for a new roadmap for research in cultural robotics. This also guides us to explore which technical approaches are appropriate to realise the vision of cultural robotics we advance in this paper.

2 Background to cultural robotics

Cultural robotics is widely considered a sub-field of social robotics or the broader area of human–robot interaction (Koh et al. 2015). The complexity of what constitutes culture, what being social means and how they relate to robotics have made the definition and distinction between social and cultural robotics highly debatable topics (Koh et al. 2015). The earliest example of a robot in a cultural setting goes back to 1964 when “K-456” was built by Nam June Paik and Shuya Abe as a radio-controlled anthropomorphic robot that played a recording of John F. Kennedy’s inaugural address and excreted beans. In the early years, cultural robotics mostly focused on robot acceptance within a particular culture. In 2010, under the influence of Šabanović’s publication (2010) “Robots in Society, Society in Robots”, questioning ‘technological determinism’ (MacKenzie and Wajcman 1999), in which society has a passive role in the technology/robot design process, as well as promoting co-design approaches, became prevalent in cultural robotics. Following this trend, Samani et al. (2013) explored the process of culture formation between robots and humans based on the cultural values of the robotics developers, diversity of cultural communities, and the learning ability of robots. The most recent study on designing “culturally competent robotics” belongs to a wide range of work done in the CARESSES project (Bruno et al. 2019; Khaliq et al. 2018). CARESSES (Culturally-Aware Robots and Environmental Sensor Systems for Elderly Support) is an international multidisciplinary project whose goal is to design the first socially assistive robots that can adapt to the culture of the older people they are taking care of. The framework developed in this project alleviates national stereotypes modelled in a logic-based language using a Bayesian approach to learning human preferences.
In the next section, we explain how confounding nationality and culture is the dominant view in cultural robotics.

3 Nationality and culture

In a recent review article, Lim et al. (2020) analysed 50 studies on the intersection of culture and social robotics. In all studies, culture was understood as “culture as national culture—values, norms, and practices that are undertaken by a country”. Although it was the authors' intention to focus on this particular interpretation of culture, to the best of our knowledge challenging this understanding remains a minority and underexplored approach. We argue that research in cultural robotics is actually subsumed by research in social robotics. As a result, we advocate reframing how social robotics understands the dynamic relationship of human–robot interactions applied to cultural interactions. Thus, we want to explicitly focus on the impact culture has in cultural robotics.

In general, we can look at culture within robotics from two important perspectives: culture in specific interactions, and the interplay between culture and robotics at a wider scale. Within specific interactions, the primary concerns of roboticists centre on leveraging cultural knowledge in the production of intelligent behaviour in interactions with humans (Wang et al. 2010; Bruno et al. 2019). At a wider scale, the key concerns are the impact of culture on perceptions of robots, trust, and the reciprocal impact robots have on the cultural environment in which they are situated (Baker et al. 2018; Nomura 2017). The current definitions and assumptions of culture presently used in robotics are problematic from both of these perspectives. In the following, we present some key problems with defining culture as national culture.

3.1 National definitions in cultural robotics

When defining culture as national culture, many studies fail to separate culture from other aspects of nationality. Although this limitation is often acknowledged, a common theme in social robotics papers that reference culture is the investigation of perceptions of social acceptability and trust of robots. Lim et al. (2020) examined 18 papers which follow this theme in their recent exhaustive review article. Examples include Kamide et al. (2017), Bajones et al. (2017), Belpaeme et al. (2013), Ros et al. (2011), Shiomi et al. (2017), Li et al. (2010), Lee et al. (2014), and Rosenthal-von der Putten et al. (2015). These authors rightly identify culture as a key factor influencing perceptions of robots. To investigate this they typically include in their experiments participants with a variety of nationalities, assuming that this is sufficient to demonstrate the influence of culture. Underlying this is the tacit definition of culture as nationality.

Even if we accept a definition of culture as national culture, the above move is still unconvincing. Supposing that including participants with different nationalities illustrates the influence of culture on experimental results assumes that culture is the only causally efficacious component of belonging to a certain nationality. In fact, the interactions that a person with a certain nationality has with a robot can be influenced by factors aside from culture, for example the economic and political circumstances in a particular country. An experiment may find that participants from country X have a more negative perception of robots than participants from country Y, and conclude that this illustrates the impact of the national culture of country X. This is erroneous because it ignores potential factors such as the health of the job market, which could lead to worries about job automation and thus impact perceptions of robots. In essence, equating culture with nationality fails to isolate culture as a contributing factor in perceptions of robots.

3.2 Ignoring subcultures

Not only do we argue that the current emphasis on national culture is problematic for holistic methodological reasons, but also for fine-tuned methodological reasons. Treating cultures as homogenised, national entities ignores the nuances that emerge as part of subcultures (Haenfler 2014; Harris 2012). Subcultures are groups that off-shoot from a larger group and form a more specific identity within the broader group (Clarke et al. 1976; Cohen 1997; Lieske 1993). Examining subcultures allows researchers to reframe what matters about culture. For example, England has a national English culture, but Manchester has a specific city culture that differentiates it from Leeds, Newcastle, or Bristol. Even within Manchester, other distinct cultures arise, such as the cultural norms and chants that distinguish Manchester United from Manchester City football supporters. Within each subgroup a subculture develops with its own norms, rituals, language, attitudes, and customs. In addition, a nationality-based view ignores certain aspects of how individuals experience culture. For example, an individual who is raised in Los Angeles, California, who has a parent raised in Vancouver, Canada, and another parent from Oaxaca, Mexico, has multiple cultures affecting their individual development. This individual does not experience their environment as simply being in “US” culture, but nor is it purely “North American”, Mexican, or Canadian culture. The individual develops a sense of their cultural context from the multi-cultural context within which they develop.

The focus on subcultures creates a different picture of what culture is and how it develops. Rather than taking a national-essentialist approach, it allows researchers to see culture as dynamically emerging from the interactions of individuals. These interactions are not governed by some laws of culture, but rather are informed by a culture that is created by the interactions. Recent work on personal identity makes a similar shift where individuals report that large macro-cultures are not sufficient to explain or categorise their individual experiences (Binning et al. 2009). Recognising only national culture and neglecting subculture constitutes an important knowledge gap that must be plugged if a robot is to produce behaviour that is culturally consistent with, and recognises, the diverse groups that live in our society.

A focus on subcultures allows researchers to change how a cultural problem is understood and discussed. For our purposes, the major question is how to understand, classify, and program human cultural behaviour into a robot. The current literature takes a very broad approach and attempts to do this on a national scale, as illustrated clearly in the 50 studies reviewed by Lim et al. (2020). The national-scale focus does two problematic things. First, it attempts to generalise individual behaviours to massive groups where there is a great deal of variation. For example, it may be common for Americans to use handshaking as a greeting, and that may be a standard greeting; however, handshaking behaviour is not monolithic: handshakes that move into embraces or shoulder pats also exist. Our claim is that these types of behaviours are best understood on a subcultural microscale where patterns can easily be identified, rather than on a national macroscale.

3.3 Excluded populations and cultural nuances

When culture is understood and generalised nationally, the subcultures that are often excluded are those of minority, immigrant, and historically marginalised groups. Confounding culture and nationality not only ignores marginal cultures within a nation-state, but also fails to recognise those that fall outside of the definition of nationality, e.g., stateless persons and refugee seekers. As we have observed, this exclusion has profound consequences: (a) it indirectly implies that a person whose residency status is not officially recognised due to a wider political conflict, e.g., war, becomes automatically “culture-less” in the eyes of robots we develop, (b) it also promotes the need for assimilation and abandonment of historical culture in favour of a generalised culture, and (c) it ignores the positive and transformative role that other cultures can have on a national or macroculture. It also means that our designs are entangled in a wider political agenda set by historical events (e.g., the Israel–Palestine conflict in which some Palestinians are stateless), or immigration offices (who very well may have strict border policies). As we have discussed, this type of exclusion results from the focus on national culture. In addition, it complicates issues within cultural robotics that abstract away from the individuals that culturally sensitive robots can serve.

In short, a majority of studies in cultural robotics define culture as national culture. This definition ignores subcultures, does not recognise the dynamic nature of culture, and threatens to further marginalise already vulnerable groups. To avoid these issues it is important for roboticists to consider conceptions of culture from the social sciences and contemporary cognitive science.

4 Grounding the concept of culture

Thus far, we have discussed three major reasons to reject a nationality-based definition of culture. The first is a methodological claim that a national culture ignores subcultures, or cultures that emerge on a more local scale. This causes the research to be abstracted away from individual interactions and to rely on an interpretation that culturally appropriate behaviour is a set of rules to be followed rather than successful interaction among agents. Second, we have argued that certain cultural nuances, such as bi-cultural and multicultural contexts, are ignored. This leads to cultural robotics research being too general and not specific enough to any individual agent. Third, we make an ethical claim that to ensure equal and equitable participation, culture cannot be understood via nationality. National culture definitions exclude small populations, especially historically marginalised groups. As a result, we need to provide a new definition of culture. We will first broaden our search to other disciplines that study culture, namely those in the social sciences and humanities.

4.1 Definitions of culture

Culture can be defined in many diverse ways. Baldwin et al. (2006) collate 313 definitions used in the social sciences and humanities. They argue that there are six general categories of definitions:

  1. Enumerative descriptive (contentful, inclusive list)
  2. Historical (emphasis on social tradition and collective history)
  3. Normative (focus on common values, ideals, and behaviours)
  4. Psychological (developmental trajectory, habits, thought patterns, problem-solving strategies, etc.)
  5. Structural (organisational structures and patterns)
  6. Genetic (symbols, artifacts, and ideas)

Each of these categories serves a specific purpose for each researcher’s corresponding aims. Enumerative definitions define specific sets of behaviours, ideas, beliefs, or practices. These differentiate one culture from another, and are often used to differentiate or identify cultural boundaries, e.g. what separates the cultures of Christian and Hindu Indians. Historical definitions discuss the passing of specific behaviours or ideas across generations and are often the focus of sociological and anthropological studies, for example, where researching the origins of ninth century Viking burial ritual is defined as cultural research. Normative definitions identify the proper and appropriate behaviours and ideals of members of a group, so that proper adherence and common behaviour constitute a culture. Psychological definitions view culture as shared purposes and psychological properties, such as seeing the ‘American dream’ of being able to change class and make money in the United States as a normal cultural attitude about work and merit. Structural definitions interpret culture to be organisational, where similar social and familial hierarchies in Medieval France indicate a unified culture. Genetic definitions interpret culture as a type of living organism or narrative that has a causal development trajectory, such as studying Chinese culture by researching the evolution and change of Chinese characters through the tenth and eleventh centuries. None of these definitions is necessarily incorrect; however, they show that ‘culture’ has no singular definition and that ‘studying culture’ could mean studying anything from table manners to sports fandom. Culture has such a wide range of definitions that what matters about culture for the project at hand determines which definition a researcher will use. For a majority of the work in cultural robotics, culture is understood as and equated with nationality.

Culture as nationality, the most common definition in cultural robotics, tends to be a combination of the enumerative, normative, and psychological where the goal is to identify and isolate behaviours that are directly linked to a geographic region and common to human–human interactions within that area, and then enable artificial agents to participate in those interactions as a human would. We, however, think that these are insufficient for the aims of cultural robotics. If each definition of culture is used for specific research ends and defines what matters about culture for each research project, then cultural roboticists must similarly ask ‘what matters for cultural robotics?’

We argue that culture cannot be fully captured using any of the above categories, and should rather be viewed as an emergent phenomenon. Emergence, simply stated, is when a whole is more than the sum of its parts. For example, think of a flock of birds. When birds fly together in a flock, their flight patterns and behaviours synchronise as though they are all a connected unit, even though there is no single bird leading the flock. A description of each individual bird’s flight behaviour, however, does not provide the full description of the dynamics of the flock. The descriptions of the individuals are insufficient to describe the behaviour of the group. We can broaden this to include independent agents in grouped, or coupled, interactions. We argue that culture should be understood similarly. Rather than seeing ‘culture’ defined as an enumerated list of behaviours or psychological traits that individuals adhere to as they learn social standards, we should think of culture as the product of aggregate social interactions and dynamic learning.
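
The flock example can be made concrete with a toy simulation. The following is a minimal sketch under stated assumptions, not a model drawn from the literature cited here: each bird only averages its heading with its two neighbours on a ring, headings are kept within a narrow range so that plain averaging is meaningful, and the function names (`alignment`, `step`, `simulate_flock`) and parameter values are hypothetical choices made for illustration. No bird leads, yet a group-level property (overall alignment) arises that is not present in any individual rule.

```python
import math
import random


def alignment(headings):
    """Group-level order parameter: length of the mean heading vector.

    A value near 1.0 means the flock is flying in nearly one direction.
    """
    n = len(headings)
    x = sum(math.cos(h) for h in headings) / n
    y = sum(math.sin(h) for h in headings) / n
    return math.hypot(x, y)


def step(headings):
    """Each bird averages its heading with its two ring neighbours.

    The rule is purely local: no bird sees the whole flock or leads it.
    """
    n = len(headings)
    return [
        (headings[i - 1] + headings[i] + headings[(i + 1) % n]) / 3
        for i in range(n)
    ]


def simulate_flock(n_birds=20, steps=200, seed=1):
    """Return flock alignment before and after repeated local interactions."""
    rng = random.Random(seed)
    # Assumption: headings (radians) start within [-1, 1], so simple
    # arithmetic averaging is valid and angle wrap-around never occurs.
    headings = [rng.uniform(-1.0, 1.0) for _ in range(n_birds)]
    before = alignment(headings)
    for _ in range(steps):
        headings = step(headings)
    return before, alignment(headings)


if __name__ == "__main__":
    before, after = simulate_flock()
    print(f"alignment before: {before:.3f}, after: {after:.3f}")
```

The point of the sketch is that the alignment score describes the flock, not any single bird: it can only be computed, and only changes, at the level of the interacting group.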

Current definitions of culture fail to properly appreciate the nature of culture as an emergent phenomenon. Each of the current definitions is operationally defined, or defined within a specific research goal. We claim that the understanding of culture that robotics needs cannot be achieved with the above definitions. An emergent definition requires a different category, and in many ways, breaks the mould of current attempts to study culture. What matters for robotics is not a list of behaviours, a social history, nor a syntactic symbol system, but an understanding of how an individual agent, human or artificial, learns cultural norms, behaviours, and standards. This learning starts at the level of interaction and participation, meaning that the only way to understand how a robot can participate in a culturally appropriate interaction is to focus on how an agent learns to do so in a new culture. Essentially, the key to understanding cultural behaviour is not to start by assuming or positing that a specific culture exists; it is to return to basic social robotics and focus on seeing culture not as a rule-based super-structure, but as a mutually constituted social system that emerges from interactions of agents within a community.

Viewing culture as emergent has several consequences, all of which we think are key to advancing research in cultural robotics. The first is that culture can be understood holistically. Rather than restricting culture via one definition or exploding the research to include all definitions simultaneously, an emergent view places the focus on how individuals interact at a basic level and how these interactions inform how each agent then interacts with others within a community. Second is that culture is seen as a by-product of interactions rather than assuming a static culture is already structuring interactions. This allows shifts and changes in social standards of behaviour, action, and speech to be dynamically modelled, meaning that culture can be understood dynamically throughout a research project rather than requiring preset parameters that cannot be updated. Third, using emergence allows us to better understand the influence of culture on an individual. One example of a now explainable phenomenon is code switching. Cultural code switching is where an individual who is a member of multiple cultural communities is able to switch between culturally appropriate responses rapidly and reliably; this gives the appearance that an individual is able to rapidly identify and change ‘cultures’ based on the needed response (Molinsky 2007). If culture and cultural behaviours are thought of as emergent, then there is no need to think of an agent as actually switching between cultures at a moment's notice. In reality, cultural behaviour emerges as an opportunity to coordinate actions with others. Thinking of culture as an overarching system becomes obsolete, and the individual action becomes the only phenomenon that requires explanation.

Thus, we claim that for cultural robotics to advance further, we need to stop thinking about culture as other social scientists think about it, and research culture as it emerges between two agents. To be clear, we do not mean to suggest that language, social history, or common values are irrelevant to cultural robotics. We simply argue that viewing culture only within these boundaries greatly limits the discipline. For example, if a German person moves to Osaka, Japan, they would be required to learn how to appropriately interact with others in ways that respect Osakan culture. To do this, the individual needs to practice, learn, and adapt their behaviour to determine the proper cultural interactions. Even if the individual learns certain rules, making culturally appropriate behaviour habitual requires repeated interaction and practice. For the German to learn a culturally appropriate Osakan greeting requires such practice, but does not necessarily require knowing the social history of Japan, the social history of Osaka, the Japanese language, the common views on greetings held by Osakan people, or the customary Osakan dress; all that is required is for the German individual to learn, practice, and adopt the appropriate greeting behaviour by participating in social interactions in Osaka. Before we continue with our defense of why an emergent definition is best for cultural robotics, we will first discuss what emergence is and how we arrived at this application of emergence.

4.2 Culture is emergent

Emergence is typically understood in two senses, weak and strong; we claim that culture should be thought of only as a weak emergent phenomenon. Weak emergence is where the emergent phenomenon is constituted by and generated from underlying processes of a thing, but is also autonomous from the thing it emerges from (Bedau 1997), such as a tornado. A tornado is a mass aggregation of air, water, and other meteorological forces, yet once a funnel forms, it behaves as though autonomous from the area around it with similar meteorological characteristics.

Emergence, therefore, should be understood as a product of the interactions of several individual component parts. Emergence can occur when any two systems pair, where a system can be any semi-autonomous, interacting entity. Coordination is often a key ingredient of emergence. Consider, for example, two people walking in opposite directions who attempt to pass each other in a narrow hallway. To pass without bumping into each other, the two individuals must coordinate their motor behaviours. These types of social interaction provide an opportunity to successfully or unsuccessfully coordinate. If one individual turns to their left and the other turns to their right, the two collide. This, of course, is an unsuccessful coordination scenario. While unsuccessful, it provides information to the agent(s) so that the unsuccessful experience can be used as information in an attempt to coordinate properly in another interaction. This is a form of adaptive learning that is done dynamically, or in real time. If two individuals fail to coordinate passing each other in the hallway, they are unlikely to simply give up and turn back the way they came; instead, they will likely engage in another attempt at behaviour coordination. The previous failed coordination attempt informs this next attempt.

Repeated coordination attempts, both successful and unsuccessful, facilitate a type of dynamic learning. We argue this is the foundation of cultural learning. Take the two individuals passing each other. One is from Australia, the other from Mexico. Each individual might be able to successfully coordinate the passing behaviour with individuals from their respective country, as others are accustomed to advancing forward in the same direction. However, since Australians drive on the left side of the road, while Mexicans drive on the right side, this changes how each individual will attempt to coordinate with the other, and may cause bumping. Additionally, if there are two individuals from Mexico, but they are both in Australia, one may attempt to adapt their behaviour given the environmental context, which may result in another failed coordination. When successful coordination occurs, it emerges and is co-created by the entities participating in that coordination.
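
The hallway scenario can be sketched as a repeated coordination game. The sketch below is illustrative only and rests on assumptions that are not part of our argument: the update rule is a simple "win-stay, lose-shift" heuristic chosen for brevity, the 0.5 switching probability is arbitrary, and the names `flip` and `simulate_passing` are hypothetical. Following the description above, the two walkers pass successfully when both keep to the same side from their own point of view, and collide otherwise.

```python
import random


def flip(side):
    """Return the opposite side."""
    return "right" if side == "left" else "left"


def simulate_passing(side_a, side_b, rounds=50, seed=0):
    """Simulate repeated attempts by two walkers to pass in a hallway.

    Each walker keeps to their own "left" or "right". Facing each other,
    they pass when both choose the same own-side and collide otherwise.
    Assumed learning rule (win-stay, lose-shift): after a collision each
    walker independently switches side with probability 0.5; after a
    successful pass neither changes anything.
    """
    rng = random.Random(seed)
    history = []
    for _ in range(rounds):
        success = side_a == side_b
        history.append(success)
        if not success:
            # The failed attempt informs the next one.
            if rng.random() < 0.5:
                side_a = flip(side_a)
            if rng.random() < 0.5:
                side_b = flip(side_b)
    return history, side_a, side_b


if __name__ == "__main__":
    # Walker A habitually keeps left, walker B habitually keeps right.
    history, a, b = simulate_passing("left", "right")
    print("collisions before a shared convention:", history.index(True))
    print("convention reached:", a)
```

Which side the pair settles on is not fixed in advance by either walker's habit; it is an outcome of the interaction itself, which is the sense in which the convention is co-created.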

We argue that this process is how culture emerges. An individual participates in and is a co-creator of culture as that agent interacts with others. Culture is not some static, overarching set of rules, but rather is a product of social interactions and emerges from the aggregation of interactions from agents within a community. Emergence also requires a view of culture that is community focused rather than nationality focused. Culture emerges from individual interactions and is a localised, not national, phenomenon. This means that cultural robotics should think of social interactions as the main pathway to understanding culture.

4.3 Social and cognitive science influences

Our definition is based on three key principles from different social sciences. These help us define and outline culture as emergent. From Embodied Cognition, we borrow the basic principle that an agent engages in constant sensorimotor tuning with their environment. This means that sensorimotor capabilities are the basis for any type of learning and ‘knowledge’ production. Rather than treating knowledge of the environment as an encoded list of details, this view recognises the agent as embedded in a space as a whole body and not simply as a computational processing apparatus. This emphasises that interacting with the environment is a dynamic process essential to any type of learning. From Ecological Psychology, we borrow a basic version of Gibson’s theory of affordances (Gibson 1966). The theory argues that agents do not simply passively take in sense data, but rather perceive opportunities for action within an environment, meaning that perception is inherently interactive. From Dynamical Systems theory, we borrow the principle that chaotic systems can coordinate to produce modellable behaviour. This emphasises that seemingly disparate systems can actively participate in creating emergent coordination.

Embodied cognition, ecological psychology and dynamical systems also extend their reach beyond human psychology, which makes them viable and mutually compatible areas of research to draw on. Ecological principles have been applied to animals (Corris 2020), plants (Raja et al. 2020; Calvo and Keijzer 2010), and robotics (Lamb et al. 2017). This allows us to further adapt them to cultural robotics.

4.3.1 Embodied cognition

Embodied cognition rejects the traditional computationalist idea of cognition largely being constituted by a series of computations constrained inside the skull. Instead, embodied cognition recognises the contribution that our bodies, the environment, and social interactions make to our cognitive processes. An individual’s physiological characteristics also encode that individual’s ‘knowledge’ (Calvo and Gomila 2008; Chemero 2009; Shapiro 2014). This is illustrated by a simple example from developmental psychology. Somerville et al. (2005) show that 3-month-old infants are only able to pay attention to adult goals such as grabbing objects after they themselves have learned how to grab and hold objects. In other words, knowledge arises from sensorimotor capabilities. Given this, from the embodied cognition perspective, culture is ‘the tuning of sensorimotor systems for situated action’ (Soliman and Glenberg 2014, p. 209). That is, rather than being rules, norms, and values at the social level, culture in fact arises from the ways our sensorimotor systems attune themselves to act within particular environments and social groups. In short, culture, at least in part, arises from the body.

Soliman and Glenberg (2014) assert that while culture is a useful tool of analysis at the sociological level, culture is not something that exists independently of low-level learning and behaviour. Instead, culture comprises an individual’s sensorimotor tuning that arises from and is guided by social interactions. To illustrate the point, the authors use the example of interdependent vs independent selves. In individualistic societies, people develop a more independent conceptualisation of the self, centred on freedom of choice and personal achievement, whereas individuals in collectivist societies are more likely to have an interdependent concept of self that stresses group-connectedness and social harmony. Soliman and Glenberg’s vision of embodied cognition explains this cultural difference as a difference in sensorimotor interactions that arise from different patterns of social interactions in different environments.

4.3.2 Ecological psychology

Ecological psychology proposes that the role of the environment has been highly underestimated in psychology research, and that the environment should be thought of as a dynamic factor in human cognition (Gibson 1979; Heft 2001; Richardson et al. 2008). Gibson’s Theory of Affordances plays a large role: agents do not simply perceive the world as raw sense data, but rather as opportunities for interaction. Gibson notes that the agent’s cognitive history (sensory, behavioural, social, etc.) helps mediate what opportunities for interaction are perceived in a given environmental setting (Chemero 2003; Turvey 1992). For example, imagine a classroom of 21 chairs, where 20 chairs face a single chair, which in turn faces the other 20. If a person walks into this classroom as a student, that person will choose one of the 20 chairs facing the single chair, rather than the single chair facing the other 20. That same person may later enter the exact same room as the instructor; as an instructor, that person will choose the single chair that faces the 20 others. The critical takeaway is that physical properties alone do not determine affordances; rather, perceptions of affordances depend on the complete environment–agent relationship.

We apply Gibson’s basic argument to cultural robotics. A robot can produce the appropriate cultural behaviour only if it has the relevant cognitive states and the relevant interaction history. This emphasises the role of both the environmental setting and the unique interaction each individual has with said environment. In the classroom interaction, the same individual interacts with the same space. There is no physical change to either; the only change is in the cognitive states (goals and understood role) of the individual regarding their interaction with the classroom. The individual dynamically couples with the environment, that is, has a specific set of interaction opportunities corresponding to agent state and environment, each time that agent needs to interact with said environment. In ecological psychology, this yields the common adage that the same person never steps into the same river twice; each interaction, or coupling, is a unique opportunity for an agent to interact with their environment. As the agent and environment change, the coupling changes. Agents and their environment are, therefore, mutually informing: a change in one constitutes a change in the other.

4.3.3 Dynamical systems

Social Science applications of Dynamical systems research emerge from Chaos Theory in Mathematics and Physics and incorporate certain elements of Gibson’s Ecological Psychology (Richardson et al. 2014; Valacher and Nowak 1994). Dynamical systems theory looks at how systems coordinate their behaviour to complete tasks, where a task is any general goal of an interaction. Dynamical systems theory, similar to Ecological Psychology, emphasises the importance of environmental factors on an agent’s ability to act in a given situation. Dynamical systems researchers use differential equations to model the coordination behaviour of systems (Ramenzoni et al. 2011; Riley et al. 2011). Examples include how the cardiovascular, visual, and musculoskeletal systems coordinate to throw a ball, how ground, air, and water temperatures interact to form a hurricane, and how two individuals talking coordinate their voices in loud and quiet environments.
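
To make the idea of modelling coordination with differential equations concrete, the following is a minimal sketch using two coupled phase oscillators. The Kuramoto-style sine coupling, the function name phase_difference, and all parameter values are assumptions chosen for illustration; they are not taken from the studies cited above. Each oscillator has its own preferred rhythm, and the coupling term pulls each one toward the other.

```python
import math


def phase_difference(omega_1, omega_2, coupling,
                     theta_1=0.0, theta_2=2.0, dt=0.01, steps=5000):
    """Integrate two coupled phase oscillators with forward Euler steps.

    d(theta_1)/dt = omega_1 + coupling * sin(theta_2 - theta_1)
    d(theta_2)/dt = omega_2 + coupling * sin(theta_1 - theta_2)

    Returns the final unwrapped phase difference theta_1 - theta_2.
    """
    for _ in range(steps):
        d1 = omega_1 + coupling * math.sin(theta_2 - theta_1)
        d2 = omega_2 + coupling * math.sin(theta_1 - theta_2)
        theta_1 += dt * d1
        theta_2 += dt * d2
    return theta_1 - theta_2


if __name__ == "__main__":
    # Same two rhythms, with and without coupling between the agents.
    print("coupled:  ", phase_difference(1.0, 1.2, coupling=1.0))
    print("uncoupled:", phase_difference(1.0, 1.2, coupling=0.0))
```

Subtracting the two equations gives a single equation for the phase difference, d(phi)/dt = (omega_1 - omega_2) - 2 * coupling * sin(phi). With coupling the difference settles at a constant offset where sin(phi) = (omega_1 - omega_2) / (2 * coupling), so the two rhythms lock together; without coupling the difference drifts without bound. The stable, shared rhythm belongs to the pair and not to either oscillator alone.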

One of the major upshots of dynamical systems is that a system can be defined in almost any way the researcher needs. Dynamical systems research focuses on identifying how behaviour patterns can be reliably modelled by differential equations, and how patterns emerge from chaotic behaviour. The Lorenz system, for instance, is a set of three coupled differential equations that produces chaotic behaviour; it nonetheless originated as a simplified model of atmospheric convection in weather research. Dynamical systems, in its basic form, studies how patterns emerge in the coordination of behaviour.
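
The Lorenz example can be made concrete with a short numerical sketch. It assumes the standard parameter values (sigma = 10, rho = 28, beta = 8/3) and a fixed-step fourth-order Runge-Kutta integrator; the function names and the step size are choices made for this illustration only.

```python
import math


def lorenz(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the three coupled Lorenz equations."""
    x, y, z = state
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)


def rk4_step(f, state, dt):
    """Advance the state by one fourth-order Runge-Kutta step."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + dt * k for s, k in zip(state, k3)))
    return tuple(
        s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def trajectory(start, steps=4000, dt=0.01):
    """Return the list of states visited from `start`."""
    states = [start]
    for _ in range(steps):
        states.append(rk4_step(lorenz, states[-1], dt))
    return states


def separations(traj_a, traj_b):
    """Distance between two trajectories at each time step."""
    return [math.dist(p, q) for p, q in zip(traj_a, traj_b)]


if __name__ == "__main__":
    a = trajectory((1.0, 1.0, 1.0))
    b = trajectory((1.0, 1.0, 1.0 + 1e-8))  # almost identical start
    sep = separations(a, b)
    print("initial separation:", sep[0])
    print("largest separation:", max(sep))
```

Two runs that start almost identically end up far apart, which is the chaotic part; yet both stay within the same bounded region of state space, which is the pattern that makes the behaviour modellable.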

4.4 Emergent culture

When we take these approaches from cognitive science and apply them to culture, we arrive at a quite different understanding of what culture is. While each is different, these approaches share a similar understanding of the relationship between agents and their environment. They reject the paradigm that an agent’s behaviours all result from internal considerations of the external world. Instead, they hold that agents interact with their environment by coupling, or creating a unique relationship with the specific environment. This coupling not only relates the agents and their environment; it also gives rise to a context that constrains how the agent can interact with the presented environment.

When we apply this to cultural robotics, it emphasises that cultural learning and cultural participation are the key to cultural robotics. Since this is what matters, we must focus on the individual social interactions between robots and other agents. These approaches refocus what is important about culture for robotics researchers and allow us to do two things. One is to show that the learning and interactions of individual agents are what cause cultural phenomena, rather than an agent being dropped into a cultural context. The second is to provide an approach for studying both the agents themselves and what emerges as a by-product of their interaction. As a result, we should not start with a top-down approach where rules are programmed into robots, but rather we should focus on how a history of interactions provides a context for future interaction. This is the key insight from these cognitive science approaches.

One objection that could be raised here is that we are just advocating social robotics. In reality, this is not an objection, but an important insight. When researchers think about and posit culture as a macro-structure, it limits the types of cultural behaviour that a robot can engage in. Focusing on culture emerging through social interactions, we claim, is the best way to advance research in cultural robotics. This is why we claim that what matters for ‘culture’ in cultural robotics is not what most think it is.

5 Necessary robot capabilities

In the previous section, we argued that culture is not a set of rules and norms; rather, culture is an emergent phenomenon. Given that the key to cultural behaviour is to dynamically participate in social interactions that lead to the emergence of culture, roboticists must focus on the capacities and abilities that a robot must possess to participate in such interactions. This section explains which capacities, abilities and behaviours AI practitioners and researchers in general should focus their research on to enable a robot to participate correctly in the right kinds of interactions such that cultural behaviour emerges.

Consider a scenario where an assistive humanoid robot is placed in the library in Umea, a city in the north of Sweden. The main task of this robot is to check books in and out for those library users who cannot or prefer not to use the automatic machine for the same purpose. Currently, this backup role is filled by humans in most libraries across the world. For this assistive robot, a set of behaviours that could be culturally sensitive includes the physical distance that the robot should maintain from a human interlocutor, the handing over action (with one or two hands or with/without bowing, for example), as well as greeting gestures. If we were to use the current approaches to culturally aware robots, a set of norms or customs would be implemented. For instance, the robot would stay as far away as possible while handing over a book to comply with the stereotype of Swedish people not being social. In Sect. 3, we described extensively why programming such behaviours can be discriminatory and reductionist, and why we should instead let culture emerge from the context within which a robot operates. The culturally sensitive behaviours mentioned above are complex, and often originate from a combination of core abilities such as perception, action, learning, adaptation, anticipation, internal stimulation, attention, action selection and reasoning, to name only a few. A more exhaustive list of core abilities, and a description of what constitutes them, is available in Kotseruba et al. (2020). Assuming a robot is equipped with these core capabilities (or a subset of them) to participate in a "culturally-enhanced" book handover interaction, the key question that then arises is what is supposed to emerge from this interaction that can be called a culture.

In the example above, all of the culturally sensitive behaviours mentioned require some sort of coordination among agents. For instance, when the robot passes a book to the human, there is a clear joint give-and-take action whose underlying dynamics are coupled and coordinated. This coordination emerges from more low-level learning and control abilities. In-group tuning arises from repeated coordination attempts (both successful and unsuccessful). Culture in turn emerges from this in-group tuning. In the context of robotics, so-called in-group tuning requires a robot to continuously adapt to the affordance dynamics and trajectory dynamics offered by interaction with other agents and the environment. Given this, endowing a robot with culture requires less emphasis on cultural norms or rules. Instead, we should aim to identify and model the relevant affordance dynamics that underlie the selection of the different action modes required by the task, as well as the trajectory dynamics of each agent involved. For example, handing over two books requires two arms due to the weight of the books (affordance dynamics), and some sort of motion planning for the arm (trajectory dynamics). Note that, in this example, the combined space from which in-group tuning arises can be huge, as it results from the Cartesian product of all possible affordance dynamics of the actions and environment with the trajectory dynamics. This kind of identification and modelling are essential capabilities that enable the robot to tune its behaviour to an in-group via aggregate social interactions and dynamic learning from repeated coordination attempts. In this way, the robot is equipped to learn and co-create cultural behaviour. On top of the ability to identify and model affordance dynamics and trajectory dynamics, the robot must be able to learn online from each interaction. It is online learning that enables the robot to tune its own behaviour to the affordance dynamics and trajectory dynamics over time.
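
A minimal sketch can make both points concrete: how quickly the combined space grows, and what learning online from each interaction might look like. Everything in it is a hypothetical stand-in: the grip, distance and speed labels, the class name OnlineHandoverLearner, and the learning rule (an epsilon-greedy choice with an incremental value update and optimistic initial values, a standard bandit-style technique substituted here for illustration). It is not the technical approach of this paper, which leaves the choice of learning method open.

```python
import itertools
import random

# Hypothetical, heavily simplified stand-ins for the two kinds of dynamics.
GRIPS = ("one_hand", "two_hands")      # action modes (affordance side)
DISTANCES = ("near", "mid", "far")     # trajectory parameters
SPEEDS = ("slow", "fast")

# The combined space is the Cartesian product of all options.
ACTIONS = list(itertools.product(GRIPS, DISTANCES, SPEEDS))


class OnlineHandoverLearner:
    """Keeps one running value estimate per handover variant."""

    def __init__(self, actions, step=0.3, explore=0.1, seed=0):
        # Optimistic start: untried variants look better than failed ones,
        # so the learner works through alternatives after a failure.
        self.values = {action: 0.5 for action in actions}
        self.step = step
        self.explore = explore
        self.rng = random.Random(seed)

    def choose(self):
        """Pick a variant: mostly the best so far, sometimes a random one."""
        if self.rng.random() < self.explore:
            return self.rng.choice(list(self.values))
        return max(self.values, key=self.values.get)

    def update(self, action, success):
        """Update the estimate right after a single interaction."""
        target = 1.0 if success else 0.0
        self.values[action] += self.step * (target - self.values[action])


if __name__ == "__main__":
    # Simulated partner who only coordinates well with one variant.
    preferred = ("two_hands", "mid", "slow")
    learner = OnlineHandoverLearner(ACTIONS)
    for _ in range(100):
        action = learner.choose()
        learner.update(action, action == preferred)
    print(len(ACTIONS), "variants; current best:", learner.choose())
```

Even with three coarse factors the product already has 12 variants, and realistic affordance and trajectory descriptions would be continuous and far larger. Because the estimate is updated after every single interaction, nothing about the partner is encoded in advance: a different partner, or the same partner changing preference, would simply pull the estimates elsewhere.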

The potential benefit of this approach is that we do not make any assumptions about the interacting agents. As a case in point, we do not assume that all the library users are Swedish. This is reasonable, as Swedish libraries, like libraries in many other countries, hold books in different languages; potential visitors therefore include immigrants, refugees, students and anyone else who resides in Umea. Moreover, we might not intuitively consider handing over a book to be a cultural behaviour. However, there is significant variation in how this task is performed in different contexts; hence, handing over a book properly is a product of social learning and social interaction. So, what our framework suggests is that the emergent model of the coupled behaviour of handing over and receiving a book establishes, for instance, the culture of borrowing books in that library.

Consider another example. There are 10 robots that are culturally sensitive in the way we have proposed. These 10 are then deployed in 10 different countries. Each robot would learn and develop different behaviours. Now, let us say those 10 were deployed in the same country, but in different places within that country. Again, they should all develop different behaviours. Even if all 10 robots were deployed in the same town, but in different areas of that town, they would likely develop different behaviours. Since these robots would be dynamically engaging with their environment, no two robots will have the same experience (or inputs), just as no two humans have the same inputs. Two robots could, in principle, develop the same behaviour (e.g. handshaking); however, this only happens when each robot learns it via interactions with its own environment.

In general, cultural interaction arises from low-level learning, control and coordination, as a robot adapts its low-level perception and control to in-group dynamics. In the library example, we focused on affordance and trajectory dynamics. Although these dynamics are important, the ability to continuously tune to the interaction dynamics is the determining factor for a cultural robot.

6 Summary and future work

In this paper, we have put forward three contributions: we have criticised the current treatment of culture in robotics, we have advanced a conception of culture based on research in cognitive science, and we have explored which robot capacities researchers should focus on to realise this revised conception of culture in robotics. We have done this by showing that the most common definition of culture currently used in robotics is culture as nationality. Then we discussed how this conception is divorced from the psychological and social reality of how cultural behaviour emerges. In addition, failing to recognise this impedes the development of culturally competent robots and contributes to ethical issues, mainly the further exclusion of already marginalised groups. Our positive proposal is that cultural behaviour should be seen as an emergent phenomenon that arises from an agent’s body, its interaction with the environment, and its interaction with other agents. This view is supported by research in embodied cognition, ecological psychology, and dynamical systems theory. The upshot of this for robotics is that the development of culturally competent robots does not depend primarily on programming robots with explicit cultural knowledge, but on equipping robots with the fundamental capacities needed for adapting and tuning behaviour to cultural in-groups over time.

A key difficulty in developing culturally competent robots is that introducing a robot into social interactions with humans can change the nature of the cultural interactions themselves. For example, a certain cultural group may use a specific greeting when meeting other humans, but when meeting robots a different greeting pattern could emerge. This greeting pattern would be influenced by the human–human greeting, but would nonetheless be different. Given this, a robot must be able to adapt to the ways its own presence alters the cultural landscape. By viewing culture as emerging from interactions, the position we advance in this paper accounts for this. This is an important benefit of our view that we intend to explore further in future work.

In this paper, we explore which robot capacities are key to enabling a robot to participate in and adapt to cultural behaviour, but we have not commented on which technical approaches are appropriate given our view. For example, we emphasise the importance of online learning. There are many machine learning techniques applied in robotics, but only some of them will be effective for the kind of online learning needed for adapting to cultural behaviour. Our future work will focus on appraising technical approaches for realising our vision.