1 What is the ‘Self’?

1.1 Questioning the Question

In modern societies people tend to consider the idea of the ‘self’ as self-evident. Certain civilizations have even suggested that one’s ultimate goal in life is to ‘know thyself’ (e.g., the ancient Greeks referred to it in multiple instances as ‘γνῶθι σεαυτόν’). However, the question of the ‘self’ did not exist from the beginning of the history of culture and human thought; it arose at a certain level of historical development as a result of deep societal transformations. At different stages of historical development, this question has been addressed in different ways. For instance, Plato (429–347 BC), and before him Homer (ca. eighth century BC), imagined the ‘self’ as an immaterial spiritual substance (i.e., the psyche or the soul). More specifically, Plato contrasted the eternal form with the ephemeral body, which he thought of as an imperfect copy of the former (Kraut 2017). In fact, we later meet dualistic views on the ‘self’ in various religious traditions, as well as in notable thinkers, such as Plotinus (ca. 204–270) and Descartes (1596–1650). Descartes, who famously declared “I think, therefore I am” (or “I doubt, therefore I think, therefore I am”, as paraphrased by Antoine Léonard Thomas), considered mind and body as two distinct entities, which could nevertheless influence each other.

Nowadays, mainstream science has moved away from an idealistic and dualistic view of the ‘self’. Already, Aristotle had argued that the soul could not be separated from the body (cf. Sihvola 2008). Yet religion, by offering the concept of immortality as a solution to the problem of death, has played a pivotal role in hindering this transition (Barresi and Martin 2011), ignoring alternative solutions such as the one put forward by Epicurus (341–270 BC), who proclaimed that the problem is not death itself, but the fear of death. In fact, Epicurus and others such as Democritus adopted a monistic perspective, which can be traced into modern times with thinkers such as Pierre Gassendi (1592–1655), Baruch Spinoza (1632–1677) and Ludwig Feuerbach (1804–1872). Despite other fundamental differences, diverse philosophical traditions have shared the idea of a lawful understanding of the world, at times emphasizing a mechanistic explanation, which largely characterizes the scientific paradigm to this day. Indeed, one can draw parallels in today’s neuroscience, which is largely grounded in frameworks that focus on describing the underlying mechanisms of a phenomenon, e.g., distinct neurobiological mechanisms underlying consciousness.

Various roles have been considered for the ‘self’ and consciousness in (more) modern science and philosophy as well. For instance, John Locke (1632–1704) focused on the relations between basic physical/mental elements, emphasizing sameness: “[…] in this alone consists personal identity, i.e. the sameness of a rational being: And as far as this consciousness can be extended backwards to any past action or thought, so far reaches the identity of that person; it is the same self now it was then; and it is by the same self with this present one that now reflects on it, that that action was done.” (Locke 1694; cf. Barresi and Martin 2011). On the other hand, David Hume (1711–1776) claimed that the ‘self’ is an illusion, as there “are the successive perceptions only, that constitute the mind; nor have we the most distant notion of the place, where these scenes are represented, or of the materials, of which it is compos’d” (Hume 1739; cf. Barresi and Martin 2011). After all, Friedrich Nietzsche (1844–1900) not only famously argued that God is dead, but also noted that the ‘self’ is dead as well (cf. Barresi and Martin 2011). So, what is the ‘self’? Barresi and Martin (2011) answer that the concept of ‘self’ in today’s literature appears divided in a number of different roles, such as ‘self-image’, ‘self-conception’, ‘self-discovery’, ‘self-confidence’ etc.

In this article, we will approach the multi-fragmented paradox of the self through an integrative lens, adopting a dialectical and historical perspective. In line with dialectical cultural-historical theories (cf. Vygotsky 1930–1935/1978), we will try to motivate a shift of focus from being to becoming, along multiple temporal scales. In doing so, we will move beyond the individual in the question of the (a-)typical ‘self’, in both conceptual and empirical regards. More concretely, we will argue that the ‘self’ lies beyond the static individual, namely in the unfolding of social relations, as a dialectic of internalization/externalization, over multiple temporal scales. Along these lines, autism and other psychiatric conditions have recently been revisited as processes of cumulative misattunement between persons, rather than mere brain disorders (Bolis et al. 2016, 2017). Subsequently, we will delineate an empirical research framework for scientifically addressing relevant questions, i.e. two-person psychophysiology and multi-level analyses of intersubjectivity. Finally, putting this approach into a broader context, we will discuss why challenging the concept of the self matters by describing the practical implications of our approach across various fields of research and practice. Here, we will consider aspects ranging from ethics and pedagogy to psychiatry, neuroscience and artificial intelligence.

1.2 A Dialectical Perspective

To begin, we will make a case for the use of dialectics as a powerful tool for science. To this end, we will first provide a brief introduction to the method and present concrete dialectical insights for the discussion of the self. Dialectics can be thought of as an evolving school of thought, met in various historical and cultural contexts (e.g., Greek, Chinese, Indian, German dialectic; Wong 2006; Dafermos 2015). It asserts that phenomena cannot be meaningfully understood by reducing them into single levels of description or by assuming a metaphysical independence between levels of description. It rather states that phenomena should be studied as processes in their wholeness, inner contradiction and movement. In this light, the self cannot be understood in isolation from the body, social interaction and society (Bolis et al. 2017). More concretely, primarily leaning on views of Vygotsky and colleagues on the dialectical nature of human thought and development, we will try to overcome traditional dichotomies, such as object/subject and organism/environment, by viewing them as both a result and a cause of reciprocal adjustments, or individual/society by considering the whole and the part as, albeit partially autonomous, highly interdependent levels of organization. Along these lines, the self is not to be taken as a static entity bounded by the individual, but rather as the interplay of dynamically and reciprocally interacting factors. More specifically, we will consider it as a process of circular causality among different levels of organization (Fig. 1; e.g., physical, biological, psychophysiological and sociocultural) unfolding over different time frames (e.g., evolutionary, cultural, developmental, psychophysiological and microbiological scales; Vygotsky 1930–1935/1978; Bolis et al. 2017).

Fig. 1 Schematic depiction of dynamic interrelationships in the evolution of matter organization across several time scales

In a nutshell, dialectical thought emphasizes change over sameness and becoming over being, by viewing reality as dynamic processes rather than static entities (see also process philosophy; Seibt 2017). As Nietzsche (1844–1900) noted, citing Heraclitus (ca. 535–475 BC): he [Heraclitus] altogether denied being. [...] Louder than Anaximander, Heraclitus proclaimed: “I see nothing other than becoming. Be not deceived. It is the fault of your short-sightedness, not of the essence of things, if you believe you see land somewhere in the ocean of becoming and passing-away. You use names for things as though they rigidly, persistently endured; yet even the stream into which you step a second time is not the one you stepped into before” (Nietzsche 1872/1999, pp. 51–52). Along these lines, Nietzsche strictly denies a dichotomy of object/subject: “[the subject is but a] term for our belief in a unity underlying all the different impulses of the highest feeling of reality”. There is no such unity, only “the fiction that many similar states in us are the effect of one substratum: but it is we who first created the “similarity” of these states; our adjusting them and making them similar is the fact, not their similarity, which ought rather to be denied.” (Nietzsche 1901/2017; see also Barresi and Martin 2011). Nietzsche then goes on to criticize an “absurd overestimation of consciousness” which had been transformed “into a unity, an entity: ‘spirit’, ‘soul’, something that feels, thinks, wills”, provocatively characterizing this as one of the “tremendous blunders” intellectual culture had created (Nietzsche 1901/2017; see also Barresi and Martin 2011). In other words, Nietzsche here rejects the idea of an ‘artificial’ unity of consciousness (or the self). This brings us to a cardinal concept of dialectics, the ‘unity of opposites’.

Put simply, ‘unity of opposites’ defines a phenomenon by its internal oppositions: “All things come into being by conflict of opposites, and the whole flows like a stream” (Diogenes on Heraclitus, ca. third century AD; cf. Magnus 1970). Later Hegel (1770–1831) elegantly elaborated: “[…] every actual thing involves a coexistence of opposed elements. Consequently to know, or, in other words, to comprehend an object is equivalent to being conscious of it as a concrete unity of opposed determinations, [whereas] the old metaphysic, as we have already seen, when it studied the objects of which it sought a metaphysical knowledge, went to work by applying categories abstractly and to the exclusion of their opposites” (cited in Blunden 2000). In brief, Hegel claimed that ideas and concepts can only be understood in historical terms, as they become meaningless when abstracted (Grossmann 2018). Importantly, a dialectical account does not merely focus on interpreting a harmonic development of internal contradictions, but also unveils dramatic tensions, conflicts and struggle of opposites. In fact, within dialectical thinking, such inner contradictions are the ones that drive change. Gradual change, in turn, is thought to lead to ‘crises’, which are overcome by qualitative leaps. Taken together, dialectics therefore assumes a constant evolution of phenomena, where change is periodic but does not return to the same point.

With regard to the topic at hand, Hegel suggested that self-consciousness does not emerge through passive and individualistic introspection, but through dynamic and reciprocal relations with others (cf. Barresi and Martin 2011). In fact, Karl Marx (1818–1883), whose work leaned on but also criticized Hegel’s work, proclaimed: “[…] the human essence is no abstraction inherent in each single individual. In its reality it is the ensemble of the social relations” (Marx and Engels 1888). The primacy of the social realm has later been stressed by so-called cultural-historical approaches (cf. Roth 2016). A prime example of this can be found in the work of Lev Vygotsky (1896–1934), who directly applied dialectical thinking to developmental psychology and proclaimed that “through others we become ourselves”. He further suggested that all ‘higher’ mental processes within an individual result from an internalization of prior social interactions between people. But dialectical thinking, as described here, should not be exclusively attributed to Western philosophy. For instance, according to African Ubuntu “a person is a person through other persons” (Birhane 2017). We also meet forms of dialectical thinking in Buddhism and Taoism (cf. Grossmann 2018). Taken together, in the formation of the self, the social can be assumed to dialectically precede the individual.

In this line of thought, we suggest that interpersonal statistical regularities shape multiscale hierarchical models on an individual level and vice versa. For instance, at the level of perceptual awareness and everyday learning (time scale of seconds to hours), others play an important role in shaping subjective feelings and decision-making. Let us imagine an illustrative scenario (Bolis and Schilbach 2017b): a person, in the process of deciding what is the most appropriate clothing for tonight’s walk, checks the current weather from the balcony. She feels a cold breeze, which initially makes her think that a warm coat might be a good idea; yet a glance down the road makes her change her mind, as all the people outside today are lightly dressed. Such decisions, especially when reinforced by persistent cultural norms (time scale of weeks to years), can form even more stable personal habits. For instance, people in ancient Greece were accustomed to exercising without clothes. In modern societies, despite objective conditions that might sometimes call for such a habit (e.g., warm weather), such behavior would be considered uncomfortable by most people. Here we see how a socially constructed statistical regularity is internalized at the level of the individual, to such an extent that its violation directly evokes certain subjective feelings.

Across longer time scales, the cumulative internalization of such interpersonal regularities directly shapes who we are becoming, literally changing our bodies and brains. Both ‘higher level’ mental functions and ‘automatic’ processes can be thought of as emerging due to and through social interactions across the life span. Let us examine a simple example inspired by Vygotsky (1930–1935/1978, p. 103 of the Greek translation): a child in an effort to maintain interoceptive balance unsuccessfully tries to reach for food with the index finger extended. The caregiver, who observes the effort, helps by bringing the food toward the hand of the child. After a number of repetitions, this kind of interpersonal process, and the statistical regularities associated with it, is internalized by the child in such a way that the extension of the index finger eventually represents a call for attention to a pointed object. Intriguingly, it is not only higher symbolic functions that are culturally shaped, but also more ‘fundamental’ ones, such as eating and drinking. For instance, while babies eat and drink when they feel the need for it, adults regularly do so not to cover direct survival needs but rather social ones (e.g., eating as a part of a break from work or drinking alcohol when socializing). Along these lines, even interoceptive control can be thought of as having social origins, being developed in this way from infancy onwards (Ciaunica and Fotopoulou 2017; Fotopoulou and Tsakiris 2017). To probe this further, interaction processes can be thought of as ontologically primary to entities on an ultimately basic level. In this line of thinking, entities actually emerge through interactions (intra-actions for emphasis; Barad 2003) and not vice versa, from within their relationship and not outside.
In this light, we view the multifaceted construction of the self as an active process of culturally mediated internalization of social interactions along multiple time scales.

Here, it is crucial to note that internalization plays an important albeit partial role in the formation of the self. It is the dialectic between internalization and externalization that provides a more complete picture of the co-construction of individual and social reality. Internalization can be thought of as the active reconstruction and synthesis of incoming information and past experiences, while externalization can be thought of as the tool-mediated translation of inner processes into collectively transforming the world, including others. The dialectic between internalization and externalization becomes apparent when examining the simple example of holding and manipulating an object (Leontyev 1975/1983; Stetsenko and Arievitch 2004). In this very moment a person transforms not only the world but also herself, as in her effort to act on the environment, she embodies its structure and dynamics. Tools are not to be equated solely with conventional material objects. The term is used here to also broadly encompass ‘intellectual objects’ in the service of communication, such as language and art (cf. Vygotsky 1930–1935/1978; Dafermos 2002). In other words, humans change themselves through changing the environment in a socioculturally mediated procedure of mutual adjustment. More broadly, evolution (or development) of species, societies, persons and concepts should not be viewed as a one-way adjustment, but rather as a dialectical, namely a dynamic, reciprocal and cumulative process (cf. Levins and Lewontin 1985). We will come back to this crucial insight later, but will now review these cardinal concepts through a Bayesian lens, which will allow us to operationalize our suggestions formally.

1.3 A Bayesian Perspective

The main premises of the “Bayesian brain hypothesis” rest on the idea that the brain represents information accessed via the sensory organs in the form of probability densities, as opposed to single numbers, and that these densities are continuously updated, as if following a specific set of mathematical formulas based on Bayes’ theorem (cf. Bolis et al. 2017). Interestingly, such a perspective brings together under a common umbrella diverse putative cognitive processes of major importance, such as optimal information integration both in time and space, optimal multimodal cue integration, as well as flexible information manipulation without the need to commit to particular decisions at an early stage of processing (Knill and Pouget 2004). In other words, through a Bayesian lens one can view the brain as an organ which calculates and maintains probabilities about events in the world or about the organism itself, via a combination of already gained experience and newly sensed information. Importantly, the more confidence (i.e., precision) is placed on the validity of experience (i.e., prior beliefs), the less beliefs are updated based on new incoming information (i.e., evidence). Notably, a Bayesian ‘belief’ should not be confused with the everyday meaning of the word belief, which might be taken to refer to a conscious representation. On the contrary, a Bayesian belief can be thought of as a dynamic state, either conscious (e.g., determination not to eat meat) or unconscious (e.g., glucose levels).
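The relation between prior precision and belief updating can be illustrated with a minimal sketch. It assumes, purely for illustration, that both the prior belief and the incoming evidence are Gaussian with known precisions (inverse variances); the function name and the numbers are hypothetical and are not taken from the cited work.

```python
def update_gaussian_belief(prior_mean, prior_precision, observation, obs_precision):
    """Combine a Gaussian prior with one Gaussian observation.

    Illustrative assumption: precisions (inverse variances) are known.
    The posterior mean is a precision-weighted average of prior and evidence.
    """
    posterior_precision = prior_precision + obs_precision
    posterior_mean = (
        prior_precision * prior_mean + obs_precision * observation
    ) / posterior_precision
    return posterior_mean, posterior_precision


# Same evidence (an observation of 10.0), two different levels of confidence in the prior.
weak_prior = update_gaussian_belief(0.0, 1.0, 10.0, 1.0)    # prior and evidence equally precise
strong_prior = update_gaussian_belief(0.0, 9.0, 10.0, 1.0)  # prior nine times more precise

print(weak_prior)    # the belief moves halfway toward the observation
print(strong_prior)  # the belief barely moves
```

With equal precisions the belief shifts halfway toward the evidence, whereas a highly precise prior yields only a small shift, which is the qualitative point made in the paragraph above.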

A concrete and prominent implementation of the Bayesian brain hypothesis can be found in predictive processing (i.e., predictive coding and active inference; Friston 2010, 2013; Clark 2013). Within this framework a biological system is essentially viewed as a prediction machine and action generator, which actively tries to align reality with internalized models of reality, as precisely as possible. As noted above, reality embraces both the world and the organism itself. According to such a perspective, the brain’s ultimate goal is the long-term minimization of free energy, by calculating (under certain simplifying assumptions) prediction errors, i.e., the discrepancy between incoming information and generated predictions, based on prior experience. Importantly, this is thought to be accomplished via two main avenues, namely either via updating the (Bayesian) beliefs one holds for aligning them with the environment (i.e., predictive coding; cf. internalization), or through action, which can help to experience the environment in accordance with prior beliefs (i.e., active inference; cf. externalization). Put simply, to survive, an organism obeys the following straightforward rule: adjust yourself to reality or change the reality itself (cf. Friston et al. 2010).
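The two avenues can be contrasted in a deliberately simplified toy sketch. It is not an implementation of free-energy minimization: it assumes a single scalar belief, a single scalar state of the world, and a fixed update rate, and the function names `perceive`, `act` and `run` are hypothetical. It only shows that the same prediction error can be reduced either by changing the belief or by changing the world.

```python
def perceive(belief, world, rate=0.5):
    """Reduce prediction error by moving the belief toward the world."""
    error = world - belief
    return belief + rate * error, world


def act(belief, world, rate=0.5):
    """Reduce prediction error by moving the world toward the belief."""
    error = world - belief
    return belief, world - rate * error


def run(step, belief=0.0, world=8.0, n_steps=3):
    """Apply one strategy repeatedly and record the absolute prediction error."""
    errors = [abs(world - belief)]
    for _ in range(n_steps):
        belief, world = step(belief, world)
        errors.append(abs(world - belief))
    return belief, world, errors


print(run(perceive))  # belief approaches the world; the world is unchanged
print(run(act))       # world approaches the belief; the belief is unchanged
```

In both runs the error shrinks at the same rate; what differs is which side of the organism-environment relation has been reconfigured.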

In this framework, the updating of beliefs is accomplished across various hierarchical levels at the same time. More concretely, two processes run in parallel: prediction errors ascend the hierarchy reconfiguring the organism for optimizing predictions, while in parallel predictions descend the hierarchy explaining away prediction errors. The hierarchical organization of this scheme is of immense importance, as it allows for the consideration of multiple levels of increasing abstraction. For instance, social relations along development are not merely stored and represented as concrete memories, but are perhaps more crucially, internalized at higher levels of the hierarchy as generalized cultural norms. The latter can, in turn, be utilized to guide behavior across a multitude of contexts.
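A minimal sketch of such a hierarchy, under strong simplifying assumptions, is given below. It assumes two scalar levels, identity mappings between levels, equal precisions and a fixed top-level prior; all names (`hierarchical_inference`, `level1`, `level2`, `top_prior`) are illustrative and do not correspond to a specific published model. Each level sends a prediction downward and is adjusted by the error coming from below and the error relative to the level above.

```python
def hierarchical_inference(sensation, top_prior, rate=0.1, n_steps=500):
    """Two-level scalar predictive coding by gradient descent on squared errors."""
    level1, level2 = 0.0, 0.0
    for _ in range(n_steps):
        # Descending predictions: level1 predicts the sensation,
        # level2 predicts level1, the top prior predicts level2.
        error0 = sensation - level1   # sensory prediction error
        error1 = level1 - level2      # error between level 1 and level 2
        error2 = level2 - top_prior   # error between level 2 and the prior
        # Ascending errors reconfigure each level (simultaneous update).
        level1, level2 = (
            level1 + rate * (error0 - error1),
            level2 + rate * (error1 - error2),
        )
    return level1, level2


level1, level2 = hierarchical_inference(sensation=3.0, top_prior=0.0)
print(round(level1, 3), round(level2, 3))
```

In this toy setting the lower level settles closer to the sensory input and the higher level closer to the prior, which loosely mirrors the idea that higher levels encode more general, slowly changing regularities.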

As noted above, a process of belief updating should always be thought of in relation to action. Importantly, such a dialectic of internalization and externalization can take either ‘adaptive’ or ‘maladaptive’ forms along various time scales, leading to a cascade of interpersonal (mis-)attunement (Bolis et al. 2017). To give a simple example, abusive interactions with caregivers in early life could influence the way an individual forms relations later, which may help to explain personal tendencies and so-called personalities, but also symptoms across different psychiatric and psychological conditions. In other words, growing up in an interpersonally adverse environment may lead to expectations about how social interactions unfold, which will modulate how future interactions actually play out. From our standpoint, such an example illustrates how the Bayesian perspective may be able to capture and express the inextricable linkage of social and individual reality. Seen through a Bayesian and dialectical lens at the same time, we can, therefore, view the ‘self’ as a non-linear dynamic process, rather than as a static and unified entity.

Notably, predictive coding and active inference can be thought of as a dialectical framework in and of itself. Perception and action become two dialectical facets of the same process, i.e. the minimization of prediction error. Current internal (e.g., perceptual) states inform future actions, while informed interaction with the environment (including others) greatly modulates internal states. Furthermore, the interrelation between the environment and the ‘self’ is controlled by the synthesis of an organism’s current state and incoming information, either through updating current beliefs or the environment itself. In these terms the ‘self’ can be considered as the dialectic between predictive coding (cf. internalization) and active inference (cf. externalization) processes (Bolis et al. 2017). Taken together, multilevel computational frameworks grounded in predictive processing (cf. Bolis and Schilbach 2017b; Ramstead et al. 2017) can, therefore, serve as a formal bridge between philosophical arguments and neuropsychological evidence for revisiting the ‘self’ as a historical product of dialectical attunement.

2 The Dynamic Self in Action

2.1 The Dialectic of Internalization/Externalization: Insights from Evolutionary & Developmental Psychology, Neuroscience & Psychiatry

In the following section, we selectively review results and insights from different disciplines in order to add empirical findings to the argument that the self can be regarded as a (historical or developmental) dialectic of internalization/externalization over multiple scales.

Across an evolutionary scale, the change to upright position comprises perhaps one of the most important qualitative leaps. In fact, bipedal walking has been crucial to the evolution of the self for various reasons. Perhaps, most importantly, walking on two feet allowed the development and use of sophisticated tools. The latter revolutionized the way humans adapt to the environment, allowing them to actively and dialectically transform the world they inhabit according to their needs. That is, it is not only humans who change the environment, but the environment in turn changes them in face of their impact on it (cf. Levins and Lewontin 1985). In brief, contrary to a perhaps common belief, humans (and other organisms) do not evolve via passive adaptation, but they fundamentally change themselves via socioculturally mediated transformations of the environment. However, having said that, this development has not come without compromises.

It has been hypothesised that bipedal walking has imposed certain constraints on the birth canal, which does not allow the birth of a fetus much older (and thus bigger) than 9 months. Additionally, according to the ‘metabolic crossover hypothesis’ (Ellison 2001; Dunsworth et al. 2012) the mother may not be able to support an older and more energetically demanding fetus. Consequently, while apes and other animals quickly master basic skills that grant them relatively early independence after birth, human infants are born unable to survive on their own. Indeed, the brain size of newborn infants is only a quarter of its fully developed size. This means that major development occurs after birth in direct interaction with the environment and others: “Maybe human newborns are adapted to soaking up all this cultural stuff and maybe being born earlier lets you do this […] Maybe being born earlier is better if you’re a cultural animal” (Karen Rosenberg on Adolf Portmann; cited in Wong 2012). Such a compromise between early independence and optimal development might actually, in and of itself, define the timing of birth.

Another major evolutionary leap with regards to human cognition is the change from individual to shared intentionality (Vygotsky 1930–1935/1978; Tomasello 1999, 2014; Tomasello et al. 2005; Tomasello and Carpenter 2007), which can be broken down to more intermediate leaps (e.g. from individual to joint and from joint to collective intentionality; Tomasello 2014). The question here is: How did we go from relatively competitive great ape societies to (possibly) cooperative human cultures? It might have been a huge leap if there had not been an intermediate link between our common ancestor and humans. The needs for cooperation (e.g. for foraging) in the early human societies may have led to the transformation of individual to joint intentionality, involving two (or a small number of) individuals (Tomasello 2014). According to this hypothesis, this development has allowed for the coordination of roles and perspectives toward joint objectives, resulting in new forms of perspectival and symbolic representations, socially recursive inference and self-monitoring (regulating one’s own actions from the perspective of a cooperative partner). The practical need for coordination might have actually prompted the development of bodily structures, which subsequently supported more abstracted cognitive functions beyond the ‘here and now’. One tempting line of thought here would be to consider human body (e.g., eye and face) and brain evolution as reciprocally driven in the context of collaborative social interaction (cf. Kobayashi and Kohshima 2001; Powell et al. 2010; Dobson 2012). From a Bayesian perspective, ascending the hierarchy of a neural network, information gets more and more abstracted (e.g., from dealing with the probability of an event, to dealing with volatility, volatility of volatility and so forth; cf. Mathys et al. 2011). 
Taken together, we hypothesize that this kind of evolution, which has allowed for abstracting beyond the concreteness of real-time social interactions, might have proceeded in the direction of extended bodily hierarchies.

Similarly to development at the scale of phylogeny, development at the scale of ontogeny can also be thought of as unfolding in socioculturally mediated interaction with the environment and others, undergoing a series of qualitative leaps along the lifespan (e.g., from individual to collective intentionality; Tomasello and Rakoczy 2003). More concretely, the acquisition of language, which can be considered as a particularly transformative leap for social cognition and interaction, is thought to emerge out of various pre-speech communicative acts (cf. Bruner 1974). An initial basic form of dyadic interaction (between the infant and the caregiver) could serve as the substratum for the development of joint attention, as well as more complex forms of interaction. For instance, dyadic (face-to-face) and triadic (including an object) interactions have been found to be developmentally linked (Striano and Rochat 1999). Furthermore, joint attention, which is observed before fully developed social-cognitive awareness (Brooks and Meltzoff 2005), can predict future linguistic ability (Morales et al. 2000; Mundy et al. 2007). Additionally, maternal sensitivity (Hobson et al. 2004) and synchronicity (Carpenter et al. 1998) have been found to correlate with infants’ propensity to engage in social interactions and language development respectively.

Also in so-called psychiatric disorders, here thought of as disorders of social interaction or cases of so-called atypical social interaction, we find an interrelation between the manifestation of the organic condition and interpersonal difficulties (Vygotsky 1930–1935/1978; Schilbach 2016; Bolis et al. 2017). When it comes to autism, synchronicity in earlier play interactions between the child and the caregiver was found to correlate with the development of subsequent communicative forms, such as joint attention and language (Siller and Sigman 2002). In fact, it has been suggested that autism can be viewed not as a mere brain disorder, but rather as an evolving interpersonal misattunement encompassing various levels of description (Bolis et al. 2017). An attunement between the child and the caregiver along development is crucial in language acquisition. Yet, even when an autistic individual becomes able to talk, in most cases they achieve a propositional attunement (knowing that), as opposed to a pragmatic attunement (knowing how), a fact which largely prevents an intuitive participation in interactions with others. This alone, we suggest, might have direct implications for the formation of the self in autism due to the crucial dialectical nature of language.

Our discussion of tool-mediated evolution also holds for individual development: language can be viewed as a communicative tool used for transforming the (social) world, but also the self itself (Vygotsky 1934/1962). This dialectical nature of language becomes evident when examining its dual role, in speech (interpersonal) and thought (intrapersonal), which should be thought of in unity, rather than in external (even tight) relation (Vygotsky 1934/1962). In other words, contrary to a common assumption that speech is merely an enacted thought, speech and thought unfold together, inextricably entangled. Indeed, recent evidence demonstrates neural coupling during production and comprehension of real-life speech (Silbert et al. 2014). Importantly, the interpersonal aspect of language should still be thought of as temporally and conceptually preceding the intrapersonal one. That is, in contrast to a Piagetian perspective, we adhere to the Vygotskian idea that it is social interaction that drives development and not vice versa.

In sum, basic forms of interpersonal sensorimotor contingencies gradually evolve into more complex forms of interactions, such as joint attention and multi-person interactions. These initial social interactions might be exactly what (reciprocally) drives the development of social cognition for dealing with what lies beyond the ‘here and now’ (cf. Theory of Mind; Baron-Cohen 1991; Tomasello 1995). At the neural level, it has been suggested that joint attention might be the outcome of two interacting systems, namely the posterior and the anterior attention system (Mundy and Newell 2007). The posterior system, which is relatively involuntary and common to many primates, begins to develop during the first months of life and can be, simply speaking, thought of as serving an understanding of “where others’ eyes go, their behaviour follows” (Jellema et al. 2000; Mundy and Newell 2007). The anterior system, which is considered volitional and goal-directed, develops later and can be, along similar lines, thought of as serving an understanding of “where my eyes go, my behaviour follows” (Mundy and Newell 2007). We take this as suggestive of a claim that the ‘self’ develops tightly connected to the understanding of the ‘other’ and that in fact the latter precedes.

It might actually be the case that it is exactly in our effort to understand others that we develop an understanding of ourselves. Here, three tangled modelling loops are considered: (i) the inner loop, dealing with the prediction of internal bodily processes (cf. interoception), (ii) the perception–action loop, which involves the anticipation of the consequences of one’s actions on the world and (iii) the self-other loop, which deals with modelling other minds (Timmermans et al. 2012). It is exactly this last loop that, through social interactions, might ontogenetically forge sophisticated bodily structures that are later deployed for reflective social cognition (e.g., Theory of Mind; Schilbach et al. 2010, 2013; Frith and Frith 2012), via neural reuse (Anderson 2010). There is empirical evidence suggesting that unconstrained cognition, emotional processing and social cognition might all share common neural networks in the dorso-medial prefrontal cortex and in the precuneus (Schilbach et al. 2012). Interestingly, the latter brain networks partially comprise the Default Mode Network, which is putatively activated more when a person does not directly focus on the outer world. Such a neural overlap between ‘social cognition’ and ‘introspection’ can be taken to suggest that not only thinking about others (either implicitly or explicitly), but even thinking about ourselves is driven by social interactions.

Taken together, we construe the self as a historical process of dialectical attunement unfolding over various time scales (Fig. 2). More concretely, we view two cardinal groups of processes as dialectically interconnected, namely internalization and externalization. These processes are thought of as unfolding along different time scales, e.g., (i) in the time frame of evolution, involving genetic and environmental adaptations, (ii) across generations, as cultural practices, or (iii) during individual development, including bodily and world reconfigurations, such as perception, action and learning. Put simply, we consider both low- and high-level attunement. Low-level attunement emerges during collective behaviour, when people are coupled together or when they coordinate (cf. De Jaegher and Di Paolo 2007). However, while people interact, and thus act upon and perceive each other, they mutually co-construct internal models across multiple levels of bodily hierarchies. As we saw before, the construction of such hierarchies allows for the consideration of increasingly higher levels of abstraction and thus temporal scales. That means that people in social interactions co-construct each other not only in the ‘here and now’, but also beyond, via co-configuring higher-level abstracted beliefs and patterns of action, which remain on hand in future instances across a variety of interactive contexts or privately (cf. Theory of Mind; Fig. 2). Simply speaking, poetry (from the Greek “poiesis”, literally meaning “making”) can be thought of as an active externalization of internalized social interactions.

Fig. 2

Dialectical attunement. Environmental structure (cf. social relations) is transformed within an individual via internalization processes (cf. predictive coding; rightward arrow). Internalized structures serve for co-regulating the external (social) world via externalization processes (cf. collective activity; leftward arrow). Internalization and externalization processes are thought of as unfolding dialectically, that is, in a dynamic, reciprocal and cumulative interrelation. Please note that the schematic focuses on the brain only for convenience; in reality the body participates in the dialectic of internalization/externalization as a whole.

Internalization is the set of processes via which the structure of the environment (e.g., social relations) is actively transformed and implemented within an individual. From a Bayesian perspective, internalization entails the creation and maintenance of dynamic hierarchical models of the world in an effort to effectively predict future changes and act accordingly. We consider internalization as being accomplished across various time scales, from genetic information encoding and cultural adaptation, to bodily reconfiguration across development and real-time perception. For instance, on the evolutionary scale, the human visual system is attuned to the peak of the solar radiation spectrum that reaches the surface of the earth. In other words, the human species has bodily internalized the environment in terms of electromagnetic conditions. Interestingly, similar attunement to environmental conditions is also observed along developmental scales. For instance, experiments have demonstrated that extreme exposure to a restricted range of visual stimuli (e.g., exclusively vertical visual orientation), early in development, modifies the morphology of neurons in the visual cortex accordingly (e.g., Tieman and Hirsch 1982). Furthermore, with regard to shorter time scales, perception and action can be seen as real-time bodily attunement to the environment. Finally, people are undeniably also culturally attuned in multiple respects. For instance, what is considered beautiful or delicious seems to differ across sociocultural contexts, both across time and space.
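To make the Bayesian reading of internalization more tangible, the following is a minimal sketch of how a belief about the environment could be updated from observations. It is an illustration under strong simplifying assumptions, not the hierarchical models referred to in the text: it uses a single level, a Gaussian belief, and a known observation noise. All names (`GaussianBelief`, `update`, `internalize`) are hypothetical and introduced here only for illustration.

```python
from dataclasses import dataclass


@dataclass
class GaussianBelief:
    """A one-dimensional Gaussian belief about an environmental quantity."""
    mean: float
    variance: float


def update(belief: GaussianBelief, observation: float,
           obs_variance: float) -> GaussianBelief:
    """Conjugate Gaussian update written as a precision-weighted prediction error."""
    prediction_error = observation - belief.mean
    # The gain is large when the belief is uncertain relative to the observation.
    gain = belief.variance / (belief.variance + obs_variance)
    return GaussianBelief(
        mean=belief.mean + gain * prediction_error,
        variance=(1.0 - gain) * belief.variance,
    )


def internalize(observations, prior: GaussianBelief,
                obs_variance: float) -> GaussianBelief:
    """Sequentially absorb observations into the belief (toy 'internalization')."""
    belief = prior
    for y in observations:
        belief = update(belief, y, obs_variance)
    return belief


if __name__ == "__main__":
    prior = GaussianBelief(mean=0.0, variance=1.0)
    posterior = internalize([2.0, 2.0], prior, obs_variance=1.0)
    print(posterior)
```

In this toy setting the belief moves toward the observed regularity while its variance shrinks, which is one simple way of picturing attunement to a stable environmental condition.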

In fact, humans were already using cultural models for describing, predicting and manipulating the environment in the cradles of civilization. For instance, ancient societies construed natural phenomena, such as weather or earthquakes, as behavioural expressions of personified deities. At first sight, this might appear to be a rather naive approach. However, we consider it an ingenious tactic that might have allowed pre-scientific communities to recruit powerful cognitive capacities, originally developed for dealing with the undoubtedly complex social realm. Any level of abstraction can be considered as a model of the world. To come back to the example of language, a word can be thought of as a sociocultural model in and of itself, which of course presupposes both the emergence of the necessary biological apparatus across evolution and an interpersonal attunement across development. For instance, the word ‘animal’ or ‘wave’ captures and summarizes higher-level similarities found in a plethora of diverse natural processes. Here, we should stress that we do not consider the construction of internal models as a passive accumulation of representations.

The construction of internal models allows not only for the prediction of the world, but also for its (sociocultural) transformation in order to meet survival needs, through collective externalization. In other words, dialectical attunement does not merely imply a one-sided adjustment of the individual to the environment, but also a transformation thereof across multiple scales: from cooking food, building shelters and developing technology, to transforming social structures and domesticating animal species. The activity of an individual in everyday life is decisively modulated by evolutionary, cultural and developmental factors. For instance, the use of a tool is defined by human anatomy, accumulated collective knowledge and individual learning. As discussed above, though, a change of the environment inherently entails a reconfiguration of the self as well. Externalization impacts on internalized models directly (cf. the interplay between active inference and predictive coding), as well as indirectly via the feedback of a transformed world. For example, learning to use a tool is fundamentally different when it is enacted rather than being merely theoretical, even though in both situations an internal model is developed. Additionally, both mechanical and conceptual tools (see the example of ‘wave’ from above) have helped the construction of modern technology, which in turn continuously modifies humans in multiple respects and on multiple scales (from everyday behaviour to cultural habits and genetics in the long run). Crucially, when it comes to humans, transforming the world is fundamentally social, both with regard to our impact on others and on the environment: the former is inherently social, while the latter becomes such via the mediation of sociocultural tools. In sum, we view the self exactly as the dialectic of the abovementioned internalization and externalization processes.

We will come back to this point and its scientific and societal implications during our concluding remarks (Sect. 2.3), after first describing how our hypotheses could be put to the test scientifically. To this end, we will describe experimental and data analytic means for studying the dialectic of internalization and externalization in real-time social interactions and beyond.

2.2 Two-Person Psychophysiology & Multi-level Accounts of Intersubjectivity

Due to conceptual and methodological constraints, research has largely focused on either intrapersonal (e.g. neurobiological and psychological), or interpersonal (e.g. socio-cultural) processes. Here we emphasize the importance of studying intrapersonal and interpersonal processes in their inherent interrelation, as they unfold during social interactions. In what follows, we describe an experimental framework, namely two-person psychophysiology, and an analysis scheme, namely multi-level analysis of intersubjectivity, that could help us do so.

Two-person psychophysiology appears to be a promising avenue for empirical research, which, while offering great experimental control, also preserves adequate degrees of ecological validity (Bolis and Schilbach 2017a, b). Traditionally, psychophysiology has enabled the empirical investigation of the relation between physiological and psychological processes (e.g., through physiological monitoring and introspection), offering important insights about individual mechanisms. However informative this kind of approach may have been, the concept of the (a-)typical ‘self’ will remain largely misconstrued until dynamic interpersonal processes are systematically considered, as social cognition might be fundamentally different when we are in interaction with others rather than merely observing them (Schilbach et al. 2013). It has been argued that the most important experience of the other comes from face-to-face situations; that this is the archetypic situation of social interaction, while all other situations are products of it (Berger and Luckmann 1967). It is exactly in this kind of situation that the ‘here and now’ of each other’s subjectivity come together and possibly form an inextricable intersubjective unity (Berger and Luckmann 1967; De Jaegher and Di Paolo 2007; Bolis and Schilbach 2017b).

Building upon empirical frameworks of interpersonal research (e.g. Read Montague et al. 2001; Schilbach et al. 2006; Dumas et al. 2010; Barišic et al. 2013; Froese et al. 2015; Koike et al. 2016; Liu et al. 2016), two-person psychophysiology crucially allows for the empirical investigation and systematic manipulation of face-to-face social interaction, across various modalities and temporal scales. In such a framework (Bolis and Schilbach 2017b), participants sit opposite each other, working on tasks either individually or collectively, while being able to interact, either in real-time or offline, through a micro-camera communication system. Such a two-person framework allows for systematic control and monitoring of processes that live at different levels of description, from (epi-)genetics and culture to interpersonal behaviour and psychophysiology. In fact, via controlling the synchronicity of social interaction and the composition of dyads, cardinal aspects of the self can be put to the scientific test: emerging contextual and interpersonal differences in social interactions might prove equally important as, or even more important than, individual traits in defining the becoming of the (a-)typical self (Bolis et al. 2017).

Interpersonal frameworks for empirical research might be an important tool for moving beyond the individual as the unit of analysis, yet they are not sufficient on their own. Conceptual and experimental practices should be developed hand-in-hand with methods of analysis (e.g. Bahrami et al. 2010; Konvalinka and Roepstorff 2012; Schilbach et al. 2013; Abney et al. 2014; Dumas et al. 2014; Froese et al. 2015; Friston and Frith 2015; Zapata-Fonseca et al. 2016; Fusaroli and Tylén 2016; Sevgi et al. 2016; Bolis and Schilbach 2017a). Here, we suggest a shift from an exclusive focus on the (Bayesian) brain in isolation, toward a multilevel understanding of intersubjectivity and psychopathology. In this framework of analysis, principled accounts of brain function (e.g. predictive processing) are employed for describing crucial neurobiological mechanisms, while being connected to real-life phenomena, which by definition live in an interpersonal space. More concretely, grounded in established models (e.g., Daunizeau et al. 2010; Mathys et al. 2011; Bolis et al. 2015), a two-level modelling scheme could be used for capturing both individual processes (Bayesian level) and collective behaviour (meta-Bayesian level). Put simply, in this scheme intrasubjective parameters will be deployed for capturing individual mechanisms (e.g., neuromodulation), while intersubjective ones will describe emergent processes on the collective level (e.g., interpersonal coupling). Collective parameters refer to sociocultural tools, such as artefacts, communication mediating factors, and generally any co-constructed and commonly held convention. For instance, the efficacy of a communication channel might strongly modulate interpersonal coupling in social interaction (Bolis and Schilbach 2017b).
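As a rough illustration of how intrasubjective and intersubjective parameters could enter one and the same description, the toy simulation below lets two agents adjust to each other through a shared channel. It is a deliberately simplified, deterministic sketch and not an implementation of the cited models: each agent has a hypothetical `learning_rate` (standing in for an intrasubjective parameter), and the dyad shares a hypothetical `channel_efficacy` between 0 and 1 (standing in for an intersubjective parameter). Interpersonal coupling is summarized here, by assumption, as the remaining gap between the two agents' states.

```python
from dataclasses import dataclass


@dataclass
class Agent:
    state: float          # e.g. a preferred tempo (hypothetical quantity)
    learning_rate: float  # intrasubjective parameter (assumed)


def interact(a: Agent, b: Agent, channel_efficacy: float, steps: int) -> list:
    """Let two agents adapt to each other; return the gap after every step."""
    gaps = [abs(a.state - b.state)]
    for _ in range(steps):
        # Each agent only perceives a fraction of the mismatch,
        # scaled by the efficacy of the shared communication channel.
        error_a = channel_efficacy * (b.state - a.state)
        error_b = channel_efficacy * (a.state - b.state)
        # Both errors are computed before either state is changed,
        # so the update is simultaneous.
        a.state += a.learning_rate * error_a
        b.state += b.learning_rate * error_b
        gaps.append(abs(a.state - b.state))
    return gaps


def final_gap(channel_efficacy: float, steps: int = 10) -> float:
    """Run a fresh dyad and report the gap that remains at the end."""
    a = Agent(state=0.0, learning_rate=0.3)
    b = Agent(state=1.0, learning_rate=0.3)
    return interact(a, b, channel_efficacy, steps)[-1]


if __name__ == "__main__":
    for efficacy in (0.0, 0.1, 0.9):
        print(efficacy, final_gap(efficacy))
```

With these assumed values, a more effective channel leaves a smaller gap after the same number of steps, while a channel with zero efficacy leaves the two agents exactly where they started, irrespective of their individual learning rates.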

Such an intersubjective scheme could be exploited for considering emergent phenomena on higher levels of description, such as, for instance, questions about the autonomy of a dyad or a group of people. To give a more specific example, in the context of collective externalization a non-linear model might explain observed behaviour optimally, thus providing evidence that the group is different from the sum of individuals. Inversely, this framework could address questions about how collective processes, in turn, shape individual reality. For instance, one could differentially study the potentially distinct impact that a competitive or individualistic versus a collaborative structure might exert upon an individual (Bolis et al. 2017). Collective activity and societal structure are thought of as being capable of shaping individual levels (from neurobiology to phenomenology) via internalizing mechanisms. In other words, it is not only lower-level mechanisms that result in emergent collective ones, but internal processes are treated, here, as dynamically internalized interpersonal processes.

Notably, a meta-Bayesian framework can consider observable activity at any level of description, such as neural activity, motor responses or collective behaviour. With regard to social interactions, an interesting avenue for future research might involve studying whether interpersonal coordination on the behavioural level might actually serve as a prior and modulate, or even relax, the need for inferences about the hidden causes of social behaviour. Furthermore, at a neurobiological level, we hypothesize that the activity of different neuromodulators could be related to a subject’s ability to track different levels of interpersonal regularities. In short, a Bayesian account of intersubjectivity intends to offer a principled and quantitative description of the dialectic between internalizing and externalizing processes across different levels of description, as discussed above.

2.3 The Dialectical Self: Scientific and Societal Relevance

Our approach shares common ground with, and most importantly brings together under a dialectical umbrella, two seemingly disparate perspectives, i.e., interactionist-enactivist (e.g. Maturana and Varela 1980; De Jaegher and Di Paolo 2007) and computational-Bayesian accounts of cognition (e.g. Clark 2013; Friston 2013). Enactivist accounts have constructively put their focus on the fundamental role of interaction and coupling with the environment, including others. Bayesian accounts of cognition have provided important computational tools for describing individual cognition, mainly through hierarchical models. Our dialectical suggestion, on the one hand, emphasizes the primacy of (social) interactions. More concretely, it states that for a comprehensive understanding of the (a-)typical self, we will need to move beyond the individual, to the historical unfolding of (social) interactions over multiple scales. On the other hand, our approach extends Bayesian accounts of cognition by situating them in the context of real-time social interaction and providing a description of internalization and collective externalization processes beyond the individual. More precisely, it connects internalization to predictive coding and collective externalization to active inference. By doing so, it describes perception, learning and collective action as a unified process that allows for aligning personal (psychophysiological) and interpersonal (coupling and synchrony) states with environmental (nature and others) conditions. Taken together, via integrating levels of description and time scales, such an approach provides a unifying and principled way of studying the self beyond the individual.

In this article we have described the self as the dialectic of internalization and externalization and more concretely as a historical product of dialectical attunement over various temporal scales (see Fig. 2). According to this view, low-level attunement is achieved largely automatically (beyond awareness) during embodied interactions, via mechanisms of collective externalization. High-level attunement is achieved through mechanisms of internalization. For instance, low-level attunement captures human action as an emergent collective phenomenon (cf. interpersonal bodily coupling, coordination and synergy) in the ‘here and now’. High-level attunement captures the human mind as an active environmental reflection. In a cultural frame, this takes the form of internalized values and conventions in a society, generalized across multiple temporal and contextual frames. In sum, low- and high-level attunement are dynamically and cumulatively interrelated, via internalization and collective externalization processes, forming the dialectical self.

Yet one might still wonder why we should even question the question of the self. We believe that any thesis on the self is inherently implicated in numerous fields of science and society. A dialectical perspective, such as the one described here, points toward specific directions that acknowledge the primacy of the social, without neglecting the importance of the individual in their interrelation, co-construction and tension. Additionally, it points toward the necessity of adopting an empirical and principled approach to studying the self. To this end, formal approaches of predictive processing and dynamical systems appear most promising. Approaching the formation of the self under the unifying umbrella of the dialectic of internalization/externalization might allow formal integration and re-description of seemingly disparate mechanisms across different scales. Yet the implications of such a dialectical approach reach further than the realm of scientific research.

In pedagogy, this is translated into an educational system that would promote collective problem solving as compared to mainstream competitive individual tests. Put simply, taking such an approach seriously, it would make no sense to isolate inherently limited individual cognitive capacity and reward merely the most relevant to a given task. On the contrary, promoting collective problem solving and decision making via active participation and interaction would enhance both cognitive and motivational aspects, yielding superior pedagogical but also practical achievements. In psychiatry, one would not be merely focused on diagnosing and ‘fixing’ individual impairments, but also tuning interpersonal communication and enhancing social inclusion (Fig. 3; Bolis et al. 2017). Within a clinical context, such an approach would suggest the monitoring of not only individual progress, but also interpersonal coupling between a ‘therapist’ and the ‘individual’, as well as between multiple persons during group therapy. In fact, not every therapist might be optimally suited for every patient and therefore matching of therapist and patient might need to be assessed in order to predict whether therapy will eventually work. Within a societal context, ‘tuning’ will not target only the individual with a psychiatric condition, but also her social environment. For instance, anti-stigma and informational campaigns will target tuning of social expectations of others as well, effectively resulting in a reciprocal amelioration of existing interpersonal misattunement. Such developments might help bring a redefinition of what a psychiatric disorder is, situating it back into the social realm within which it emerges.

Fig. 3

Dialectical (mis-)attunement and interpersonal re-tuning: (top) a homogeneous dyad interacting ‘smoothly’, (middle) a heterogeneous dyad interacting less effectively, (bottom) retuned interaction via not only targeting a person with a condition, but also others, as well as the interaction itself (cf. Bolis et al. 2017)

In the field of ethics and law, seriously assimilating the idea that the self goes beyond the static individual, a juridical system would not only focus on individual intentionality and responsibility, but also take into account collective factors and societal structure. Along similar lines, confronting social problems such as racism will not merely involve educating individuals, but also dealing with social structures, which potentially instigate and maintain such patterns of behaviour. Finally, such a perspective would suggest developing artificial intelligence and robotics, not via static pre-configuration, but via allowing interaction for co-constructing and internalizing knowledge. This should be expected to yield not only more robust artificial systems, but insightful conclusions on cardinal questions about human cognition as well. More concretely, in line with cultural-historical and enactivist perspectives, we suggest that the role of social interaction and active participation in the co-construction of a culturally shaped self should be taken more seriously, in both research and social practice. Paraphrasing Descartes: ‘we interact, therefore I become’, or put simply ‘I interact, therefore I am’.