# Geometric Design Principles for Brains of Embodied Agents

## Abstract

I propose a formal model of the sensorimotor loop and discuss corresponding extrinsic embodiment constraints and the intrinsic degrees of freedom. These degrees constitute the basis for adaptation in terms of learning and should therefore be coupled with the embodiment constraints. Notions of sufficiency and embodied universal approximation allow us to formulate principles for such a coupling. This provides a geometric approach to the design of control architectures for embodied agents.

## Keywords

Sensorimotor loop · Embodiment · Information geometry · Cheap design · Universal approximation

## 1 Introduction

Within the last few decades, it has become clear that an understanding of intelligence and cognition has to take into account the fact that behaving agents are embodied and situated [9, 19]. At first sight this might appear like an obvious and unimportant observation. After all, it is obvious that agents *do* have bodies and that they *are* situated in some environment. So what? But the more we think about it the more we should understand that this observation has actually far-reaching implications. Any behaviour of an agent is mediated through its sensorimotor apparatus and its interactions with the environment. Therefore, the control of intelligent behaviour is inseparably intertwined with the agent’s sensorimotor constraints. Such a coupling should allow us to reveal design principles for brains and their implemented control mechanisms. In this regard, it has been pointed out that quite complex behavioural patterns do not necessarily require complex control, leading to the notions of morphological computation and the principle of cheap design [19, 24, 25].

In this article, I want to formally address the tight connection between the control architecture and the embodiment of an agent in terms of geometry, in particular information geometry [1, 3]. In order to do so, I apply a theory of the sensorimotor loop in terms of a causal model, as developed in [7, 13, 16], and propose an approach for designing controller architectures that utilise the sensorimotor constraints. My aim is not to provide the most general results, but to exemplify the strength of the theory by integrating initial results from an existing body of work [2, 4, 8, 12, 14, 15, 16, 25], and discussing them, for the first time, from a unifying perspective. The main intuition for connecting control architectures to sensorimotor constraints is based on two assumptions. On one hand, the architecture should be sufficient for the expression of desired behaviours, given the embodiment constraints. This clearly requires some amount of richness or complexity of the architecture. On the other hand, it should be concise in some natural sense. I discuss various versions of these two assumptions and derive related results. In particular, I address the notion of embodied universal approximation and contrast it with the standard notion of universal approximation. This is closely related to the recent work [16] which was initiated by the general line of research sketched in the present article. Being based on stronger assumptions, on the other hand, the results of [16] are more refined than the corresponding ones presented in this broad treatment.

Section 2 provides the conceptual and formal definition of the sensorimotor loop of an embodied agent. To the agent, the world appears as a black box which contains the agent’s body and its environment. Sensorimotor mechanisms and their generation of behaviour, which takes place in the world, are formalised. Section 3 discusses and formalises extrinsic and intrinsic sensorimotor constraints. Intrinsic constraints are given in terms of a controller model, which allows the agent to adapt to the extrinsic constraints. The notions of sufficiency and embodied universal approximation are introduced as basis for a geometric design of controller models. Finally, Sect. 4 exemplifies the developed geometric methods in the context of policy models and derives four examples of embodied universal approximators.

## 2 A Formal Model of Embodied Agents

### 2.1 The Basic Components of the Sensorimotor Loop

*black box perspective*. In particular, the boundary between the body and the environment is not directly “visible” to the brain and has to be identified or actively constructed as a result of the interaction with the world.

### 2.2 The Mechanisms of the Sensorimotor Loop

*w*. For instance, if the sensor is noisy, then its response will not be uniquely determined. If the sensor is noiseless, that is deterministic, then there will be only one sensor state as possible response to the world state *w*. In any case, the response of the sensor given the world state *w* can be described as a distribution \(\beta (w)\) on the sensor states *s*. Therefore, the mechanism of the sensor can be formalised as a Markov kernel

*ds*. From this, we can calculate the probability of any (measurable) set \({\fancyscript{S}}' \subseteq \fancyscript{S}\) by integration:

*c* and generates, based on *c*, a distribution \(\pi (c)\) of actuator states. Again, we have a Markov kernel

### 2.3 From Mechanisms to Embodied Behaviours

*w* and an initial controller state *c*, we have the following Markov kernel that describes the transition to the new world and controller states:

*T* times defines a joint process \((w^1, c^1), \dots , (w^T, c^T)\) of the world and the controller, conditioned on the initial joint state \((w^0, c^0)\). Behaviour is a process that takes place in the world, for instance as a particular movement of the agent’s body which we considered to be part of the world (see Fig. 1). This implies that only the world process \(w^1, \dots , w^T\) will be of relevance for the study of behaviour. Marginalising out the controller process \(c^1, \dots , c^T\) leads to

*T* times. This Markov kernel encodes all information that is required for the evaluation of behaviour.
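For finite state sets, each Markov kernel is a row-stochastic array, and iterating the loop and marginalising out the controller reduce to array contractions. The following minimal sketch assumes one particular update order within a time step (act, world transition, sense, controller update). This ordering, the array layout, and all function names are illustrative assumptions, not the article's definitions; only the kernel symbols \(\alpha \), \(\beta \), \(\varphi \), \(\pi \) are taken from the text.

```python
import itertools
import numpy as np

def random_kernel(rng, *shape):
    """Random Markov kernel: the last axis is a probability distribution."""
    k = rng.random(shape)
    return k / k.sum(axis=-1, keepdims=True)

def joint_step(alpha, beta, phi, pi):
    """One-step kernel P[w, c, w2, c2] of the joint (world, controller) state.

    Assumed order (an illustration only):
    a ~ pi(c), w2 ~ alpha(w, a), s ~ beta(w2), c2 ~ phi(c, s).
    Layout: alpha[w, a, w2], beta[w2, s], phi[c, s, c2], pi[c, a].
    """
    return np.einsum("ca,wav,vs,csd->wcvd", pi, alpha, beta, phi)

def world_process(step, w0, c0, T):
    """Distribution of (w^1, ..., w^T) given (w^0, c^0), controller summed out."""
    n_w = step.shape[0]
    result = {}
    for traj in itertools.product(range(n_w), repeat=T):
        # Forward message over controller states along this world trajectory.
        msg = step[w0, c0, traj[0], :]
        for w_prev, w_next in zip(traj, traj[1:]):
            msg = msg @ step[w_prev, :, w_next, :]
        result[traj] = float(msg.sum())
    return result
```

Enumerating all world trajectories is exponential in *T*, so this is only meant for very small examples; it makes explicit that the behaviour depends on the controller only through the marginalised world process.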

## 3 Intrinsic and Extrinsic Sensorimotor Constraints

*universal approximators*, is helpful for exploring the ones that are optimal for the given constraints \(\Sigma \) of the agent. Here, optimality involves two requirements. On one hand, the control architecture should be sufficient in the sense that it enables the agent to adapt to the embodiment constraints \(\Sigma \). On the other hand, it should be concise in order to efficiently implement this adaptation. The field of embodied intelligence offers many case studies as evidence for such kinds of cheap control. This field highlights, in particular, the fact that quite complex and useful behaviours do not require much control [19].

*controller model* or simply *model*. Most generally, a model can be any subset \({\mathcal M}\) of \(\Gamma \). Typically, a model \({\mathcal M}\) is defined as the image of a map \(\eta \), referred to as a *parametrisation* of \({\mathcal M}\), from a parameter set \(\Theta \subseteq {\mathbb {R}}^d\) to \(\Gamma \). Figure 4 illustrates the important parametrisation in terms of synaptic couplings \(w_{ij}\) of neurons *i* and *j*. We call a model \({\mathcal M}\) together with a parametrisation \(\eta \) a *parametrised model*. Clearly, any map \(\eta : \Theta \rightarrow \Gamma \) is a parametrisation of the model given by its image.

In order to be useful in applications, a parametrised model is typically assumed to have further properties which are context dependent, for instance smoothness properties up to some order.

*behaviour map*:

*T* go to \(\infty \). Now suppose that, for every \(\sigma \in \Sigma \), we are given a set \(\mathcal {O}_\sigma \) of optimal or desired behaviours, such as behaviours with maximal predictive information [25] or maximal expected reward [8]. In this article, optimality of behaviours is not further specified and should not be confused with the optimality of a model, which plays the central role in this paper. Optimal models should, at least, satisfy the following natural sufficiency condition.

### **Definition 1**

We say that a model \({\mathcal M}\) is (*geometrically*) *sufficient*, if for all \(\sigma \in \Sigma \) and all corresponding behaviours \(\delta \in \mathcal {O}_\sigma \), there exists \(\gamma \in \overline{\mathcal M}\) that generates that behaviour, that is \(\psi _{\sigma }(\gamma ) = \delta \).

Here, the bar over the set \({\mathcal M}\) denotes its topological closure in \(\Gamma \). In principle, this will depend on the underlying topology of \(\Gamma \) for which one has various natural choices. I am not going to address these topological questions in further detail. The main results of the next section refer to the case where the state sets are finite and therefore the topology is simply the standard one of a finite-dimensional real vector space. If the closure of the model \({\mathcal M}\) equals all theoretically possible intrinsic mechanisms, that is \(\overline{\mathcal M} = \Gamma \), then we say that the model is a *universal approximator*. This corresponds to the most flexible brain architecture which I already mentioned above [see Eq. (3)]. I argue that such a brain is not required for embodied agents in order to be universal at the behavioural level. There are sufficient models, in the sense of Definition 1, that are behaviourally equivalent to, but less complex than, a universal approximator. Clearly one has to specify the notion of complexity here. In any case, the general study of sufficient models in relation to their complexity provides one way to formally address the subject of cheap design which plays a central role within the field of embodied intelligence [19].
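The role of the closure can be illustrated numerically. The sketch below assumes finite state sets, restricts attention to the policy part of the controller, and uses a softmax parametrisation, which is a choice made here for illustration and is not taken from the article. The image of this parametrisation contains only strictly positive policies, yet deterministic policies lie in its closure, so the model is a universal approximator for policies in the sense above.

```python
import numpy as np

def softmax_policy(theta):
    """Illustrative parametrisation eta: theta[c, a] -> policy pi[c, a]."""
    z = theta - theta.max(axis=1, keepdims=True)  # stabilise the exponentials
    p = np.exp(z)
    return p / p.sum(axis=1, keepdims=True)

def distances_to_target(target, scales=(1.0, 10.0, 100.0)):
    """Max-norm distance between a deterministic target policy (one-hot rows)
    and the model points softmax_policy(scale * target)."""
    return [float(np.abs(softmax_policy(s * target) - target).max())
            for s in scales]
```

For a one-hot target the distances shrink towards zero as the scale grows, while every model point keeps all probabilities strictly positive: the target is approximated arbitrarily well without being attained.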

*embodied universal approximator* (for \(\Sigma \)):

## 4 Cheap Embodied Universal Approximation

*embodied universal approximator* (for \(\Sigma \)), if the corresponding joint model (7) has this property. Note that the restriction to policy models excludes the possibility of coupling the two intrinsic mechanisms \(\pi \) and \(\varphi \). Such a coupling might provide a further way of reducing the complexity of the controller model.

### 4.1 General Selection of Policy Models

*c* the distributions \(\pi _1(c; \cdot )\) and \(\pi _2(c; \cdot )\) give the same expectation values of the functions

*exponential family* [1]. Using general information-geometric arguments, it is obvious that the parametrisation (11) defines an embodied universal approximator for

1. Assume that the world state \(w'\) cannot be reached from the world state *w* in one step, that is

   $$\begin{aligned} \alpha (w, a ; w') = 0 \quad \text{ for } \text{ all } a \in \fancyscript{A}. \end{aligned} \tag{14}$$

   In this case, the corresponding term disappears from the sum (13). In the situation of an embodied agent, the majority of pairs \((w,w')\) has this property, because the physical constraints of the sensorimotor loop exclude most of the transitions from *w* to \(w'\) in one step.
2. It is not necessary that (14) holds in order to ignore a term from the sum (13). It is already sufficient that there is a constant \(r \in {\mathbb {R}}\) such that

   $$\begin{aligned} \alpha (w, a ; w') = r \quad \text{ for } \text{ all } a \in \fancyscript{A}. \end{aligned} \tag{15}$$

   Although formally this property is a simple extension of (14), it highlights another important aspect. If \(\alpha (w , a ; w') = r > 0\) for all \(a \in \fancyscript{A}\) then \(w'\) can be reached from *w* in one step but in a way that does not involve the actuators. The transition from *w* to \(w'\) is not sensitive to the actuators, and therefore its representation does not play any role in our parametrisation (11).
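The two cases above can be checked numerically. The sketch below assumes that, for a fixed controller state, the policy has the exponential form \(\pi (c; a) \propto \exp \big( \sum _{w,w'} \theta _{w,w'} \, \alpha (w, a; w') \big)\); this form is an assumption made for illustration, since the referenced equations (11) and (13) are not reproduced here, and the function names are hypothetical. Under that assumption, a term whose transition probability is constant in *a* only shifts all log-weights by the same amount and cancels in the normalisation.

```python
import numpy as np

def exponential_policy(theta, alpha):
    """Distribution over actuator states a, proportional to
    exp(sum_{w, w2} theta[w, w2] * alpha[w, a, w2])  (assumed form)."""
    logits = np.einsum("wv,wav->a", theta, alpha)
    weights = np.exp(logits - logits.max())
    return weights / weights.sum()

def prune_insensitive(theta, alpha, tol=1e-12):
    """Zero the coefficients of all pairs (w, w2) whose transition
    probability alpha[w, a, w2] does not depend on a (cases (14), (15))."""
    spread = alpha.max(axis=1) - alpha.min(axis=1)  # shape (w, w2)
    return np.where(spread <= tol, 0.0, theta)
```

Pruning leaves the policy unchanged, so only the actuator-sensitive pairs \((w, w')\) need parameters.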

*feature vectors*, such that every policy of the structure (11) can be expressed in terms of a linear combination of the \(\alpha _k\) and a constant function. This leads to the following simplification of the parametrisation (11).

### **Proposition 1**

*The parametrisation* \(\eta : {\mathbb {R}}^{\fancyscript{C}} \times {\mathbb {R}}^{d_\alpha - 1} \rightarrow \Delta ^{\fancyscript{C}}_{\fancyscript{A}},\)

*defines an embodied universal approximator for* \(\Sigma _\alpha .\)

### **Proposition 2**

*For any injective function* \(f: \fancyscript{A}\rightarrow {\mathbb {R}},\) *that is, one for which* \(f(a) = f(a')\) *only if* \(a = a',\) *the parametrisation* \(\eta : {\mathbb {R}}^{\fancyscript{C}} \times {\mathbb {R}}^{2 \, d_\alpha } \rightarrow \Delta ^{\fancyscript{C}}_{\fancyscript{A}},\)

*defines an embodied universal approximator for* \(\Sigma _\alpha \) (*here,* \(f^k(a)\) *denotes the k-th power of* \(f(a)\)).
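A numerical sketch can make it plausible why powers of an injective function suffice. It is an illustration under stated assumptions and not necessarily the argument used in the Appendix; the function names and the scaling parameter `beta` are hypothetical. For a target distribution supported on *d* actuator states with values \(x_1, \dots , x_d\) under *f*, the polynomial \(p(x) = -\prod _i (x - x_i)^2\) has degree 2*d*, vanishes on the support and is negative elsewhere, and a polynomial *q* of degree at most \(d - 1\) interpolates the log-weights. The log-weights \(q(f(a)) + \beta \, p(f(a))\) are then a linear combination of the powers \(f^1, \dots , f^{2d}\) plus a constant, and the resulting distribution approaches the target as \(\beta \rightarrow \infty \).

```python
import numpy as np

def power_feature_parameters(f_support, weights, beta):
    """Coefficients (highest power first, length 2d + 1) of q + beta * p."""
    root_poly = np.poly(f_support)            # prod_i (x - x_i), degree d
    p = -np.polymul(root_poly, root_poly)     # degree 2d, zero on the support
    # q interpolates the log-weights on the support (degree <= d - 1).
    q = np.polyfit(f_support, np.log(weights), deg=len(f_support) - 1)
    return np.polyadd(beta * p, q)

def policy_from_parameters(theta, f_vals):
    """Distribution over actuator states with log-weights sum_k theta_k f(a)^k."""
    logits = np.polyval(theta, f_vals)
    weights = np.exp(logits - logits.max())
    return weights / weights.sum()
```

The constant coefficient plays no role because it cancels in the normalisation, which leaves 2*d* free coefficients for a target with support of cardinality *d*.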

The injectivity of *f* that is required in Proposition 2 is a generic property of real functions on \(\fancyscript{A}\) so that *f* can be chosen randomly. Furthermore, given such a function *f* as the first feature vector, all the other feature vectors are determined as the *k*-th powers of *f*. With the set

### **Corollary 1**

*The policy model of Proposition 2 is an embodied universal approximator for* \(\Sigma _{d_\alpha }.\)

### 4.2 From Models to Architectures

*N* is described in terms of the monomial

*N*, which we assume to be dependent on the controller state \(c \in \fancyscript{C}\). We refer to the cardinality of *N* as the *order* of the interaction. For \(| N | = 2,\) we have the important special case of a pairwise interaction, which is of particular interest within the field of neural networks. There, the interaction coefficients are usually interpreted as the synaptic connection strengths between the neurons. If we want to incorporate interactions among nodes of all subsets \(N \subseteq [n],\) then we have to consider the following sum of monomials (20):

*k*. This defines the following policy model, which we refer to as the *k-interaction model*:

*k*, we now address the problem of finding the sufficient order of interaction so that the corresponding *k*-interaction model is an embodied universal approximator for \(\Sigma _{d_\alpha }.\) For a first simple estimate, we consider again the policy model of Proposition 2. Given that we now assume \(\fancyscript{A}= {\{ -1,+1\}}^n,\) it is possible to define the function *f* as a linear function, that is \(f(a) = f(a_1,\dots ,a_n) = \sum _{i = 1}^n w_i \, a_i\) so that \(f(a) \not = f(a')\) whenever \(a \not = a'.\) Note that such weights \(w_i\) exist because \(\fancyscript{A}\) is finite. As a linear function defined on all of \({\mathbb {R}}^n,\) *f* is never injective, except for \(n = 1.\) For the *k*th powers of *f* we obtain

### **Proposition 3**

*With*

*the* \(k(\alpha )\)*-interaction model is an embodied universal approximator for* \(\Sigma _{d_\alpha }\) (\( \lceil x \rceil \) *denotes the smallest integer* \(\ge x\)).
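The interaction models discussed in this section can be sketched for small *n*. The code below assumes the usual exponential form of an interaction expansion for one fixed controller state, namely log-weights \(\sum _{N} \theta _N \prod _{i \in N} a_i\) over subsets *N* with \(1 \le |N| \le k\); this form and the function name are assumptions made for illustration, since the defining equations are not reproduced here.

```python
import itertools
import numpy as np

def k_interaction_policy(theta, n, k):
    """Distribution on {-1, +1}^n with log-weights
    sum_N theta[N] * prod_{i in N} a_i  (assumed form).

    theta maps tuples of node indices (the subsets N) to interaction
    coefficients for one fixed controller state; only 1 <= |N| <= k is allowed.
    """
    states = np.array(list(itertools.product([-1, 1], repeat=n)))
    logits = np.zeros(len(states))
    for subset, coeff in theta.items():
        if not 1 <= len(subset) <= k:
            raise ValueError(f"interaction {subset} exceeds order {k}")
        # Monomial prod_{i in N} a_i, evaluated on every actuator state.
        logits += coeff * np.prod(states[:, list(subset)], axis=1)
    weights = np.exp(logits - logits.max())
    return states, weights / weights.sum()
```

With a single strong pairwise coefficient, for example `{(0, 1): 5.0}` for `n = 2`, almost all probability mass sits on the two states whose nodes agree, which is the kind of low-order structure the propositions exploit.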

*restricted Boltzmann machine* (RBM). In order to define it, we extend the above system of *n* nodes by further *m* so-called *hidden nodes* with the same state space \(\{-1,+1\}.\) The overall state is then given as a pair \((h,a) = (h_1,\dots ,h_m, a_1,\dots ,a_n) \in {\{-1,+1\}}^{m + n}.\) We now consider the following family of kernels, which involves at most pairwise interactions between the hidden nodes and the actuators:

*controlled restricted Boltzmann machine* (see Fig. 5). Due to general results by Le Roux and Bengio [22] and Montúfar and Ay [15], it is known that any distribution of support cardinality smaller than or equal to \(d_\alpha \) can be represented by an RBM with \(d_\alpha - 1\) hidden nodes. Together with (17), this directly implies the following result.

### **Proposition 4**

*A controlled restricted Boltzmann machine, defined by* (23) *and* (24), *with* \(m = d_\alpha - 1\) *hidden nodes is an embodied universal approximator for* \(\Sigma _{d_\alpha }.\)

The dependence of the parameters \(\theta _{i,j}(c),\) \(\theta _i(c),\) and \(\theta _j(c)\) on the controller state *c* can be quite complicated. Assuming that the controller also has a composite structure with *k* binary nodes, it is possible to represent this dependence in terms of pairwise interactions between the controller nodes and the hidden nodes. This leads to the definition of a *conditional restricted Boltzmann machine* [17]. Cheap control with such machines has been theoretically and experimentally studied in [16]. This study is based on the notion of embodiment dimension, which is a refinement of the dimension \(d_\alpha \) used in the present article.
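For small *m* and *n*, the policy of a restricted Boltzmann machine can be computed by brute-force summation over the hidden states. The sketch below assumes the standard RBM form with \(\pm 1\) units for one fixed controller state, with log-weights \(\sum _{i,j} \theta _{i,j} h_i a_j + \sum _i \theta _i h_i + \sum _j \theta _j a_j\); the function and argument names are hypothetical, and equations (23) and (24) are not reproduced here.

```python
import itertools
import numpy as np

def rbm_policy(theta_pair, theta_hidden, theta_act):
    """Marginal distribution over actuator states a in {-1, +1}^n of an RBM
    with m hidden nodes, for one fixed controller state (assumed form).

    theta_pair has shape (m, n), theta_hidden shape (m,), theta_act shape (n,).
    """
    m, n = theta_pair.shape
    hidden = np.array(list(itertools.product([-1, 1], repeat=m)))  # (2^m, m)
    acts = np.array(list(itertools.product([-1, 1], repeat=n)))    # (2^n, n)
    log_w = (hidden @ theta_pair @ acts.T
             + (hidden @ theta_hidden)[:, None]
             + (acts @ theta_act)[None, :])
    # Sum out the hidden nodes, then normalise over actuator states.
    weights = np.exp(log_w - log_w.max()).sum(axis=0)
    return acts, weights / weights.sum()
```

With one hidden node strongly coupled to two actuator nodes, the policy concentrates on two actuator states, in line with the statement that \(d_\alpha - 1\) hidden nodes suffice for supports of cardinality \(d_\alpha \).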

## 5 Conclusions

In order to approach a general theory of embodied agents, I have introduced a formal model of the sensorimotor loop, which specifies its intrinsic and extrinsic mechanisms, building on previous work [7, 13]. The extrinsic mechanisms represent the embodiment constraints of the system which can be utilised by appropriate adjustment of the intrinsic mechanisms in order to express useful behaviours. This requires some degree of flexibility of the control architecture, which I formalised in terms of a sufficiency notion. As a particular case, I studied in more detail embodied universal approximation and the corresponding design of controller models in terms of geometry. However, I argued that sufficiency should not be the only requirement involved in systems design. In order to address the notion of cheap design within the field of embodied intelligence, we have to identify sufficient controller architectures with low complexity. Clearly, one would assume that selection pressure has generated such architectures of naturally evolved brains, which can cope with limitations of mass and energy resources. For instance, the field of sparse coding has provided evidence for internal representations of external stimuli in terms of sparse neuronal activity [18]. However, there are many aspects that contribute to the overall complexity of a controller model, and we are far from a conclusive definition of complexity that would account for the right notion of cheap control. Therefore, I was not very precise in this regard and used a few complexity notions as motivation for the policy models that I have defined. In this context, low complexity of a policy meant: 1. high entropy of the policy, 2. low number of actuator states used by the policy, 3. low interaction order among the actuators, and 4. low number of hidden nodes in a restricted Boltzmann machine.

There is a further important aspect of complexity, which I did not address explicitly. If we design policy models, we have to distinguish between the complexity of the model itself and the complexity of a particular policy taken from that model. One can design models for which the individual policies have low complexity, but the overall model is hard to describe. However, it is problematic to implement such a structure. In nature, its information has to be transmitted through genetic inheritance. This information transmission clearly has a limited capacity, which is, for instance, modulated by the mutation rate. Therefore, robustness issues also have to be taken into account [5, 20]. To conclude, the right choice of a model should balance the complexities of both the model itself and the policies that are implemented by the model. This is a quite natural idea within complexity theory [6, 10, 11, 21, 23].

## Notes

### Acknowledgments

I would like to thank Keyan Ghazi-Zahedi, Guido Montúfar, and Johannes Rauh for many stimulating discussions on the subject of embodied intelligence, and, in particular, systems design. The proof of Proposition 2, presented in the Appendix, uses an argument by Johannes Rauh, which was not used in the original proof [14], leading to an improvement of the original result. This work has been supported by the DFG Priority Program *Autonomous Learning*.

## References

1. Amari S, Nagaoka H (2000) Methods of information geometry. American Mathematical Society, Oxford University Press
2. Ay N (2002) An information-geometric approach to a theory of pragmatic structuring. Ann Probab 30(1):416–436
3. Ay N, Jost J, Lê HV, Schwachhöfer L (2015) Information Geometry, Springer (submitted)
4. Ay N, Knauf A (2007) Maximizing multi-information. Kybernetika 42(5):517–538
5. Ay N, Krakauer DC (2007) Geometric robustness theory and biological networks. Theory Biosci 125(2):93–121
6. Ay N, Müller M, Szkoła A (2010) Effective complexity and its relation to logical depth. IEEE Trans Inf Theory 56(9):4593–4607
7. Ay N, Zahedi K (2014) On the causal structure of the sensorimotor loop. In: Prokopenko M (ed) Guided self-organization: inception. Springer, Berlin, Heidelberg
8. Ay N, Montúfar G, Rauh J (2012) Selection criteria for neuromanifolds of stochastic dynamics. Springer, Post-conference proceedings Advances in Cognitive Neurodynamics (III)
9. Brooks RA (1991) Intelligence without representation. Artif Intell 47(1–3):139–159
10. Gell-Mann M, Lloyd S (1996) Information measures, effective complexity, and total information. Complexity 2:44–52
11. Jost J (2004) External and internal complexity of complex adaptive systems. Theory Biosci 123:69–88
12. Kahle T (2010) Neighborliness of marginal polytopes. Contrib Algebra Geom 51(1):45–56
13. Klyubin AS, Polani D, Nehaniv CL (2004) Tracking information flow through the environment: simple cases of stigmergy. In: Pollack J (ed) Artificial Life IX: Proceedings of the Ninth International Conference on the simulation and synthesis of living systems, pp 563–568. MIT Press
14. Matúš F, Ay N (2003) On maximization of the information divergence from an exponential family. In: Vejnarová J (ed) Proceedings of WUPES’03, University of Economics Prague, pp 199–204
15. Montúfar G, Ay N (2011) Refinements of universal approximation results for deep belief networks and restricted Boltzmann machines. Neural Comput 23(5):1306–1319
16. Montúfar G, Zahedi K, Ay N (2015) A theory of cheap control in embodied systems. PLOS Comput Biol. arXiv:1407.6836 (in press)
17. Montúfar G, Ay N, Zahedi K (2015) Geometry and expressive power of conditional restricted Boltzmann machines. J Mach Learn Res. arXiv:1402.3346 (in press)
18. Olshausen BA, Field DJ (1996) Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature 381:607–609
19. Pfeifer R, Bongard JC (2006) How the body shapes the way we think: a new view of intelligence. The MIT Press (Bradford Books), Cambridge
20. Rauh J, Ay N (2013) Robustness, canalising functions, and systems design. Theory in Biosci. doi:10.1007/s12064-013-0186-3
21. Rissanen J (1989) Stochastic complexity in statistical inquiry. World Scientific, Singapore
22. Le Roux N, Bengio Y (2008) Representational power of restricted Boltzmann machines and deep belief networks. Neural Comput 20(6):1631–1649
23. Vapnik VN (1998) Statistical learning theory. Wiley, New York
24. Zahedi K, Ay N (2013) Quantifying morphological computation. Entropy 15(5):1887–1915. doi:10.3390/e15051887
25. Zahedi K, Ay N, Der R (2010) Higher coordination with less control: a result of information maximisation in the sensori-motor loop. Adapt Behav 18:338–355

## Copyright information

**Open Access** This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.