1 Introduction

The desire to unveil and reproduce the brain’s functions is the point of connection of many research fields and it has facilitated interdisciplinary studies, as in the case of neuroscience and robotics (Floreano et al. 2014).

In the early period of computational motor control, neuroscientists exploited control theory and closed-loop systems to model and understand how the brain generates and controls movements (Loeb et al. 1990; Campos and Calado 2009). In recent years, neuroscientists have made use of robots together with models of biological nervous systems to implement the perception-action loop and then validate and refine their hypotheses.

Roboticists draw inspiration from neuroscience findings on perception and control to design and build robots that, like humans, interact with unknown and changing environments. State-of-the-art robots perform manipulations with high precision only for well-defined tasks. The existing bottlenecks, which limit the use of robots in many areas, could be overcome by developing an artificial brain that senses the environment and learns how to control robots without using predefined procedures for each sensory-motor pattern that could be encountered (Qiao et al. 2021; Li et al. 2019; Cutsuridis and Taylor 2013).

Over the years, it has become clear that the brain, the body, and the environment have to be kept together to advance our understanding of the brain’s functions. The physical properties of the body and environment alter the behavior commanded by the brain through neural signals, simplifying or complicating the control (Nishikawa et al. 2007). It is in this context that musculoskeletal robots and neurorobotic devices were born (Li et al. 2019), representing two lines of research destined to meet.

Musculoskeletal robots mimic the mechanical properties of the human body by imitating the skeleton, joints, muscles, and the way muscles drive joints (Richter et al. 2016). They pose many control challenges because of their many degrees of freedom (DoF) and their strong redundancy and nonlinearity (Asano et al. 2017). Controllers that explicitly define a mapping between the joint angles and the force, length, or excitation of the muscles cannot cope with the complex nonlinear dynamics of musculoskeletal robots. The movements generated by musculoskeletal robots remain limited in terms of generalization, precision, and speed, despite the many model-free and model-based control methods proposed in the literature and reviewed in Qiao et al. (2021).

Neurorobots are devices engaged in a behavioral task and situated in a structured environment, whose controller mimics the brain’s architecture and dynamics (Krichmar 2018; van der Smagt et al. 2016). Their controllers can be subdivided into bio-inspired and biomimetic controllers. Bio-inspired controllers draw inspiration from biology without faithfully reproducing the neural structures or incorporating biological constraints; they include, for example, iterative learning controllers (Wang et al. 2009), adaptive controllers (Wang and Chellai 2018), active inference-based controllers (Pezzato et al. 2020) and muscle-synergy-based controllers (Chen and Qiao 2020). Biomimetic controllers, which include neurocomputational models, are realized by modeling biological structures and processes. They have been developed for controlling locomotion, for better understanding functional and computational models of different brain regions, and even to facilitate movements of living organisms (Tolu et al. 2022).

Models of the brain are realized at different levels of detail, depending on the scientific question one wants to answer. The brain itself is organized in distinct anatomo-functional levels, which can be grouped into three spatial scales ranging from nanometers to meters: microscale, mesoscale, and macroscale (D’Angelo and Wheeler-Kingshott 2017). These spatial scales range from molecules to large-scale networks, through neurons and neuronal networks. The microscale level investigates single neurons and synaptic connections that underlie the cerebral microcircuits, the mesoscale level studies columns and mini-columns of neurons, while the macroscale level is focused on the connection of distinct neuronal populations (Milardi et al. 2019). In recent years, the importance of realizing a multiscale model of the brain has been emphasized, both to investigate diseases of the nervous system (Lytton et al. 2017) and to improve our knowledge of the correspondence between model parameters at one level and physical variables (D’Angelo and Wheeler-Kingshott 2017).

Models of the human nervous system are embodied in robots that act in the real world as well as in software agents acting in a virtual environment. Software simulations allow repeatable and systematic evaluations of models because the environmental conditions are kept under control. Robots allow us to evaluate the robustness of models by testing them in an environment where perturbations are not under control (Casellato et al. 2014). Software simulations are particularly useful for incremental design and the evaluation of complex models of the brain devoted to controlling complex musculoskeletal systems.

This paper reviews the system-level neurocomputational models of human sensorimotor control, with particular reference to the learning, execution, and control of upper-limb movements. These models belong to the class of biomimetic models because they reproduce the biological systems in terms of the architecture, the learning process, the way movements are controlled and executed, and the adaptability to changes in the environment (Conforto et al. 2009). System-level neurocomputational models reproduce the human neuronal structures by means of artificial neural networks interconnected according to the anatomy of the brain. Unlike biophysical models, system-level models focus on the computational properties of real neurons and ignore aspects related to their anatomy (Wiecki and Frank 2010).

This paper offers a perspective on how neurocomputational models of the mammalian sensorimotor system and neurorobotics have been used as tools for understanding the learning and execution of arm movements. Therefore, this overview includes only the articles in the state of the art that exploit a simulated or real embodiment of neurocomputational models to reproduce and analyze the execution of upper-limb movements. Other surveys in the literature, by contrast, aimed to review the state of the art of brain subsystems that are poorly modeled (Loeb and Tsianos 2015), describe the connectionist models of the nervous system (Smirnitskaya 2022), analyze the advances in reproducing the neural control of movements on robot models (Migalev et al. 2022), and focus on the integration of computational models of the motor cortex, spinal cord circuitry, and skeletal muscle in order to create a model of the corticomuscular pathway (Haggie et al. 2023).

The remainder of the paper is organized as follows: Sect. 2 is an introduction to the field of brain simulation via system-level models, Sect. 3 briefly describes the human motor system and the main areas that make it up, Sect. 4 describes the neurocomputational models of the different areas of the motor system highlighting their distinctive features, Sect. 5 describes the applications of these models to neurorobotics, and Sect. 6 summarizes the open challenges of the field. Finally, Sect. 7 concludes the work by summing up the current status in modeling the brain and controlling musculoskeletal robots.

2 Biological plausibility of computational models

Three disciplines complementarily contribute to the understanding of brain information processing, and how cognition is implemented in the brain: cognitive science, computational neuroscience, and artificial intelligence. Cognitive science aims to explain human behavior by means of computational models that execute cognitive tasks, while computational neuroscience studies the principles and mechanisms that guide the development and the information processing of the nervous system by exploiting neurobiologically plausible artificial intelligence models. Cognitive computational neuroscience has been proposed in Kriegeskorte and Douglas (2018) as the discipline that arises at the intersection of cognitive science, computational neuroscience, and artificial intelligence, whose goal is to explain neuronal activity and behavior observed or measured in animals and humans by means of biologically plausible computational models that perform real-world cognitive tasks.

In the literature, brain activity and motor behavior are investigated via different kinds of computational models, which approach the problem either from an engineering perspective or a biological one (Reeke and Sporns 1993). In this survey, we review computational models that can be ascribed to the field of cognitive computational neuroscience, i.e. models that incorporate, to some extent, realistic architectural and physiological characteristics. In Kriegeskorte and Douglas (2018) these models are called brain-computational models, which mimic the brain information processing by abstracting from biological detail to an arbitrary degree but predicting some aspects of brain activity or behavior.

System-level models are brain-computational models that reproduce the interconnections and the dynamic interactions of different brain areas underlying a target behavior, and the interaction of the brain with the environment (Trappenberg 2009; Mannella et al. 2016). They are adopted for verifying hypotheses related to high-order brain functions and usually reproduce in detail only the areas of the brain whose functions are of interest in the study. Each subsystem is modeled by a neural network whose level of biological detail can be refined by a step-by-step procedure that leads to a multiscale model of the system (D’Angelo and Jirsa 2022). Although different neural models (localist, auto-associative and hetero-associative, deep, and whole-brain networks) exhibit human-like performance, their similarity to biological neural networks needs to be improved by focusing on modelling synaptic plasticity, connectivity within and between areas, and neuronal mechanisms from the micro to the macro scale (Senk et al. 2022; Pulvermüller et al. 2021).

The system-level approach is grounded on the idea that behavioral and cognitive skills arise from a closed-loop interaction between the agent’s nervous system, body, and environment, and aims at investigating the functions of the different areas of the brain. Therefore, system-level models are particularly suitable for neurorobotics applications, in which brain models are embedded in physical robots or simulated agents that interact with real or simulated environments. In this regard, two approaches to embodiment have been highlighted in Caligiore and Fischer (2013): the brain-based devices approach (Fleischer and Edelman 2009) and the computational embodied neuroscience approach (Caligiore et al. 2010). Both approaches embody system-level models of the brain in an agent that interacts with the environment and validate hypotheses about functional mechanisms of the brain by comparing data from behavioral experiments and from the simulated system, but the computational embodied neuroscience approach aims at realizing general models that are able to explain the data from different experiments (Caligiore and Fischer 2013).

3 The motor system

The motor system is the part of the nervous system whose neural activity causes the contraction of muscles and the generation of movements. It is organized in a functional hierarchy and includes regions of the cerebral cortex, the basal ganglia, the cerebellum, the brainstem, and the spinal cord networks. The motor system generates voluntary, rhythmic, and reflexive movements. Voluntary movements are under conscious control and are generated by networks that span different levels of the functional hierarchy; rhythmic and reflexive movements are predominantly controlled autonomously by spinal or brainstem circuitry (Kandel et al. 2000). In this paper, we focus on reaching movements, which are fast voluntary movements towards a given target whose main characteristics are a straight trajectory and a bell-shaped speed profile. Figure 1 shows the general architecture of the motor system.
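The two kinematic signatures of reaching mentioned above, a straight path and a bell-shaped speed profile, can be illustrated with a short numerical sketch. The fifth-order (minimum-jerk) polynomial used here is an assumption introduced only for illustration and is not a model discussed in this section; the function name, the start and target positions, and the movement duration are likewise hypothetical.

```python
import numpy as np

def minimum_jerk(start, target, duration, n_samples=101):
    """Return time, position and speed of a point-to-point reach.

    Assumption: the hand follows a fifth-order polynomial in normalized
    time, with zero velocity and acceleration at both ends.
    """
    start = np.asarray(start, dtype=float)
    target = np.asarray(target, dtype=float)
    t = np.linspace(0.0, duration, n_samples)
    tau = t / duration                                   # normalized time in [0, 1]
    s = 10 * tau**3 - 15 * tau**4 + 6 * tau**5           # progress along the path
    ds = (30 * tau**2 - 60 * tau**3 + 30 * tau**4) / duration
    position = start + s[:, None] * (target - start)     # straight-line path
    speed = ds * np.linalg.norm(target - start)          # tangential speed
    return t, position, speed

# Hypothetical planar reach of about 36 cm lasting 0.5 s
t, pos, speed = minimum_jerk(start=[0.0, 0.0], target=[0.3, 0.2], duration=0.5)
peak = int(np.argmax(speed))   # index of the speed peak
```

Under these assumptions the speed is zero at movement onset and offset and peaks at mid-movement, which is the bell shape referred to in the text.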

Fig. 1 Overview of the motor system. The cerebral cortex block includes motor and sensory cortices. The spinal cord receives proprioceptive signals from receptors in the muscles that climb up to the sensory-motor cortex via the brainstem. The efferent copies of the commands generated by the spinal cord networks are sent to the cerebellum

The motor area of the cerebral cortex is organized in the primary motor cortex, the premotor, and supplementary motor areas. The primary motor cortex is the most directly involved in the initiation and control of movement and its neural activity encodes kinematic and kinetic features of the movement, from the forces generated by muscles to the direction of arms (Ashe 2005). The corticospinal tract is the major descending pathway that carries movement-related information from the motor cortex to the brainstem and spinal cord. The premotor cortex is mainly involved in the preparation of movements by integrating sensory and motor information while the supplementary motor area plays a role in planning complex sequences of movements. The somatosensory cortex receives and processes sensory information from the body and it is organized in two parts: the primary and the secondary somatosensory cortices. The primary somatosensory cortex processes somatic sensations and includes an area that receives proprioceptive afferents providing information about the position of the body in space. The secondary somatosensory cortex serves as an association area for sensory input.

The basal ganglia are a group of subcortical nuclei that receive projections from many areas, including the cortex and the thalamus, and their major output returns to the cortex via the thalamus. They are important for the selection and initiation of motor actions.

The cerebellum consists of regular and repeating units that contain the same basic microcircuit. It has bidirectional connectivity with the cerebral cortex via the thalamus and with the spinal cord. Several fundamental functions are ascribed to the cerebellum. First, it implements the feedforward control of muscular contractions by acting before the sensory feedback arises from movement. In fact, the cerebellum builds and runs an internal representation of the body that allows the prediction of the sensory consequences of movement and the update of motor commands if movement errors are detected. Second, it is responsible for the processes of learning and adaptation of motor skills and for motor timing.

The brainstem is involved in the regulation of movement and other physiological functions (Ruder and Arber 2019). It receives inputs from the cerebral cortex and the cerebellum and projects to the spinal cord. The projections toward the spinal cord are divided into two systems: the medial and the lateral brainstem pathways. Posture is controlled by the medial descending system, which integrates visual, vestibular, and somatosensory information. The lateral descending system, in contrast, controls distal limb muscles involved in reaching movements.

The spinal cord is the lowest level of the hierarchical organization and it plays an essential role in motor control. Its neuronal circuits mediate a variety of reflexes, which are direct motor responses to peripheral sensory information. The spinal cord hosts interneurons and motor neurons that integrate descending inputs from upper motor neurons and sensory inputs and then activate the muscles.

4 Neurocomputational models

Neurocomputational models at the system level are a class of models in which neural networks represent different areas of the nervous system. These neural networks capture core neurobiological aspects, skip morphological aspects of neurons, and allow us to globally analyze information processing and behavior.

There are two approaches to building neural models: the top-down and the bottom-up approaches (Eliasmith and Trujillo 2014; Lytton et al. 2017). The top-down approach exploits the knowledge gained through the observation of the behavior of the living system in order to make hypotheses about the network architecture and the behavioral function of each subsystem. The models presented in this manuscript have been built following this approach. The bottom-up approach seeks to simulate each element in the brain with high accuracy, then combine them in a larger model and wait for the emergence of the biological behavior, i.e. without explicitly defining the objective function to link the model to the desired behavior.

In the next subsections, neurocomputational models of the main subsystems belonging to the motor system are presented. These models show different levels of detail because they were defined in different periods of time, with different computational resources and neurobiological knowledge, and with the purpose of answering different questions (Eliasmith and Trujillo 2014; Loeb and Tsianos 2015). Nevertheless, the analysis of the literature shows that there are a few “basic” models that somehow represent major paradigm shifts, so we adopted them in a model-by-model fashion to organize the material of the following subsections.

4.1 Models of the cerebellum

The cerebellum consists of two major parts: the deep cerebellar nuclei, from which the output of the cerebellum originates, and the cerebellar cortex. The cerebellar cortex is organized into three layers: the molecular layer, the Purkinje cell layer, which is the output layer, and the granule cell layer, which is the input layer. The cerebellar cortex receives inputs from the mossy fibers (MFs) and the climbing fibers (CFs). The granule cells (GCs) receive input from MFs and their axons reach the molecular layer, bifurcating into two branches called parallel fibers (PFs). Purkinje cells (PCs) integrate information from PFs and CFs and project to the deep nuclei. Figure 2 shows the connections between the different layers of the cerebellar cortex.

Fig. 2 Box and arrow model of the cerebellum. Mossy fibers originate from many locations in the brain and spinal cord. The mossy fibers from the pontine nuclei bring in a copy of the motor program generated by the motor cortex, while those originating from the spinal cord bring in a copy of the commands integrated by the spinal interneurons. Sensory information about the executed movement enters the cerebellum through the climbing fibers. The cerebellum predicts the sensory consequences of the motor commands and, during the execution of the movement, compares them with the actual incoming sensations; when there is a mismatch it can make corrections to the motor program so that the body will execute what the brain wants. The balance of the two parallel pathways that synapse on the Purkinje cells determines the DCN activity, which can be suppressed or enhanced via disinhibition. The molecular layer interneurons include stellate cells and basket cells. Connections ending with a bubble are inhibitory. Boxes with dashed orange borders are structures belonging to the brainstem. DCN: Deep Cerebellar Nuclei

The first theoretical model of the cerebellum as a neuronal machine was proposed by David Marr in 1969 (Marr 1969). He suggested that the cerebellar cortex is an associative memory that, thanks to learning, maps proprioceptive information about the state of the body (the context) into the motor commands required for executing the desired action. This associative learning takes place at the synapses between PFs and PCs. In particular, PCs integrate information about both the movement to be executed coming from the CFs, and the context, i.e. the body state, from the MFs. During the learning phase, PCs learn the association between the context and the action and, at the end of learning, the information about the context is enough to fire PCs. A comprehensive simulation of Marr’s theory with a full-scale model was proposed by Tyrrell and Willshaw (1992).

James Albus revised this model in 1971, suggesting that the nearly simultaneous arrival of action potentials from some PFs and a CF at a PC leads to a long-term depression of the transmission from the PFs to the PC; therefore, the strength of PF-PC synapses is depressed rather than potentiated during the association between context and motor action (Albus 1971). This hypothesis about the synaptic plasticity of PF-PC synapses was experimentally verified about ten years later by Ito et al. (1982), who had also proposed that CF inputs represent error signals (Ito 1970).

The Marr-Albus theoretical model was implemented as an adaptive filter by Fujita in 1982 (Fujita 1982). Adaptive filters have been widely used for modeling how the cerebellum learns the execution of reaching movements (Haith and Vijayakumar 2009) and other kinds of motor actions, such as the vestibulo-ocular reflex and eyeblink conditioning (for a complete review see Dean et al. 2010).
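The adaptive-filter reading of the Marr-Albus scheme can be sketched in a few lines. The sketch below is not the formulation of Fujita (1982): it assumes that the granule layer expands a single mossy-fiber signal into a bank of leaky integrators, that the Purkinje-cell output is a weighted sum of these parallel-fiber signals, and that the climbing fiber carries the output error used by a least-mean-squares (LMS) update. All signal names, time constants, and the learning rate are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def parallel_fiber_signals(u, time_constants, dt):
    """Expand one mossy-fiber signal into leaky-integrator basis signals."""
    p = np.zeros((len(u), len(time_constants)))
    state = np.zeros(len(time_constants))
    for k, u_k in enumerate(u):
        state += dt * (u_k - state) / time_constants
        p[k] = state
    return p

dt = 0.01
u = rng.standard_normal(5000)                 # mossy-fiber input (hypothetical)
taus = np.array([0.02, 0.05, 0.1, 0.2])      # assumed granule-layer time constants
p = parallel_fiber_signals(u, taus, dt)

true_w = np.array([0.5, -1.0, 2.0, 0.3])     # mapping to be learned (made up)
target = p @ true_w                           # desired Purkinje-cell output

w = np.zeros(len(taus))                       # PF-PC synaptic weights
beta = 0.1                                    # learning rate (assumption)
errors = []
for p_k, y_k in zip(p, target):
    z_k = w @ p_k                             # Purkinje-cell output
    e_k = z_k - y_k                           # error carried by the climbing fiber
    w -= beta * e_k * p_k                     # weight decreases when PF and CF are co-active
    errors.append(e_k**2)
```

With this sign convention, a parallel fiber that is active together with a positive climbing-fiber error has its weight reduced, which loosely parallels the depression of PF-PC synapses hypothesized by Albus.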

Albus modeled the cerebellum as a look-up table that stores the motor response to a given input. In particular, he proposed an artificial neural network architecture known as the Cerebellar Model Arithmetic Computer (CMAC) based on the idea that the cerebellum is a pattern classifier (Albus 1975). Although CMAC is no longer used for simulating the function of the cerebellum in behavior, it is still applied for solving adaptive control problems in artificial systems (Tan et al. 2022).
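The look-up-table idea behind CMAC can be illustrated with a minimal one-dimensional sketch. It assumes the common tile-coding formulation, in which several offset tilings each activate one table cell and the output is the sum of the active cells; the class name, the table sizes, the learning rate, and the sinusoidal target are hypothetical choices, not details taken from Albus (1975).

```python
import numpy as np

class CMAC:
    """Minimal 1-D CMAC: overlapping tilings index a table of weights."""

    def __init__(self, n_tilings=8, n_tiles=16, x_min=0.0, x_max=1.0, lr=0.2):
        self.n_tilings = n_tilings
        self.n_tiles = n_tiles
        self.x_min = x_min
        self.tile_width = (x_max - x_min) / n_tiles
        self.lr = lr
        # One extra tile per tiling absorbs the offset shift
        self.weights = np.zeros((n_tilings, n_tiles + 1))

    def _active_cells(self, x):
        """Return the index of the active tile in each offset tiling."""
        offsets = np.arange(self.n_tilings) * self.tile_width / self.n_tilings
        idx = np.floor((x - self.x_min + offsets) / self.tile_width).astype(int)
        return np.clip(idx, 0, self.n_tiles)

    def predict(self, x):
        idx = self._active_cells(x)
        return self.weights[np.arange(self.n_tilings), idx].sum()

    def train(self, x, target):
        idx = self._active_cells(x)
        error = target - self.predict(x)
        # Spread the correction equally over the active cells
        self.weights[np.arange(self.n_tilings), idx] += self.lr * error / self.n_tilings
        return error

# Hypothetical use: store a smooth input-output mapping in the table
rng = np.random.default_rng(0)
cmac = CMAC()
for _ in range(5000):
    x = rng.uniform(0.0, 1.0)
    cmac.train(x, np.sin(2 * np.pi * x))

xs = np.linspace(0.05, 0.95, 50)
max_error = max(abs(cmac.predict(x) - np.sin(2 * np.pi * x)) for x in xs)
```

Because neighbouring inputs share most of their active cells, a correction learned at one input generalizes locally, which is the property that keeps the scheme attractive for adaptive control.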

The computational models proposed by Marr, Albus and Ito have been the most influential in neuroscience even if they were insufficient to explain several cerebellar functions such as whole body movements and cognitive functions (Kawato et al. 2020). Marr’s and Albus’ models did not consider the role played by other parts of the central nervous system in motor control; they hypothesized that the cerebellum was the only brain area devoted to controlling movements. These models, moreover, considered pattern recognition the objective of cerebellar learning. On the contrary, Ito’s model indicated control tasks as the objective of learning. A detailed comparison of the three models and an evaluation of their validity in the light of recent neuro-physiological findings are provided in two recent reviews (Kawato et al. 2020; Sanger et al. 2020).

In the 1980s, the increasing interest in controlling multijoint robotic arms promoted the development of new models of the cerebellum that took into account its interaction with other brain areas belonging to the sensory-motor system.

In 1984, Ito paved the way for new models by introducing the concepts of microcomplexes, which are the structural and functional unit modules of neuronal circuits in the cerebellum, and of internal models (Ito 1984).

Internal models are neural representations learned by the cerebellum through the cerebrocerebellar loop that links the primary motor cortex and the cerebellar cortex. Internal models help the brain to perform movements without feedback from the body and are divided into two groups: forward models and inverse models. Forward models predict the sensory consequences of delivering a motor command to the biomechanical system and therefore are a neural representation of the future state of the system. Inverse models, by contrast, compute the motor command to be sent to the biomechanical system given the desired behavior and the current state of the system. The acquisition of an inverse model is an ill-posed problem because the same behavior can be executed by different motor commands (Bizzi and Mussa-Ivaldi 1998; Uno, Kawato 1994). Kawato proposed a biologically plausible solution to this problem, named feedback error learning (Kawato et al. 1987), which hypothesizes that CFs convey to PCs the motor-command errors generated by feedback controllers at the spinal and cerebral levels.
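The principle of feedback error learning can be sketched on a toy problem: the command of a feedback controller is used as the training signal of an inverse model, so that the feedforward command gradually takes over. The single-joint linear plant, the proportional-derivative feedback law, the linear inverse model, and all numerical values below are assumptions made for illustration and do not reproduce the model of Kawato et al. (1987).

```python
import numpy as np

m_true, b_true = 2.0, 0.5       # plant inertia and damping, unknown to the learner
kp, kd = 50.0, 20.0             # feedback gains (assumed values)
eta = 0.05                      # learning rate (assumed value)
dt, period = 0.002, 2.0
t = np.arange(0.0, period, dt)

# Desired sinusoidal joint trajectory and its derivatives
omega = 2.0 * np.pi / period
q_des = 0.5 * np.sin(omega * t)
qd_des = 0.5 * omega * np.cos(omega * t)
qdd_des = -0.5 * omega**2 * np.sin(omega * t)

w = np.zeros(2)                 # inverse-model weights for [qdd_des, qd_des]
feedback_effort = []            # integral of |u_fb| in each trial
for trial in range(30):
    q, qd = q_des[0], qd_des[0]
    effort = 0.0
    for k in range(len(t)):
        phi = np.array([qdd_des[k], qd_des[k]])
        u_ff = w @ phi                                        # inverse-model command
        u_fb = kp * (q_des[k] - q) + kd * (qd_des[k] - qd)    # feedback command
        w += eta * u_fb * phi * dt                            # feedback command as error signal
        qdd = (u_ff + u_fb - b_true * qd) / m_true            # plant dynamics
        qd += qdd * dt
        q += qd * dt
        effort += abs(u_fb) * dt
    feedback_effort.append(effort)
```

In this sketch the feedback effort shrinks across trials while the first weight approaches the plant inertia, that is, the inverse model absorbs the dynamics that the feedback controller initially had to compensate.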

Schweighofer presented a cerebellar model based on the microcomplex hypothesis and on feedback error learning to examine the role of the cerebellum in the coordination of visually guided arm movements (Schweighofer et al. 1998). The cerebellum model was embedded in a functional anatomical model of the motor system addressing the issue of controlling dynamics of an arm model made up of a redundant set of muscles (Schweighofer et al. 1998). The study showed that the cerebellar model gradually learned the part of the inverse dynamics that was not compensated by the feedforward and feedback controllers, which were implemented by the motor cortex and the spinal cord, and reduced the error between intended and planned movements.

In the same period, starting from neurophysiological and neuroanatomical studies of the cerebellar-rubrospinal pathway, James Houk modeled the cerebellum as an array of adjustable motor pattern generators (APGs), each of which is an abstract representation of the corticorubral-cerebellar loops between a cerebellar cell and a motor cortical cell (Houk 1987). He proposed that an array of APGs transmits motor commands to the spinal cord that, in turn, translates the commands into muscle activations. APGs were implemented as neural networks and they were able to control a nonlinear dynamical model of the neuromuscular system (Sinkjær et al. 1990), as well as to regulate movement by learning how to anticipate the sensory feedback (Barto et al. 1999). Later, an array of APGs was supposed to activate a muscle synergy, i.e. a group of muscles, rather than a single muscle (Fagg et al. 1997), and then it was proposed that groups of APGs are responsible for generating motor commands driving primitives in the spinal cord (Vahdat et al. 2006).

van der Smagt et al. (2016) have suggested categorizing the models of cerebellum according to their understanding of the cerebellar operation: Functional models, which are based on functional understanding of the cells, State-Encoder-Driven models, which assume that the granule cells are on-off units, and Cellular-Level models, which adopt differential equations for modeling neurons. CMAC and APG models are examples of State-Encoder-Driven models (Albus 1975; Houk 1987) while the Schweighofer-Arbib model is an example of Cellular-Level models (Schweighofer et al. 1998a, 1998b).

In recent years, the cerebellar circuit has been modeled using a bottom-up strategy and incorporating realistic details at the neuronal and synaptic levels (Solinas et al. 2010; Casellato et al. 2014). D’Angelo and colleagues have provided a detailed review of the new strategies for modeling the cerebellar circuit and their embodiment in neurorobotic models (D’Angelo et al. 2016).

4.2 Models of the basal ganglia

Computational models of the Basal Ganglia (BG) focus on how these groups of nuclei at the base of the forebrain implement cognitive and motor functions. The cognitive functions simulated with models of BG include categorization, conditioning, learning, working memory, sequence learning, and automaticity, while the simulated motor functions are reaching, handwriting and eye saccades (Helie et al. 2013). Gillies and Arbuthnott (2000) proposed grouping these models into three classes that highlight different aspects of basal ganglia function: models of reinforcement learning, models of action selection, and models of serial processing. The common basis of most of these models is reproducing the ability of the basal ganglia to detect cortical contexts and select motor or cognitive actions.

The functional role of BG in behavior has been studied, as with other brain areas, with neural network models and abstract mathematical models (Cohen and Frank 2009). Neural network models allow investigations of how behavioral phenomena such as action selection and learning depend on the interactions among the different parts of the circuit and on the modulatory effect created by dopamine and other neurotransmitters. Abstract models describe how the system works with mathematical equations, which usually have no relation with the biological architecture, and are less suited to studying the neurophysiological processes.

Basal ganglia are defined as a system that gates information flow in the frontal cortex and from the frontal cortex to the motor system. BG models usually include the striatum (caudate, putamen, nucleus accumbens), the globus pallidus (GP), which is often divided into an internal (GPi) and an external (GPe) segment, the subthalamic nucleus (STN), the substantia nigra (SN), and the ventral tegmental area (VTA).

Models of this system include distinct pathways that fulfill different functions: the direct pathway, which facilitates the active action representation in the frontal cortex (and is therefore also known as the Go pathway), the indirect pathway, which suppresses action representations in the frontal cortex (and is therefore also known as the NoGo pathway), and the hyperdirect pathway, which is crucial for suppressing erroneous movements. The direct pathway proceeds from cortex to GPi via striatonigral neurons: cortical input excites the striatonigral neurons that inhibit GPi, which in turn disinhibits the thalamus and cortex. The indirect pathway proceeds from cortex via striatum and GPe to GPi: the cortex excites striatopallidal cells that stop the GPe from tonically inhibiting the GPi, allowing it to further inhibit the thalamus and preventing the selection of particular cortical actions. Both pathways were originally proposed by Albin and colleagues in 1989 (Albin et al. 1989) and they represent the core of any BG model. The hyperdirect pathway was identified in 2000 and it connects the cortex directly to the subthalamic nucleus, bypassing the striatum and the indirect pathway, to excite the GPi (Nambu et al. 2000). The activity of the direct and indirect pathways is modulated by the dopamine released by the substantia nigra pars compacta, while the activity of the hyperdirect pathway is not because it bypasses the striatum. Figure 3 shows the three pathways and the main BG areas that are involved. In recent years, other pathways have been identified, including a direct connection between basal ganglia and cerebellum, but their functions are not entirely clear and require further investigation (Milardi et al. 2019).

Fig. 3 Overview of pathways in the Basal Ganglia subsystem. Blue connections represent the direct pathway, red connections represent the indirect pathway and green connections represent the hyperdirect pathway. Connections that end with an arrow represent excitatory glutamatergic inputs, while those that end with a bubble are inhibitory GABAergic inputs. The connection in violet is a modulatory dopaminergic input. GPi: internal globus pallidus. GPe: external globus pallidus. STN: subthalamic nucleus. SN: substantia nigra
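The sign logic of the three pathways described above can be made concrete with a small steady-state rate sketch. It is not one of the reviewed models: the rectified-linear units, the unit gains, the tonic baselines, and the assumption that dopamine scales the direct pathway up and the indirect pathway down are simplifications chosen only to show how excitation and inhibition combine at the GPi and the thalamus.

```python
def relu(x):
    """Firing rates cannot be negative."""
    return max(0.0, x)

def basal_ganglia_rates(cortex, dopamine, hyperdirect=0.0):
    """Steady-state rates (arbitrary units) for one action channel.

    Assumptions: dopamine in [0, 1] increases the gain of striatonigral
    (direct) neurons and decreases the gain of striatopallidal (indirect)
    neurons; GPe and GPi are tonically active.
    """
    d1 = relu(cortex * (1.0 + dopamine))    # direct pathway: cortex excites striatonigral cells
    d2 = relu(cortex * (1.0 - dopamine))    # indirect pathway: cortex excites striatopallidal cells
    gpe = relu(1.0 - d2)                    # GPe is inhibited by striatopallidal cells
    stn = relu(hyperdirect)                 # hyperdirect pathway: cortex excites STN directly
    gpi = relu(2.0 - d1 - gpe + stn)        # inhibited by D1 cells and GPe, excited by STN
    thalamus = relu(1.0 - gpi)              # thalamus is released when GPi is silenced
    return {"d1": d1, "d2": d2, "gpe": gpe, "stn": stn, "gpi": gpi, "thalamus": thalamus}

normal = basal_ganglia_rates(cortex=1.0, dopamine=0.5)
depleted = basal_ganglia_rates(cortex=1.0, dopamine=0.1)
stopped = basal_ganglia_rates(cortex=1.0, dopamine=0.5, hyperdirect=1.0)
```

With these made-up parameters the thalamus is fully disinhibited at the nominal dopamine level, only partially released when dopamine is reduced, and silenced again when the hyperdirect input drives the STN.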

In 1995, Contreras-Vidal and Stelmach proposed a population-based model of basal ganglia-thalamocortical interactions during normal and Parkinsonian movements (Contreras-Vidal and Stelmach 1995). The model combines the VITE-WRITE model (Bullock et al. 1993) with a BG model including the direct, indirect, and hyperdirect pathways, even if the term hyperdirect was not used. The model showed that the dopamine reduction caused by nigrostriatal pathway damage, as in the case of Parkinson’s disease, produces an imbalance between the outputs of the direct and indirect pathways. The simulations suggested that basal ganglia play a role in sequencing motor programs and in modulating the dynamics of the trajectory generated by the other networks.

In 2008, Moroney et al. extended the previous model to investigate the possible causes of slower movements in Parkinson’s disease patients when they perform voluntary movements (Moroney et al. 2008). In particular, they examined the interaction between two separate and competing BG-thalamocortical circuits, each of which controlled one of the muscles of a single-joint elbow performing flexion/extension movements. The model simulations showed that the hyperdirect pathway has the role of suppressing unwanted movements that could affect the planned action. Moreover, the simulations revealed that loss of segregation between the two BG-thalamocortical circuits, which could occur as a consequence of dopamine deficiency, contributes to bradykinesia (slowness of movement) because the cortical inputs activate not only the desired striatal neurons, but also other neurons in other motor pathways and, as a consequence, the ability to suppress the unwanted movement among competing motor programs is reduced. Finally, the model confirmed that depletion of dopamine produces smaller and slower movements because it causes an insufficient reinforcement of cortical input.

In Cutsuridis and Perantonis (2006), and Cutsuridis (2011) the authors proposed a basal ganglia-corticospinal model that was an extension and revision of the VITE-FLETE model presented in Bullock and Grossberg (1989). The population-based model was made up of three modules: the basal ganglia-thalamic module, the cortical module, and the spinal-muscular module. The model was adopted for studying the effects of dopamine depletion on the output of the three modules and it suggested that the observed slowness in Parkinson’s disease bradykinesia is due to dopamine reduction in all three modules and not only in the basal ganglia.

In 1998, Bischoff implemented a neural network model of the basal ganglia and cortical regions involved in arm movements (prefrontal cortex, supplementary motor area, motor cortex, thalamus) (Bischoff 1998). The BG model was realized by hypothesizing that the role of the direct pathway is to represent the next sensory state and provide it to the cortex, while the indirect pathway inhibits competing movements. The activation of these two pathways is balanced by dopamine, whose loss reduces cortical activity by increasing the inhibition of basal ganglia projections to the thalamus. The model was used for simulating elbow flexion and extension, the reciprocal aiming task, and a sequential movement task. Under Parkinsonian conditions, the model was able to simulate bradykinesia and abnormal pauses in sequential movements.

Magdoom et al. proposed a simplified model of the motor cortex and basal ganglia to investigate how these two brain areas interact during the execution of reaching movements of a 2-link arm (Magdoom et al. 2011). The motor cortex was modeled with a perceptron, and the striatum and corticostriatal connections were not explicitly represented. This model belongs to the class of reinforcement learning BG models, with the dopamine level related to the error between the actual and the desired arm position. BG had two roles in this model: one was the real-time correction of the output of the motor cortex on the basis of the dopamine level, and the other was to train the cortex using this corrective signal. In this model, as in other models, high values of dopamine (Go regime) activate the direct pathway with the effect that the arm does not change its movement direction, and low dopamine values (NoGo regime) activate the indirect pathway with the effect that the arm tends to move in the direction opposite to the current one. Unlike other models, this model included an Explore regime, which corresponds to an intermediate range of dopamine that activates the indirect pathway in such a way that the movement direction of the arm is updated randomly. The simulations showed that the motor cortex became gradually independent of the corrective action of BG as the learning process proceeded, with a reduction in the variability of reaching trajectories. Moreover, the simulations confirmed that Parkinson’s motor symptoms increased with a reduction of the time spent in the Go regime.
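
The three dopamine regimes described above can be illustrated with a short sketch. It is not the implementation of Magdoom et al. (2011): the function name, the thresholds `low` and `high`, and the representation of the movement direction as a single angle are assumptions made only for illustration.

```python
import math
import random


def update_direction(previous_direction, dopamine, low=-0.1, high=0.1, rng=random):
    """Return the next movement direction (radians) from the dopamine level.

    The thresholds `low` and `high` are hypothetical; they only separate
    the three regimes qualitatively.
    """
    if dopamine > high:
        # Go regime: keep moving in the current direction.
        return previous_direction
    if dopamine < low:
        # NoGo regime: move in the direction opposite to the current one.
        return (previous_direction + math.pi) % (2 * math.pi)
    # Explore regime: pick a new direction at random.
    return rng.uniform(0.0, 2 * math.pi)
```

Passing a seeded `random.Random` instance as `rng` makes the Explore branch reproducible, which is convenient when comparing simulated trajectories.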

Muralidharan et al. (2018) presented an extended version of the simple lumped model proposed in Magdoom et al. (2011). Their model was made up of a sensory-motor loop and a cortico-basal ganglia loop and it was used for investigating the neural mechanisms responsible for reaching movements in normal and in Parkinson’s disease conditions. The model includes the spinal cord, which was modeled with one layer of neurons, the proprioceptive cortex and the prefrontal cortex, which were both modeled as self-organizing maps (SOM), the motor cortex, which was modeled by combining a continuous attractor neural network and SOM, and the basal ganglia. The BG module includes the striatum, the GPi and GPe, the STN, and the thalamus. The striatum was not modeled as a network per se but as an activation function that modulates the cortical input. The simulations confirmed that BG and motor cortex contribute differently as the learning progresses and, moreover, it was shown that the dynamics of the STN-GPe loop conditioned the BG output. In particular, synchronized oscillations in STN-GPe resulted in tremors and rigidity.

Narayanamurthy et al. (2019) used the model described in Muralidharan et al. (2018) to perform both unimanual and bimanual reaching. Two copies of the model of Muralidharan et al., each of which controlled one of the two arm models, were coupled at the level of the respective motor cortices to simulate bimanual reaching. The final model was used to develop and test rehabilitation strategies for stroke.

Mannella and Baldassarre (2015) evaluated if cortico-striato-nigro-thalamo-cortical loops, which are re-entrant loops between the basal ganglia and cortex, determine the selection of motor repertoires for action execution. Their model included a dynamical reservoir, which was implemented with an echo state network (Lukoševičius and Jaeger 2009), reproducing the dynamics of a cortical module while the basal ganglia module was implemented similarly to the model proposed in Gurney et al. (2001), which reinterpreted the functional anatomy of the basal ganglia into a "selection" and a "control" pathway: the former is devoted to selecting actions that are then modulated by the latter. The model was able to explain how, given the same sensory contextual information and different information about the task to be executed, a neural population in the motor cortex can be recruited to generate different movements. The model was evaluated as the controller of a 3-DoF arm and a 20-DoF hand, in three different periodic behaviors (writing a square, a sideways “8” shape, and a moon-like shape).
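
To make the notion of a dynamical reservoir more concrete, the following is a minimal sketch of a leaky echo state network update in pure Python. It is not the network used by Mannella and Baldassarre (2015): the sizes, the leak rate, and the weight scaling are hypothetical. The recurrent weights are rescaled so that the largest absolute row sum is below 1, a conservative sufficient condition for the state to forget its initial value.

```python
import math
import random


def make_reservoir(n_units, n_inputs, norm=0.9, seed=0):
    """Create random input and recurrent weights (hypothetical sizes and scaling)."""
    rng = random.Random(seed)
    w_in = [[rng.uniform(-1, 1) for _ in range(n_inputs)] for _ in range(n_units)]
    w = [[rng.uniform(-1, 1) for _ in range(n_units)] for _ in range(n_units)]
    # Rescale so that the largest absolute row sum equals `norm` (< 1).
    max_row = max(sum(abs(v) for v in row) for row in w)
    w = [[v * norm / max_row for v in row] for row in w]
    return w_in, w


def step(state, u, w_in, w, leak=0.5):
    """One leaky-integrator reservoir update: x <- (1 - a) x + a tanh(W_in u + W x)."""
    new_state = []
    for i in range(len(state)):
        drive = sum(w_in[i][j] * u[j] for j in range(len(u)))
        drive += sum(w[i][k] * state[k] for k in range(len(state)))
        new_state.append((1 - leak) * state[i] + leak * math.tanh(drive))
    return new_state
```

In an echo state network only a linear readout of `state` is trained, while `w_in` and `w` stay fixed; the readout is omitted here to keep the sketch short.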

Motor adaptation is a trial-by-trial process during which the sensorimotor system learns how to counteract the perturbations caused by the environment in order to execute the planned movement. This learning process is based on the visual sensing of the error between the planned and the actual movement. It has been suggested that the cerebellum implements this error-based process by updating the internal model of the body (Izawa and Shadmehr 2011). When visual feedback is absent, it has been shown that motor adaptation is still possible if reward feedback is available. It has been suggested that reward-based motor adaptation is implemented by the basal ganglia (Shmuelof et al. 2012).
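
Trial-by-trial, error-based adaptation is often summarized with a single-rate state-space model, in which the compensation on the next trial is a retained fraction of the current one plus a fraction of the observed error. The sketch below illustrates that generic model; it is not taken from the studies cited above, and the `retention` and `learning_rate` values are hypothetical.

```python
def simulate_adaptation(perturbation, n_trials, retention=0.99, learning_rate=0.2):
    """Return the error observed on each trial under a constant perturbation.

    Update rule (generic single-rate model, assumed here for illustration):
        error_n          = perturbation - compensation_n
        compensation_n+1 = retention * compensation_n + learning_rate * error_n
    """
    compensation = 0.0
    errors = []
    for _ in range(n_trials):
        error = perturbation - compensation
        errors.append(error)
        compensation = retention * compensation + learning_rate * error
    return errors
```

With `retention` below 1 the error does not vanish but settles at `perturbation * (1 - retention) / (1 - retention + learning_rate)`, which follows from setting the compensation equal on consecutive trials.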

Kim et al. (2017) evaluated the role of BG in motor adaptation as well as in action selection. Their model included the cortex, the basal ganglia, the spinal cord, and a 2-DoF virtual arm with six muscles. The cortex model included the prefrontal cortex, whose neurons represent the sensory stimuli, the primary motor cortex, whose firing activity commands the basic reaching actions by properly activating the spinal cord, and the premotor cortex, whose neurons modulate the activity of the primary motor cortex. Both the prefrontal and the premotor cortices projected to the BG, which in turn projected to the premotor cortex. BG created associations between stimuli and actions by means of a reward-based learning mechanism. Unlike classical models of action selection, which result in the activation of the single neuron in the premotor cortex corresponding to the selected action (Frank 2005), the authors hypothesized that the firing of premotor cortex neurons defines the relative contribution of different basic actions to the final motor response. Moreover, the authors hypothesized a lateral inhibition mechanism in the premotor cortex in order to guarantee the suppression of undesired competing actions. The simulations showed that BG potentiated the direct connections between prefrontal and premotor neurons, and at the end of learning the sensory stimulus triggered the corresponding action without the BG intervention. Moreover, model simulations highlighted two important aspects: the integrity of BG is not critical at the end of learning, but the indirect pathway is important for reversal learning because it provides the mechanism for forgetting an association between a stimulus and an action.

Todorov et al. (2019) investigated the role of BG and CB in motor adaptation by extending the BG model proposed in Kim et al. (2017) with a neural network representing the cerebellum, which received input from the cortex and projected to the spinal cord. The study shows that the learning mechanisms operating in the cerebellum and in the basal ganglia cannot operate simultaneously during motor adaptation and the authors hypothesize the existence of a circuit that controls learning rates in both BG and CB and guarantees the switching between the two learning mechanisms.

4.3 Models of the motor cortex

The cerebral cortex is organized into six horizontal layers and into radial cortical columns and minicolumns. Each layer is characterized by neuron types, input sources, and output destinations. There is little knowledge about the local connections among neurons from different layers and, as a consequence, models of motor cortex are less detailed than models of BG and CB from a physiological and anatomical point of view (Loeb and Tsianos 2015). In general, the cerebral cortex is described as made up of sensory, motor, and association areas, each of which includes different regions. Figure 4 shows the cerebral cortex’s areas involved in the processing of motor programs and their main connections with other brain structures.

Fig. 4

Block diagram of the cerebral cortex’s areas involved in the processing of motor programs and their main connections with other brain structures (Feher 2012). The prefrontal cortex forms a working memory that maintains information necessary to guide the execution of goal-directed tasks. The somatosensory cortex receives proprioceptive information from the spinal cord via the brainstem and the thalamus. The sensory association cortex integrates sensory information from different senses and extracts the information relevant to the motor task. The premotor cortex and the supplementary motor cortex are involved in the formation of a coordinated motor plan, i.e., the sequence of motor commands that need to be executed to perform the desired behavior. The primary motor cortex triggers the movement by exciting the motor neurons through tracts that descend the spinal cord

Two main theories explain how the neural activity of the motor cortex relates to movement: the representational perspective and the dynamical systems perspective (Wang et al. 2022; Shenoy et al. 2013). The representational perspective has concentrated on relating the firing of motor cortex neurons to the movement parameters, while the dynamical systems perspective has focused on how the activity of populations of neurons determines the output of the motor cortex.

Most of the neurocomputational models linking motor cortex models to spinal cord and muscle models adhered to the representational perspective and did not include any recurrent connections, which are the basis of a dynamical system (Haggie et al. 2023; Shenoy et al. 2013).

Burnod et al. (1992) proposed a neural network model that reproduced the modular organization of the cerebral cortex in cortical columns. The network learned how to transform the visual information about a target point into the motor commands required to move the arm toward it. The network was made up of two maps, one representing the neurons in premotor and motor areas, the other representing the neurons in parietal areas. Premotor and motor areas received inputs from the cortical columns in the parietal areas, which combined three sources of information: visual input of the target position, somatic input of the arm position, and efferent copy of the motor command issued by motor areas toward the spinal cord. The model explained neuronal tuning properties which are observed in the premotor and motor cortex, both at the individual cell level and at the population level.

Bullock and Grossberg (1988) proposed the VITE model, a neural network that models motor cortical operations performed during reaching arm movements. The model computes a difference vector between the current and the desired positions that codes information about the direction and the magnitude of the desired movement. In Bullock et al. (1998) the VITE model was extended to generate movements with variable speeds and forces. Simulated neurons showed delay period activation, kinematic and kinetic sensitivities, and latency of activity onset similar to those of real neurons in the primary motor cortex and the parietal cortex. Moreover, the simulations showed that psychophysical properties of goal-directed movements, such as bell-shaped velocity profiles and the speed-accuracy trade-off, were reproduced by the model.
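
The difference-vector idea can be sketched with a one-dimensional simulation. The code below assumes the commonly reported form of the VITE dynamics, in which a difference vector V tracks the gap between the target T and the present position P, and P integrates the rectified V gated by a GO signal. The parameter values, the linearly growing GO signal, and the forward-Euler integration are illustrative assumptions and do not reproduce the original simulations.

```python
def simulate_vite(target, p0=0.0, alpha=100.0, go_slope=20.0, dt=0.001, duration=1.2):
    """Integrate a 1-D VITE-like circuit with forward Euler.

    Assumed dynamics:
        dV/dt = alpha * (-V + T - P)
        dP/dt = G(t) * max(V, 0)
    with a hypothetical GO signal G(t) = go_slope * t.
    Returns the position and velocity traces.
    """
    v = 0.0
    p = p0
    positions, velocities = [], []
    for k in range(int(round(duration / dt))):
        go = go_slope * (k * dt)              # GO signal grows during the movement
        dv = alpha * (-v + target - p)        # difference vector tracks T - P
        dp = go * max(v, 0.0)                 # rectified, gated outflow command
        v += dt * dv
        p += dt * dp
        positions.append(p)
        velocities.append(dp)
    return positions, velocities
```

Because the position only integrates the rectified difference vector, the simulated trajectory never moves away from the target, and the growing GO signal produces a velocity trace with a single interior peak.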

Lukashin et al. (1996) simulated the neural mechanisms behind the integration of motor cortex commands by the spinal cord and their translation into muscle force. The neural network was made up of three layers: supraspinal populations, spinal cord interneurons, and motoneuronal pools. The network controlled a model of the arm consisting of six muscles and two joints. The motor cortex was organized into two populations, one encoding the direction and magnitude of dynamic force and the other encoding postural force. Each cortical population consisted of eight pools of neurons, each one encoding one of the eight preferred directions defined with a cosine-like tuning function. Moreover, the magnitude of the encoded force was proportional to the level of the output signal of cortical pools. The spinal cord was modeled with four pools of interneurons and six pools of motoneurons, each one controlling a different muscle. The model simulations suggested that supraspinal signals are correctly transformed into the desired force if the strength of synaptic connections between supraspinal populations and spinal interneurons is correlated to the directional preference of connected neurons. Finally, simulations showed that the force generated by the activity of both cortical populations is nearly equal to the vectorial summation of the forces generated by each population activated independently.
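
Cosine-like directional tuning with eight pools can be illustrated with a small encoding and decoding sketch. It is not the network of Lukashin et al. (1996): the evenly spaced preferred directions, the `baseline` activity, and the population-vector decoder are assumptions chosen to show why the forces encoded by two such populations add vectorially.

```python
import math

N_POOLS = 8
# Hypothetical preferred directions, evenly spaced over the circle.
PREFERRED = [2 * math.pi * i / N_POOLS for i in range(N_POOLS)]


def encode_force(fx, fy, baseline=1.0):
    """Cosine tuning: each pool's activity is baseline + |F| * cos(angle - preferred).

    Activities stay non-negative as long as |F| does not exceed `baseline`.
    """
    magnitude = math.hypot(fx, fy)
    angle = math.atan2(fy, fx)
    return [baseline + magnitude * math.cos(angle - pd) for pd in PREFERRED]


def decode_force(activities):
    """Population vector: sum of preferred-direction unit vectors weighted by activity."""
    scale = 2.0 / N_POOLS
    fx = scale * sum(a * math.cos(pd) for a, pd in zip(activities, PREFERRED))
    fy = scale * sum(a * math.sin(pd) for a, pd in zip(activities, PREFERRED))
    return fx, fy
```

Since the decoder is linear and the baseline terms cancel over evenly spaced directions, decoding the summed activities of two populations returns the vector sum of the two encoded forces.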

Chen and Reggia (1996) proposed a self-organizing model that simulated map formation both in the primary proprioceptive cortex and in the primary motor cortex when a 2-link arm is controlled. The model includes four layers of neurons organized in a closed loop: motor cortex, proprioceptive cortex, lower motor neurons, and neurons sensing the length and tension of the muscles. Each neuron of the cortical layers represents a cortical column. The model showed that multiple feature maps spontaneously emerge from initially random connections during unsupervised learning. The spontaneous formation of cortical maps with clusters of neurons driving different muscles is the result of the adoption of the Hebbian learning rule for updating the synaptic strength and the implementation of a “Mexican hat”-type lateral peristimulus inhibition. The same model was used in Varier et al. (2011) to examine the effects of focal and distributed injuries (lesions of the model) at various stages of development (cortical map formation). Both studies show that cortical maps are not hard-wired but are the result of learning.
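
The two ingredients named above, a Hebbian update and a Mexican-hat lateral profile, can be sketched generically as follows. This is not the implementation of Chen and Reggia (1996): the difference-of-Gaussians kernel, its parameters, and the weight normalization are assumptions used only to show the shape of the interaction and the direction of the weight change.

```python
import math


def mexican_hat(distance, a_exc=1.0, s_exc=1.0, a_inh=0.5, s_inh=3.0):
    """Difference of Gaussians: short-range excitation, longer-range inhibition.

    All amplitudes and widths are hypothetical.
    """
    excite = a_exc * math.exp(-distance ** 2 / (2 * s_exc ** 2))
    inhibit = a_inh * math.exp(-distance ** 2 / (2 * s_inh ** 2))
    return excite - inhibit


def hebbian_update(weights, pre, post, rate=0.1):
    """Hebbian rule with row normalization (normalization is an assumption).

    weights[i][j] is the synapse from presynaptic unit j to postsynaptic unit i.
    """
    updated = []
    for i, row in enumerate(weights):
        new_row = [w + rate * post[i] * pre[j] for j, w in enumerate(row)]
        total = sum(new_row)
        updated.append([w / total for w in new_row] if total > 0 else new_row)
    return updated
```

The normalization step keeps the total input weight of each unit constant, so that strengthening co-active synapses implicitly weakens the others.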

Cisek (2006) proposed a computational model of some regions of the cerebral cortex to address the questions of how different potential actions are planned and how the brain decides among them. In the model, populations of cells in the frontoparietal cortex represent the multiple potential actions while populations of cells in the prefrontal cortex bias the selection of one of the actions. Each population was organized as a layer of 90 mean-rate leaky-integrator neurons tuned to different spatial directions representing the potential actions. The model also included cells of the posterior parietal cortex, which receive the external inputs representing the potential target directions, and cells of the primary motor cortex, which activate the selected action. A second external input modulated the strength of the premotor cortex projection into the primary motor cortex. The model simulations suggested that planning and decision-making can be performed in parallel by a unified mechanism, namely the lateral interactions between cells. In particular, there is mutual excitation between cells with similar preferred directions and inhibition from cells with different preferred directions.

Neymotin et al. (2013) proposed an actor-critic model to investigate how the mapping between proprioceptive state and motor commands is learned. The actor was made up of a spiking neural network modeling the proprioceptive, sensory, and motor areas of the sensorimotor cortex. The proprioceptive area encoded the lengths of four muscles and, in particular, each neuron fired for a narrow range of muscle stretches of one of the muscles. The sensory area encoded the state of the whole arm by integrating spikes delivered by the proprioceptive area. Finally, the sensory area projected to the motor area, which was made up of distinct populations of neurons, each stimulating one of the muscles. The critic provided rewards and punishments to the actor by evaluating the variation of the hand-to-target visual error. In this way, the critic modulated the weights of synaptic connections among neurons by means of reinforcement learning. The authors observed that learning modified the network dynamics by enhancing neuronal synchrony and information flow between neuronal populations. They noticed that, at the end of the learning phase, proprioceptive information led to point-to-point movements allowing the virtual arm to successfully reach a target from any starting point.

Verduzco-Flores and De Schutter (2022) adopted a planar arm with six muscles controlled by a neural model including the spinal cord and the sensory and motor cortex to investigate how, during childhood, primates gain the skill of using these neuronal networks to execute movements. Their hypothesis was that movements are executed to pursue the internal control of homeostatic variables, such as regulating nutrient concentration by finding food, and therefore the feedback corticospinal loop is a negative control system where the motor cortex and the spinal cord cooperate to activate the muscles and to reduce an error in higher-level variables. The authors showed that the existence of the long-loop reflex, which starts when proprioceptive signals are sensed and forwarded to the cortex and ends at the spinal cord when alpha motoneurons are activated, explains the emergence of some phenomena in motor control, such as the directional tuning of neural populations in the motor cortex and the linear combination of convergent force fields in the spinal cord to generate movement. Moreover, the authors showed that this control loop arises if learning happens in the connections from sensory to motor cortex according to the differential Hebbian plasticity rules introduced in Verduzco-Flores et al. (2022).

Neurocomputational models that adhere to the dynamical system perspective use recurrent neural networks (RNNs) to model the complex dynamics of the motor cortex and generate muscular activity.

Sussillo et al. (2015) showed that muscle activities recorded from a primate during a reaching task can be reproduced by RNNs. In particular, the natural dynamics of the model resemble the ones observed in the primates’ primary motor cortex when the RNN is trained with heavy regularization.

Michaels et al. (2020) fed a modular RNN with the visual features extracted by a convolutional neural network to control a musculoskeletal human arm. The model was able to explain the brain neural activity when the authors introduced sparse connectivity among the three modules of the RNN.

Chen and Qiao (2020) used an RNN to transform the x and y coordinates of a target into muscle excitations and trained the network with a reward-modulated multitask learning method. In the experiments, the RNN controlled the movement of a 2-DoF musculoskeletal system with nine muscles involved in the execution of a center-out reaching task. The model had a consistent population response of hidden neurons in the RNN and generated muscle excitations compatible with human-like muscle synergy activation.

4.4 Models of the brainstem

Brainstem computational simulation still seems understudied (Mikolajewski and Duch 2018) as a consequence of its complex and irregular structures, and experimental difficulties in acquiring electrophysiological data on this brain area (Duch and Mikołajewski 2020).

Computational models of the brainstem have been proposed for investigating its role in controlling locomotor frequency and gait (Ausborn et al. 2019), and saccadic and smooth pursuit eye movements (Rahafrooz et al. 2008).

To the best of our knowledge, a model of the brainstem involved in the control of reaching movements has not yet been proposed.

4.5 Models of the spinal cord

The spinal cord is formed by complex networks of motoneurons and interneurons that receive inputs from sensory afferents and descending tracts. Alpha motoneurons (\(\alpha\)-MN) integrate central and peripheral inputs in order to activate muscle fibers, while gamma motoneurons (\(\gamma\)-MN) regulate spindles’ sensitivity to muscle stretch. The spinal cord hosts different types of interneurons with different functions in the regulation of muscle activation: Ia inhibitory interneurons (Ia-IN) control the reciprocal inhibition between antagonist muscles; Ib inhibitory interneurons (Ib-IN) receive afferents from Golgi tendon organs of the homonymous muscle and inhibit both synergistic and antagonistic muscles; Renshaw cells (Ren) modulate the activation of antagonist muscles; and propriospinal interneurons (PN) connect multiple spinal cord segments and are involved in the execution of complex motor reflexes (Pierrot-Deseilligny and Burke 2012). Figure 5 shows the main connections of the spinal cord with the other brain areas.

Fig. 5

Motor commands travel down the corticospinal tract to synapse on interneurons and motor neurons in the spinal cord. The neurons located in the rostral region of the primary motor cortex make monosynaptic connections with interneurons in the intermediate zone of the spinal cord, while the neurons in the caudal part have direct connections with alpha motor neurons (Rathelot and Strick 2009). The firing activity of alpha motor neurons makes muscles contract. Gamma motor neurons regulate spindles’ sensitivity to muscle stretch. The ascending projections from the PN interneurons to the Lateral reticular nucleus may provide the cerebellum with an efferent copy of the descending command for reaching (Alstermark and Isa 2012). The axons of the dorsal root ganglion are the afferent sensory nerve fibers carrying information from the body to the brainstem and the cerebellum. The spindles provide information about the muscle length and the rate of muscle length change. Golgi tendon organs (GTOs) are sensory receptors located at muscle-tendon junctions that convey information about the force produced by the muscle during its contraction. BS: Bulbospinal tract, CST: Corticospinal tract

Spinal cord models differ by the type of interneurons they include, and the adopted neuronal model. Moreover, these models have always been simulated in a closed loop with models of the musculoskeletal system for investigating different behaviors. The musculoskeletal system is also modeled in many ways, with different degrees of freedom, different muscles, and muscle models.

At the beginning of the nineties, Bullock and colleagues proposed FLETE, one of the first models of spinal circuitry (Bullock and Contreras-Vidal 1991; Bullock et al. 1993). The model was developed to verify the hypothesis that muscle length, which determines joint angle, and muscle co-contractive tension must be controlled independently of one another. FLETE included \(\alpha\)-MN, \(\gamma\)-MN, Ia-IN, and Ren cells and was evaluated in the control of antagonistic muscle pairs. Simulated neural activity was analyzed to clarify the behavioral function of the different types of neurons in the spinal cord. Later, van Heijst et al. (1998) showed that the independent control of the length and tension of muscles, which allows higher centers to separately control joint angle and stiffness, is the result of self-organization of spinal circuitry facilitated by Hebbian learning operating during spontaneous activity in the spinal cord.

Stienen et al. (2007) investigated the mechanisms behind the modulation of reflexive feedback strength that was observed in human postural control of healthy subjects and complex regional pain syndrome patients. The spinal cord model consisted of a large population of 2,298 neurons grouped in six antagonistic population pairs (\(\alpha\)-MN, Ia-IN, Ib-IN, inhibitory and excitatory interneurons, and Renshaw cells) and it was an improvement of the model presented in Bashor (1998). The musculoskeletal model was basic, including only a pair of antagonistic muscles moving a 1-DoF joint, and this has been indicated as a limitation of the study (Allegra et al. 2020). Later, Schuurmans et al. (2011) exploited the model proposed by Stienen et al. to investigate the effects of neural and sensory mechanisms on the modulation of reflex gains.

In the last ten years, researchers have made use of more detailed spinal cord models for driving musculoskeletal models that are more complex and realistic than an antagonistic muscle pair. One of the main findings of simulation studies is that the learning of new motor tasks is facilitated and sped up by the activity of the mammalian spinal cord.

Loeb et al. proposed a fairly complete spinal cord model that drove a four-muscle wrist joint in Raphael et al. (2010) and a 2-DoF elbow-shoulder system with six muscles in Tsianos et al. (2011, 2014). In Raphael et al. (2010), the authors investigated how the execution of rapid reaching movements, the adaptation to viscous curl force fields, and the stabilization of the joint’s responses to force perturbations are controlled by the supraspinal centres through the spinal networks. The study showed that the spinal cord, despite its complexity, considerably simplifies the learning and the execution of complex movements. In Tsianos et al. (2011), it was shown that an initial co-contraction of muscles guarantees that the spinal cord model stabilizes the arm when an external force is applied. Finally, in Tsianos et al. (2014) the authors showed that muscle dynamics of reaching movements were not specified by the cortical motor commands, whose role is recruiting the appropriate spinal circuits, but were driven by the proprioceptive feedback generated during the movement. Moreover, model simulations revealed that the interpolation of descending commands corresponding to reaching movements toward two different targets allowed the generation of a movement toward an intermediate target.

Li et al. (2014, 2015) investigated how activities of \(\alpha\)-MN and \(\gamma\)-MN are coordinated by the central nervous system, a question that has not yet been clarified by in-vivo experiments (Prochazka and Ellaway 2012). The authors hypothesized that \(\alpha\)-MN and \(\gamma\)-MN receive two sets of commands from the supraspinal centres: static commands, which control posture maintenance, and dynamic commands, which control movement dynamics. Simulation results validated the hypothesis that \(\alpha\) and \(\gamma\) motoneurons are controlled in parallel (Li et al. 2014). The same authors evaluated the role of PN interneurons in the transmission of cortical oscillatory signals to peripheral muscles. Simulations revealed that Parkinsonian tremor is caused by oscillatory signals originating in the motor cortex and delivered to neuromuscular junctions through propriospinal neurons (Hao et al. 2013). Finally, in Qu et al. (2017) the authors showed that the same propriospinal neuronal network plays a role in the gating of the central oscillation signal and the generation of essential tremor.

Parziale et al. (2020) exploited a model of the spinal cord to understand the neural basis of the trade-off between speed and accuracy, also known as Fitts’ law. The spinal cord model used in the experiments was inspired by the ones proposed in Tsianos et al. (2014) and Raphael et al. (2010), but it included some significant differences. First, each \(\alpha\)-MN received a monosynaptic input from corticospinal neurons, consistent with the findings that the primary motor cortex makes monosynaptic connections with motoneurons innervating shoulder, elbow, and finger muscles (Rathelot and Strick 2009). Second, the stretch reflex latency and the Golgi tendon reflex latency were set to realistic values. The spinal model was simulated with a 1-DoF arm driven by two synergistic muscles and one antagonist muscle. Model simulations revealed that the speed-accuracy trade-off is a behavioral trait and not an intrinsic property of the neuromuscular system. Such a trade-off is a consequence of the strategy adopted by the supraspinal centres for commanding movements faster than previously learned movements. In particular, the regulation of the firing activity along the monosynaptic connection between cortical cells and \(\alpha\)-MN changed the velocity of a previously learned movement.
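
For reference, Fitts’ law relates movement time to an index of difficulty that grows with the target distance and shrinks with the target width. The sketch below uses the original formulation of the index, log2(2D/W); the intercept `a` and slope `b` are hypothetical values, since in practice they are fitted to experimental data.

```python
import math


def index_of_difficulty(distance, width):
    """Fitts' index of difficulty in bits: log2(2 * D / W)."""
    return math.log2(2.0 * distance / width)


def movement_time(distance, width, a=0.1, b=0.15):
    """Predicted movement time MT = a + b * ID (a and b are hypothetical, in seconds)."""
    return a + b * index_of_difficulty(distance, width)
```

Halving the target width at a fixed distance adds one bit to the index of difficulty and therefore `b` seconds to the predicted movement time.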

Human movements are characterized by some invariants, such as the bell-shaped velocity profile of the end-effector, which are not affected by the interaction torques that arise at one joint due to the motion of other joints. It means that the intersegmental dynamics are compensated during the movement planning at the central level or during the execution by the spinal cord circuitry. Buhrmann et al. exploited a model of spinal circuits to show that the coordination of dynamic muscle forces is managed by the spinal feedback mechanisms and therefore multijoint movements can be controlled without internal models of intersegmental dynamics (Buhrmann and Di Paolo 2014).
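
A bell-shaped velocity profile is frequently described with the minimum-jerk polynomial, which is used here only as an illustrative kinematic description and is not part of the spinal model discussed above. The function name and arguments are hypothetical.

```python
def minimum_jerk(x0, xf, duration, t):
    """Return (position, velocity) at time t for a minimum-jerk point-to-point movement."""
    tau = min(max(t / duration, 0.0), 1.0)   # normalized time clipped to [0, 1]
    d = xf - x0
    position = x0 + d * (10 * tau ** 3 - 15 * tau ** 4 + 6 * tau ** 5)
    velocity = d / duration * (30 * tau ** 2 - 60 * tau ** 3 + 30 * tau ** 4)
    return position, velocity
```

The velocity is zero at both ends, symmetric about the midpoint, and peaks at 1.875 times the average speed `(xf - x0) / duration`.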

Teka et al. (2017) investigated how the activity of cortical neurons correlates with movement parameters by exploiting a model that included six cortical neuronal populations that project to spinal neural circuits. The spinal circuits modulated the contraction of six muscles actuating the reaching movements of a 2-DoF arm. The activity of the six supraspinal neuronal pools was calculated by solving an inverse problem. In particular, the model computed the muscle forces and then the corresponding supraspinal inputs required for performing each desired trajectory with a defined velocity profile. The experimental results revealed that the activities of motor cortex and spinal neurons, the Ia afferent feedback, and the force-length and force-velocity components of muscle force depend on the movement direction, and that directional tuning increases moving from the muscles to the cortex. The authors hypothesized that the directional modulation of the primary motor cortex firing rate is a consequence of direction-specific dynamics of muscle lengths.

Sreenivasa et al. (2015) developed a neuromuscular model consisting of a spiking neural network controlling the contraction of two antagonist muscles that flex and extend the elbow. The neural network consisted of pools of motor neurons, sensory neurons, and interneurons whose sizes and relative distances were chosen to approximate their topological organization in the human spine. The model parameters were subject-specific and were identified from human experiments. The model mimicked the biceps stretch reflex and it allowed a quantitative evaluation of reflex behavior, muscle forces, and movement kinematics under varying conditions of external forces and co-contraction levels.

5 Neurorobotic applications

To understand and replicate how the brain perceives and interacts with the environment, it is necessary to identify the causal mechanisms linking neural activity, perception, and action. Ethical and technical reasons do not allow in-vivo experiments where the responses to stimuli are measured after manipulations of living brains in humans and animals. Computational models of the brain allow investigating causal mechanisms with in-silico experiments where manipulations and recordings are limited only by the level of detail and complexity used for representing the real system (Markram 2013).

Neurorobotics introduces a new experimental paradigm where the perception-cognition-action cycle is closed by embedding brain models in physical robots or simulated agents that interact with real or simulated environments, as shown in Fig. 6. The key idea of this approach is that the interaction with the environment cannot be ignored to reliably simulate the brain and to study how behavior arises from neural activity. This experimental paradigm allows testing and refining brain models iteratively, removing the approximation of simulating the brain without the body (Amunts et al. 2019). Neurorobotics benefits the understanding of the brain because robots’ states are fully observable, simulations are repeatable and the complexity of the neural model can be modulated as desired (Stratton et al. 2016).

Fig. 6 Neurorobotics is based on the interaction between an agent and the environment. An agent is made up of a neurocomputational model of the brain, which works as a controller, and a body. The body can be a physical or simulated robot, which could be an industrial, humanoid, or musculoskeletal robot. Likewise, the environment can be physical or simulated. The interaction between an agent and the environment happens according to the perception-cognition-action cycle. The agent perceives the environmental stimuli, which are captured by the body’s sensors and sent to the brain model, processes them, and then commands actions on the environment. The action changes the environment, which produces new stimuli, and the cycle continues.
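The perception-cognition-action cycle described above can be illustrated with a minimal sketch. Everything in it is a hypothetical stand-in chosen for brevity: the environment is a one-dimensional world with a target position, and the "brain model" is a simple proportional rule rather than a neural network. The sketch only shows how the loop is closed, not how any of the cited platforms implement it.

```python
from dataclasses import dataclass


@dataclass
class Environment:
    """Hypothetical 1-D world: the agent should reach `target`."""
    target: float = 1.0
    position: float = 0.0

    def stimulus(self) -> float:
        # What the body's sensors capture: signed distance to the target.
        return self.target - self.position

    def apply(self, action: float) -> None:
        # The action changes the environment, which produces new stimuli.
        self.position += action


class BrainModel:
    """Stand-in controller (a proportional rule, not a neural model)."""

    def __init__(self, gain: float = 0.5):
        self.gain = gain

    def decide(self, percept: float) -> float:
        return self.gain * percept


def run_cycle(brain: BrainModel, env: Environment, steps: int) -> list:
    """Close the loop for a fixed number of iterations."""
    trace = []
    for _ in range(steps):
        percept = env.stimulus()        # perception
        action = brain.decide(percept)  # cognition
        env.apply(action)               # action
        trace.append(env.position)
    return trace


trace = run_cycle(BrainModel(), Environment(), steps=20)
```

In a neurorobotic experiment the `BrainModel` placeholder would be replaced by a neural simulation and the `Environment` by a physical or simulated robot and world; the structure of the loop stays the same.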

The real brain is organized in many interconnected subsystems that interact with each other to elicit behavior. The researcher who aims to mimic such a complex system has to deal with the trade-off between the accuracy of the model and the time required for simulating it. The involvement of the brain model in a real-world task requires the simulation of the model in real-time, but current technology does not allow it when the model is too complex. If embedding the brain model in a physical robot is the goal, the model needs to be simplified until it can be simulated in real-time. Conversely, the embodiment has to be simulated when the accuracy of the model has to be preserved (Falotico et al. 2017).

The simulation of both the embodiment and the environment guarantees the replicability and reproducibility of the experimental results and therefore their credibility (Mulugeta et al. 2018). In fact, simulation offers a controllable and measurable experimental setup while the complexity of real environments is difficult to control or cannot be controlled at all. In the case of musculoskeletal robots, simulations simplify the measurement of some quantities, such as muscle forces, and therefore foster a quantitative explanation of how movement arises from the interaction of neural and musculoskeletal systems (Pandy 2001). Moreover, if one wants to investigate the influence of growth or aging on the development of cognitive skills, a simulated agent is a forced choice because a real robot cannot grow or age over time (Knoll and Walter 2019).

The Neurorobotics Platform (NRP) is a simulation platform (Falotico et al. 2017) developed in the context of the Human Brain Project (Amunts et al. 2019). The NRP allows the simulation of a complete neurorobotic system that acts within a simulated world. In particular, it includes NEST (Gewaltig and Diesmann 2007) to realize models of the brain with spiking neural networks, Gazebo (Koenig and Howard 2004) to define models of robots, and OpenSim (Delp et al. 2007) to design models of musculoskeletal systems.

Even though simulating a neurorobot offers all the advantages previously described, we have to remember that the ultimate purpose is to realize an artificial agent that shows human-like behavior, cooperates with humans, and could replace living systems in the experimental setup to verify or falsify new scientific hypotheses. Hence, computational models have to be exposed to a dynamic and rich sensory environment in order to be properly validated. Modeling the interactions between body and environment is complex (Pandy 2001), but embodying a brain model in a physical robot acting in a real environment can reveal how the agent manages the effects of unmodeled dynamics and how it works in a scenario that is not under the full control of the experimenter. Finally, as suggested in Stratton et al. (2016), observing the complex and unpredictable closed-loop interactions between a neurorobot and the real world facilitates a deeper understanding of how the brain processes information and controls behavior.

Given its crucial role in motor learning and adaptation, the cerebellum is the brain area that has been most frequently mimicked by artificial controllers. In fact, the most advanced realistic microcircuit models, i.e. models based on a detailed biophysical description of neurons and synapses, are the ones representing the cerebellum (D’Angelo et al. 2013). Most of the papers present robots equipped with cerebellar-like control architectures that do not model the connection with other brain areas. The first cerebellar-like systems for controlling multijoint arms were published in the 1980s. In particular, Kawato et al. controlled a robotic manipulator with a hierarchical neural network model that combined the feedback control by the primary motor cortex with feedforward control by the cerebellum (Kawato et al. 1987).

In order to investigate the neural basis of the human learning process, Casellato et al. (2014) analyzed the activity of a cerebellar spiking neural network embedded in a real robot performing three different sensorimotor tasks in the real world: a Pavlovian conditioning task, a vestibulo-ocular task, and a reaching task with perturbations. The experimental results showed that during the three tasks, the cerebellar network learned how to predictively tune its response to different input stimuli. The authors concluded that the system showed the main functions commonly assigned to the cerebellum, which are timing neural processes, sensory prediction, and motor learning.

Capolei et al. (2019) controlled the arm of the humanoid robot iCub with a control system that included a classical PID regulator, which was in a closed loop with the robotic plant, and a feed-forward bio-inspired cerebellar-like neural network controller. The latter was designed by reproducing the functionality and morphology of the cerebellar microcircuit as proposed in D’Angelo et al. (2016), but it received sensory inputs representing angular position and velocity directly from the encoders in the robotic arm and generated the corrective action in terms of corrective torque. The system was embedded in a virtual implementation of the robot in the NRP. Simulations showed that the cerebellar distributed synaptic plasticity allows the model to learn corrective torque commands generating fast and precise movements independently of external perturbations.
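The division of labor between a feedback regulator and a learned feed-forward corrective torque can be illustrated with a toy example. The sketch below is not the controller of Capolei et al.: the cerebellar network is replaced by a single adaptive torque term trained with a feedback-error-learning style rule, the plant is a hypothetical 1-DoF joint with unit inertia and an unknown constant load, and all gains are arbitrary illustrative values.

```python
def simulate(learning_rate, kp=20.0, kd=6.0, load=-2.0,
             q_ref=1.0, dt=0.001, steps=10_000):
    """Hold a 1-DoF joint at q_ref under an unknown constant load torque.

    A PD regulator provides the feedback torque. A scalar feed-forward
    torque `ff` (stand-in for the cerebellar-like module) is adapted using
    the feedback torque as its teaching signal.
    Returns the final position error and the learned feed-forward torque.
    """
    q, dq, ff = 0.0, 0.0, 0.0
    for _ in range(steps):
        u_fb = kp * (q_ref - q) - kd * dq   # feedback torque
        ff += learning_rate * u_fb * dt     # learn from the feedback torque
        ddq = u_fb + ff + load              # unit inertia: acceleration = net torque
        dq += ddq * dt
        q += dq * dt
    return q_ref - q, ff


err_feedback_only, _ = simulate(learning_rate=0.0)
err_with_learning, learned_torque = simulate(learning_rate=2.0)
```

Under these assumptions the feedback-only controller settles with a residual error of about `-load / kp`, while the adaptive term converges toward a torque that cancels the load, so the feedback torque (and the error) tends to zero. This mirrors, in a very reduced form, the idea that the learned module takes over the corrective effort from the feedback loop.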

Abadía et al. (2019) proposed a biologically plausible cerebellar-like spiking neural network controller for driving a 6-DoF industrial robot via torque commands in real-time. The controller was equipped with the spike-timing-dependent cerebellar plasticity mechanism, and it was able to self-adapt and learn from scratch to drive a given robot. The cerebellar network was divided into six microcomplexes, each of which controls one of the robot’s joints. Overall, the controller was made up of 62,000 leaky integrate-and-fire neurons and 360,000 synapses, and it was able to achieve both smooth movements and fast ballistic movements. The authors demonstrated in Abadía et al. (2021) that their cerebellar-like controller was robust to variable time delays in the transmission of sensory information and motor commands, just as the biological cerebellum is robust to the sensorimotor pathway delay.
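Several of the controllers reviewed in this section are built from leaky integrate-and-fire (LIF) neurons. As a reminder of what a single unit of this kind computes, the following sketch integrates the standard LIF membrane equation with a forward Euler scheme. The parameter values (membrane time constant, thresholds, resistance) are generic textbook-style assumptions and are not taken from any of the cited models.

```python
def simulate_lif(input_current, steps=1000, dt=1e-3, tau_m=20e-3,
                 v_rest=-65.0, v_reset=-65.0, v_thresh=-50.0, resistance=10.0):
    """Simulate one LIF neuron driven by a constant current.

    Units are assumed to be mV, ms-scale seconds, MOhm, and nA, so that
    resistance * input_current is a voltage in mV.
    Returns the list of spike times in seconds.
    """
    v = v_rest
    spike_times = []
    for step in range(steps):
        # Leak toward rest plus the drive from the input current.
        dv = (-(v - v_rest) + resistance * input_current) / tau_m
        v += dv * dt
        if v >= v_thresh:
            spike_times.append(step * dt)
            v = v_reset  # reset after the spike (no refractory period here)
    return spike_times


subthreshold = simulate_lif(1.0)   # drive too weak to reach threshold
moderate = simulate_lif(2.0)
strong = simulate_lif(3.0)
```

With these assumed values a current of 1 nA never brings the membrane potential to threshold, while larger currents produce regular spiking whose rate grows with the input, which is the basic rate-coding behavior that larger networks build on.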

As regards musculoskeletal robots, Richter et al. (2016) presented the proof of principle of a neural control system driving the robot in real-time. Their controller was based on the spiking cerebellar neural network with spike-timing-dependent plasticity proposed in Luque et al. (2011), which was able to regulate movements when different objects were manipulated.

Niu et al. (2020) proposed an artificial controller mimicking the human reflex to control prosthetic hands. The model was made up of 768 spiking neurons organized in six motoneuron pools, together with one Hill-type muscle with one muscle spindle. It received electromyographic signals recorded from wrist muscles as input and it was implemented on neuromorphic hardware, allowing the real-time control of a prosthetic hand. The prosthetic hand showed human-compliant grasping and it was able to move toward different destinations and at different velocities.

Dura-Bernal et al. (2015) made a physical robot arm reproduce in real time the same trajectories executed by a virtual 2-DoF musculoskeletal arm model controlled in a closed loop by a biomimetic cortical spiking model. The cortical model included three neural populations (proprioceptive, somatosensory, and motor) composed of several hundred spiking neurons. The experimental results revealed that increasing the realism of the musculoskeletal model allowed the generation of smoother trajectories, and the physical robot executed movements with low vibrations even at high angular velocities. In Dura-Bernal et al. (2016) the authors simulated lesions in the spiking neural network by either silencing neurons or eliminating synaptic connections. The activity of the cells that survived the lesions was used for creating an inverse model of the neural system. The inverse model allowed the prediction of the neural stimulation pattern required to restore the pre-lesion neural activity and then recover the original reaching trajectory. Finally, in Dura-Bernal et al. (2017), the authors implemented a more complex spiking neural network that also included a spiking model of the spinal cord, and they used the recordings from the premotor cortex of a macaque brain for modulating the activity of the virtual primary motor cortex in order to select the target the virtual arm had to reach.

Zahra et al. (2022) integrated a spiking neural network model of the cerebellum and a motor-cortex-like differential map to control the motion of a UR3 robot arm (Universal Robots). The differential map acted as an inverse model generating the motor commands in the joint space given the desired spatial velocity and the current joint angles of the robot. The cerebellum model acted as a forward model predicting the spatial velocity upon applying the motor command. Hence, it predicted and then anticipated the error by modulating the input to the differential map. The model controlled the robot in a center-out reaching task and it showed a fast learning process, being able to reduce the error in a certain direction in only a few repetitions.
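The role of a differential map, i.e. a mapping from the desired spatial velocity and the current joint angles to joint-space commands, can be illustrated with classical differential inverse kinematics. The sketch below is only an analogy: it uses the analytical Jacobian of a hypothetical 2-link planar arm instead of a learned spiking map, and the link lengths, gain, and starting posture are arbitrary assumptions.

```python
import math

L1, L2 = 0.3, 0.25  # hypothetical link lengths in metres


def forward(q1, q2):
    """End-effector position of a 2-link planar arm."""
    x = L1 * math.cos(q1) + L2 * math.cos(q1 + q2)
    y = L1 * math.sin(q1) + L2 * math.sin(q1 + q2)
    return x, y


def differential_map(q1, q2, vx, vy):
    """Joint velocities that realize the spatial velocity (vx, vy).

    Computed as J(q)^-1 v with the analytical 2x2 Jacobian of the arm.
    """
    s1, c1 = math.sin(q1), math.cos(q1)
    s12, c12 = math.sin(q1 + q2), math.cos(q1 + q2)
    j11, j12 = -L1 * s1 - L2 * s12, -L2 * s12
    j21, j22 = L1 * c1 + L2 * c12, L2 * c12
    det = j11 * j22 - j12 * j21
    if abs(det) < 1e-9:
        raise ValueError("arm is at a singular configuration")
    dq1 = (j22 * vx - j12 * vy) / det
    dq2 = (-j21 * vx + j11 * vy) / det
    return dq1, dq2


def reach(target, q=(0.3, 1.2), gain=2.0, dt=0.01, steps=500):
    """Drive the end effector toward `target` with a velocity command."""
    q1, q2 = q
    for _ in range(steps):
        x, y = forward(q1, q2)
        vx, vy = gain * (target[0] - x), gain * (target[1] - y)
        dq1, dq2 = differential_map(q1, q2, vx, vy)
        q1, q2 = q1 + dq1 * dt, q2 + dq2 * dt
    return forward(q1, q2)


final_x, final_y = reach((0.35, 0.25))
```

In the architecture described above, the analytical Jacobian inverse would be replaced by a map learned from the robot's own movements, and a forward model would adjust the velocity fed into it; the sketch only shows the input-output relation such a map has to approximate.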

Pimentel et al. (2023) proposed a first attempt to integrate a model of the basal ganglia-thalamus-cortex circuitry with a model of the cerebellum. In particular, the cerebellum, the basal ganglia, the cortex, and the thalamus models were made up of 120, 95, 19, and 29 leaky integrate-and-fire neurons, respectively. The model controlled the arm of the humanoid robot iCub to execute pronation and supination movements, both in simulation and in a real environment. Even though the model with both the networks of the cerebellum and the basal ganglia was able to control the robot as desired, the executed movements were not as accurate as the movements executed when the robot was controlled only by the model of the cerebellum. The authors highlighted two critical aspects of the integrated model that could explain that result: the model of the cerebellum and the model of the basal ganglia were not trained at the same time even though they were in a loop, and the numbers of neurons and of mossy and climbing fibers were limited.

Clearly, the greater the complexity of the brain model, i.e. the greater the number of interconnected brain areas, the greater the opportunities for scientists to understand the mechanisms inside the brain and to make new predictions, especially about brain disorders. The Izhikevich model (Izhikevich and Edelman 2008), the Human Brain Project (Markram et al. 2011), SyNAPSE (Preissl et al. 2012), and Spaun (Eliasmith et al. 2012) are large-scale brain models with at least one million neurons. Unlike Spaun, the other large-scale models do not relate the vast simulated neural activity to behavioral tasks (Eliasmith and Trujillo 2014).

Spaun is a large-scale model and a cognitive architecture composed of 2.5 million leaky integrate-and-fire neurons and 8 billion connections corresponding to 20 anatomical brain areas and organized in five subsystems, each of which is devoted to a particular high-level function. Spaun’s visual system processes static images while its motor system controls a virtual arm. The model can perform eight different tasks without changes in its architecture. The basal ganglia subsystem regulates the information flow between subsystems depending on the task to be performed. For example, Spaun is able to watch images of handwritten digits and reproduce their shape with its virtual arm. Despite its unique features, Spaun still represents an early step toward the building of brain-scale functional models: it adopts neuron and synapse models with limited biological realism (Duggins and Eliasmith 2022), its perceptual and conceptual representations are limited and it is unable to learn completely new tasks (Rasmussen and Eliasmith 2013).

The simulation of a large-scale model such as Spaun is very expensive in terms of time and energy, even though it has 40,000 times fewer neurons than the human brain. As reported in Stewart and Eliasmith (2014), the simulation of 1 s of behavior required 2.5 h of computing on a high-end workstation in 2012. So, the embodiment of complex models faces evident problems related to computational resources.
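The figures quoted above can be turned into a slowdown factor with a few lines of arithmetic. The sketch only restates the numbers given in the text (2.5 million neurons, a factor of 40,000, and 2.5 h of computing per simulated second); the one-minute task duration is a hypothetical example added for illustration.

```python
# Numbers taken from the text.
spaun_neurons = 2.5e6
ratio_to_human_brain = 40_000
wall_clock_seconds = 2.5 * 3600   # 2.5 h of computing
simulated_seconds = 1.0           # for 1 s of behavior

# Human neuron count implied by the stated ratio (1e11).
implied_human_neurons = spaun_neurons * ratio_to_human_brain

# How many times slower than real time the 2012 simulation ran (9000).
slowdown = wall_clock_seconds / simulated_seconds

# Hypothetical example: computing time for one minute of behavior, in hours.
hours_for_one_minute = 60.0 * slowdown / 3600.0   # 150 h
```

A slowdown of this size makes clear why a model of this scale could not be coupled to a physical robot, which requires a slowdown factor of at most one.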

The REACH model, proposed in DeWolf et al. (2016), includes and improves only Spaun’s motor system. The model is made up of 31,500 spiking neurons subdivided into neurons belonging to the premotor cortex, the primary motor cortex, and the cerebellum. The REACH model controlled the execution of reaching and handwriting movements performed by a virtual nonlinear 3-link arm, also in the presence of environmental perturbations. The model of the premotor cortex is based on a neural implementation of dynamical movement primitives, which can be thought of as elementary blocks that can be combined to generate a more complex trajectory. The premotor cortex generates the desired trajectories in a 2D space. The primary motor cortex compares the current state of the arm with the target positions received from the premotor cortex and transforms the desired action into joint torques. The cerebellum corrects the arm movements by elaborating the commanded torques and the arm state.
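For readers unfamiliar with dynamical movement primitives (DMPs), the following sketch shows the standard non-neural formulation for a single dimension: a critically damped spring pulls the state toward the goal, while a forcing term, gated by a decaying phase variable, shapes the path. This is not the neural implementation used in REACH; the gains, the number of basis functions, and the rule for the basis widths are common illustrative choices and should be read as assumptions.

```python
import math


def rollout_dmp(weights, y0=0.0, goal=1.0, tau=1.0, dt=0.001, duration=3.0,
                alpha_z=25.0, alpha_x=4.0):
    """Integrate a one-dimensional discrete DMP and return the path of y.

    `weights` scale Gaussian basis functions of the phase variable x and
    determine the shape of the trajectory between y0 and goal.
    """
    beta_z = alpha_z / 4.0  # critical damping of the spring-damper part
    n = len(weights)
    # Basis centers spaced along the exponentially decaying phase.
    centers = [math.exp(-alpha_x * i / max(n - 1, 1)) for i in range(n)]
    widths = [n ** 1.5 / c for c in centers]  # heuristic width rule (assumption)
    y, z, x = y0, 0.0, 1.0
    path = [y]
    for _ in range(int(round(duration / dt))):
        psi = [math.exp(-h * (x - c) ** 2) for h, c in zip(widths, centers)]
        total = sum(psi)
        forcing = 0.0
        if total > 1e-12:
            forcing = sum(p * w for p, w in zip(psi, weights)) / total
        forcing *= x * (goal - y0)  # vanishes as the phase x decays to zero
        dz = (alpha_z * (beta_z * (goal - y) - z) + forcing) / tau
        dy = z / tau
        dx = -alpha_x * x / tau
        z += dz * dt
        y += dy * dt
        x += dx * dt
        path.append(y)
    return path


plain = rollout_dmp([0.0] * 5)                          # no forcing term
shaped = rollout_dmp([200.0, -150.0, 100.0, -50.0, 0.0])  # arbitrary weights
```

Both rollouts end at the goal because the forcing term fades with the phase variable, but they follow different paths. This is the property that makes DMPs usable as building blocks: the weights encode the shape of a movement, while the goal and the time constant `tau` can be changed without relearning them.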

Iacob et al. (2020) ran the REACH model to control a 3-DoF physical robot arm moving in a 3D space. The experimental results showed that the robot successfully performed reaching movements and the cerebellum network was able to compensate for force field perturbations, even though performance was slightly lower than that of the simulated model.

DeWolf et al. (2023) instead ran a simplified version of REACH on neuromorphic hardware (Intel’s Loihi chip) to control the position and orientation of a 7-DoF arm simulated in MuJoCo. The SNNs of the primary motor cortex and the cerebellum were implemented on-chip and received the arm feedback and the planned trajectory from the arm simulated on a computer. The controller was evaluated on a center-out reaching task and the authors reported a reduction in the energy cost by two orders of magnitude, a lower latency than the CPU implementation, and a slight deviation between the executed and the ideal trajectories.

Table 1 summarises the main features of the neurorobotic applications described in this section.

Table 1 Neurorobotics applications

6 Challenges and research directions

The synergy of neuroscience, computing, and robotics has launched a new era of brain research, whose next challenges include among others: a multiscale integration of brain models, linking advanced brain models to the spinal cord and musculoskeletal system models, modeling sensorimotor integration and coordination during task execution, developing personalized models for the diagnosis and treatment of brain disorders, and the simulation of complex behavior using robots interacting with the environment (Amunts et al. 2022).

A comprehensive understanding of the brain’s functions and processes requires a multiscale integration of all brain area models up to large-scale brain networks. Modeling brain-scale networks with spiking neuronal networks (SNNs) is the current approach in the literature because SNNs code and transmit information by the timing of spikes, as done by biological neurons in the brain. In this context, an important research direction is represented by neuromorphic computing, which is focused on a hardware architecture that mimics the structure of the biological brain with its energy efficiency and inherent parallelism. Neuromorphic computing facilitates the implementation of SNNs, achieving high energy efficiency, low-latency processing, and robustness against noise at the cost of reduced simulation accuracy. At the moment, the neuromorphic hardware SpiNNaker achieves accurate and real-time performance for robotic applications, whose dynamics require a simulation time step of 1 ms to be correctly observed. However, it cannot yet reach real-time performance for neuroscience applications, where the focus on neural dynamics and neurobiological time scales requires a simulation time step of 0.1 ms (Van Albada et al. 2018).
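The effect of the simulation time step on the real-time constraint can be made explicit with a small calculation. The two time steps (1 ms and 0.1 ms) come from the text; the wall-clock cost per step is a purely hypothetical value, assumed constant, and is not a measurement of SpiNNaker or of any other platform.

```python
def steps_per_second(dt_s):
    """Number of simulation steps needed for one second of model time."""
    return round(1.0 / dt_s)


def real_time_factor(dt_s, wall_clock_per_step_s):
    """Wall-clock time divided by model time; values <= 1 mean real time."""
    return wall_clock_per_step_s / dt_s


# Hypothetical hardware that needs 0.5 ms of wall-clock time per step.
cost_per_step = 0.5e-3

robotics_steps = steps_per_second(1e-3)        # 1 ms time step
neuroscience_steps = steps_per_second(1e-4)    # 0.1 ms time step
robotics_factor = real_time_factor(1e-3, cost_per_step)
neuroscience_factor = real_time_factor(1e-4, cost_per_step)
```

Under this assumption the same hardware runs twice as fast as real time with a 1 ms step but about five times slower than real time with a 0.1 ms step, because ten times more steps have to fit into each second of model time.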

Traditional computing and robotic technologies are still far from achieving human performance in uncontrolled environments, where variability and unpredictability are at the extreme and being able to adapt to environmental changes is necessary, and in fine motor control tasks, which require closing the perception-cognition-action loop in a very short time for real-time interaction.

Hence, combining neuromorphic circuits and brain-scale networks represents the path to be followed for the development of neurorobots that act autonomously in uncontrolled environments and interact with human beings. Obtaining such a result requires dealing with the trade-off between the fidelity of the simulated network, the energy needs, and the long simulation times. In this scenario, the execution of complex motor tasks is still an open challenge and requires the design of neural primitives that can be mapped onto neuromorphic platforms, the understanding of how to integrate all sensing and computing components to create a stable perception useful for behavior, and the understanding of how to exploit the non-linear properties of biological neural systems to integrate adaptation and learning at different temporal scales (Bartolozzi et al. 2022).

Finally, a key role in the definition of a comprehensive theory describing the information processing in the brain is played by the so-called “closed-loop neuroscience”, the experimental paradigm that allows the observation of brain activity and physical performance via the embodiment of computational models in physical robots or simulated agents involved in behavioral tasks (Amunts et al. 2019). The advances in the biological plausibility of brain and musculoskeletal models will allow closed-loop neuroscience to verify more and more hypotheses related to brain diseases and to understand brain mechanisms that lead to deficits in motor and cognitive tasks. The simulation of the neuromusculoskeletal system allows researchers to observe internal states of the system that are not measurable in in-vivo experiments; it therefore represents an emerging application for the design of assistive devices and neurorehabilitation interventions. The two main challenges in using these models are to personalize the model to the unique characteristics of each subject and to simulate the effect of different treatments (Shourijeh et al. 2020). The future of neurorobotics in neurodegenerative research has been depicted in Tolu et al. (2023): the data acquired by the embodied brain models are compared to biological data through machine learning with the aim of improving the computational models in an iterative process. Once the embodied models are optimized, they can be used for testing therapies and monitoring their neurophysiological and behavioral effects on a neurorobot before prescribing them to patients. So, the next generation of personalized brain models, the so-called “digital twins”, will be generated to answer specific questions and they will continuously evolve by being informed with real-world data (Amunts et al. 2022).

7 Conclusion

Neurocomputational models of the central nervous system have been the common ground for roboticists and neuroscientists since the 1980s, when the first cerebellum-like systems were used for controlling industrial robots. Roboticists are aware that understanding how the central nervous system learns, plans and commands movements is the approach for mimicking the abilities shown by humans in interacting with the environment. In fact, systems based on artificial intelligence methods outperform human beings in the tasks they are trained for, but their abilities are limited when they have to deal with unexpected situations. On the other side, neuroscientists are aware that the brain, body, and environment are one, and therefore behavior and neural activity should not be studied in isolation. Simulated or physical neurorobots are the tool for studying the perception-cognition-action loops and verifying/falsifying hypotheses with a replicable and reproducible experimental setup.

For both roboticists and neuroscientists, the ideal scenario is the one that allows the simulation of the whole neural system in a physical musculoskeletal robot acting in the real and unconstrained world. Unfortunately, such a scenario is still far from being feasible due to the computational costs of the simulation together with the limited scalability of some models and the difficulty in acquiring full knowledge about a system as complex as the human neuromusculoskeletal system. This complexity has been handled by breaking the whole nervous system into subsystems that have been modeled in isolation. As a consequence, the different brain areas have been modeled with different levels of detail and accuracy, and a complete interconnection of these subsystems has not yet been achieved. Large-scale models of brain circuits exist but they are not yet simulated together with detailed musculoskeletal models.

The embodiment of neurocomputational models on physical robots has to deal with the hard constraint that the model has to be simulated in real-time. Therefore, the existing physical neurorobots are equipped with a set of models that is sufficient for replicating a specific behavior with an adequate level of accuracy and for answering the scientific question underlying the experiment.

Embodying a brain model in a simulated robot acting in a simulated environment offers the opportunity to run more complex models, especially in recent years with the increasing availability of computational resources. Nevertheless, a simulation of a detailed model of the central nervous system that interacts with a detailed model of the musculoskeletal system, which includes all the degrees of freedom of the human skeleton as well as the same muscular connections, is still missing because the translation of the neural activity into specific muscle activation patterns is computationally demanding.

Large scientific projects like the Human Brain Project are pushing the research toward multiscale models, which connect the molecular level to behavior. In the future, multiscale models will enable personalized medicine allowing the comprehension of how a pharmacological therapy at the molecular level could affect neural dynamics and behavior and, at the same time, how neurorehabilitation strategies could affect synaptic plasticity at the neural level.