Brain-inspired Intelligent Robotics: Theoretical Analysis and Systematic Application

Traditional joint-link robots have been widely used in production lines because of their high precision for single tasks. With the development of the manufacturing and service industries, the requirement for the comprehensive performance of robotics is growing. Numerous types of bio-inspired robotics have been investigated to realize human-like motion control and manipulation. A study route from inner mechanisms to external structures is proposed to imitate humans and animals better. With this idea, a brain-inspired intelligent robotic system is constructed that contains visual cognition, decision-making, motion control, and musculoskeletal structures. This paper reviews cutting-edge research in brain-inspired visual cognition, decision-making, motion control, and musculoskeletal systems. Two software systems and a corresponding hardware system are established, aiming at the verification and applications of next-generation brain-inspired musculoskeletal robots.


Introduction
The robot has been acclaimed as "the jewel in the crown of the manufacturing industry" and has become a critical indicator for measuring the comprehensive strength of a country in science and technology. Since the first industrial robot, Unimate, went online in 1962, numerous robotic systems have been employed on industrial production lines, replacing workers in tasks with high repetition, high risk, and heavy loads, continuously creating tremendous production benefits for manufacturing enterprises. With continuous advancements in mechanical engineering, control, and artificial intelligence technologies, an increasing number of applications, including national defense, health care, and advanced manufacturing, have put forward urgent demands for robots. However, the performance of existing robotic systems is still limited by sensor precision, the repetition accuracy of the systems, and the performance of auxiliary mechanisms. Advancing the performance of robots requires the synergetic development of multiple disciplines, such as mechanical engineering, material science, control theory, intelligent chips, and information science, which significantly limits the broad application of robots owing to the high investment and long development cycles required.
In comparison, humans can achieve an outstanding overall performance superior to the independent capability of each body unit under the condition of low precision of sensory feedback, behavior control, and limited computation dissipation. It mainly benefits from the extraordinary ability of humans to integrate information from multiple brain regions and sufficiently exploit the advantages of the body structure. Therefore, research on human-inspired intelligent robots by mimicking their biological structure, behavioral features, intelligent principles, and control mechanisms will be significant in developing new-generation robots.
Humans have long been considered the reference standard for improving robots. At present, according to the difference in motivation, research on human-inspired robots mainly follows two routes [1]. The first is to mimic the functionality of humans. Based on the observation and analysis of human behavior, this research route aims to endow robots with the ability to complete human work. For example, the most widely used articulated robots are composed of rigid links, gearboxes, joint reducers, and auxiliary tools. Although they are very different from humans in appearance and actuating principles, they can replace humans in completing specific operation tasks with the help of sophisticated sensors and specialized control programs. Another typical example is the humanoid robot, which has a human-like appearance and function and is designed to mimic human mechanisms from the outside in. Representative studies include advanced step innovative mobility (ASIMO) (1997C) by Honda Motor [2], BHR (2000C) by the Beijing Institute of Technology [3,4], iCub (2008C) by the Italian Institute of Technology [5], and Atlas (2016C) by Boston Dynamics [6]. These robots are normally endowed with bipedal locomotion and acquire a preliminary self-learning ability and adaptability to complex tasks by introducing artificial intelligence algorithms. By deploying highly redundant actuators, humanoid robots are capable of human-like dexterity; however, this simultaneously introduces challenges for controller design. Many mathematical algorithms have been proposed to control humanoid robots with highly redundant actuators, such as the zero moment point [7−9], capture point [10−12], and central pattern generator algorithms [13,14] for stable bipedal locomotion, and model-free reinforcement learning [15,16] and imitation learning based on demonstration [17,18] for diverse manipulation.
Although these methods are still in the early stages of research, they exhibit remarkable performance and might be a promising avenue for promoting the application range of humanoid robots.
The second route is oriented by mimicking the core and fundamental mechanisms of humans, forming the research field of brain-inspired intelligent robots. Brain-inspired intelligent robots imitate humans, from their inner mechanisms to their external structures, by referring to the neural mechanisms and structural characteristics of visual cognition, decision-making, motion control, and musculoskeletal systems [19]. By integrating the disciplines of brain science, artificial intelligence, and robotics, research on brain-inspired intelligent robots aims to build brain-inspired information processing algorithms with high computational efficiency and general intelligence, and musculoskeletal systems with high flexibility and strong robustness, laying the foundation for next-generation robot development. In the past few years, because of the nature of exploring humans and the urgent need for the treatment of mental diseases, many countries and regions, including the United States, the European Union, Japan, the Republic of Korea, Canada, Australia, and China, have launched brain projects to support research in brain science and brain-inspired intelligence. As an indispensable part of brain-inspired intelligence, brain-inspired intelligent robots act as comprehensive systems integrating software and hardware, which are of great significance in verifying the research results. Many research groups have conducted preliminary research and achieved milestones in brain-inspired algorithms [20−24], neuromorphic chips [25,26], and musculoskeletal robotic systems [27−29]. As a typical application form of the musculoskeletal system, wearable exoskeleton systems have become a research hot spot in recent years.
In [30−34], some promising techniques and frameworks, such as region-based control and interactive frameworks, human-in-the-loop control, and barrier-energy function design, have been employed to improve the performance of wearable exoskeleton systems in assisted walking and weight-bearing work. These works are significant for musculoskeletal applications and represent the state of the art in human-machine interaction using bio-inspired wearable exoskeleton systems. Owing to the introduction of human internal mechanisms, brain-inspired intelligent robots are expected to achieve empathy with humans and realize deeper cooperation with co-workers.
Compared with previous reviews of brain-inspired robotics [19,35], this study introduces cutting-edge works along the research chain of brain-inspired robots, which comprises instructive neural mechanisms, the design of brain-inspired algorithms, the building of software platforms, and the development of hardware systems. In particular, the latest progress in the system integration of software and hardware is introduced. The remainder of this paper is organized as follows. Section 2 introduces the core neural mechanisms in vision, decision-making, control, and body structure, and the corresponding brain-inspired algorithms. Section 3 introduces the software and hardware system integration. The simulation platform for brain-inspired robots integrates brain-inspired algorithms in vision, decision-making, and movement control, providing efficient tools for researchers from different fields. The hardware platform is designed to mimic the human musculoskeletal system, providing a physical system to validate the performance of brain-inspired algorithms. Section 4 summarizes and concludes the study.

Neural mechanisms of the visual cortex
Compared with other perception modalities, visual perception is a highly complex process of continuous learning, development, and evolution. It has many desirable characteristics, such as rich information content, non-contact sensing, high speed, strong integrity, and high accuracy. In the past decades, neurobiologists have discovered many neural mechanisms of visual perception (e.g., the attention mechanism, two-level cognition, and regional cooperation) through various experimental and analytical methods.
First, several layers of the visual cortex can produce a salience map through the attention mechanism. The visual system tends to selectively perceive some objects or elements in the field of view based on salience, and this selection process is very rapid. The V1 layer can generate a bottom-up salience map by comparing local features against the overall field [36,37]. The lateral inhibition mechanism between V1 neurons suppresses the responses of adjacent neurons with similar tuning characteristics. Neurons in the V2 layer primarily respond to top-down modulation [38]. In the V4 layer, bottom-up salience and top-down regulation are integrated, and an integrated salient feature map is output [39,40].
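As a toy illustration of the bottom-up salience idea above, the sketch below scores each location of a feature map by its contrast with the overall field and then selects the winner. The representation and scoring rule are deliberate simplifications for illustration, not the V1 salience model of [36,37].

```python
# Toy bottom-up salience: a location is salient when its feature value
# differs strongly from the field as a whole (local-vs-overall contrast).
# All names and the scoring rule are illustrative assumptions.

def bottom_up_salience(feature_map):
    """Score each cell by its absolute deviation from the global mean,
    mimicking the local-vs-overall feature comparison attributed to V1."""
    flat = [v for row in feature_map for v in row]
    mean = sum(flat) / len(flat)
    return [[abs(v - mean) for v in row] for row in feature_map]

def most_salient(feature_map):
    """Winner-take-all: return the coordinates of the most salient cell."""
    sal = bottom_up_salience(feature_map)
    return max(
        ((i, j) for i in range(len(sal)) for j in range(len(sal[0]))),
        key=lambda ij: sal[ij[0]][ij[1]],
    )

# A uniform field with one odd element: the outlier "pops out".
field = [[1, 1, 1], [1, 9, 1], [1, 1, 1]]
print(most_salient(field))  # -> (1, 1)
```

The pop-out effect here mirrors the rapid, pre-attentive selection described in the text: no object knowledge is needed, only feature contrast.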
Second, in the process of visual cognition, classification and recognition are not carried out simultaneously [41,42]. A functional magnetic resonance imaging study of the human brain [43] showed that the fusiform face area (FFA) is involved in both low-level face detection and face recognition. Electrophysiological studies on monkeys have shown that the activation selectivity of neurons in higher visual areas increases with exposure duration [44]. The initial firing activity of neurons is sufficient for object detection and classification tasks, whereas object recognition tasks rely on deeper firing activity [42].
Third, each visual cortical region has a specific division of labor and completes perception over only a limited range. Recognition and memory of different types of objects are associated with different brain regions, and the overall perception of visual information is completed through the cooperation of multiple visual cortical regions. The recognition of objects with shareable features is associated with activity in the lateral fusiform gyrus, while the recognition of objects with few shareable features is generally associated with the intermediate fusiform gyrus. The perirhinal cortex of the anterior medial temporal lobe is engaged in identifying highly similar objects [45]. In addition, the parahippocampal gyrus can distinguish between scenes and objects, and the FFA is more sensitive to face and body recognition than to general object recognition [46].
Knowledge of neural mechanisms of the visual cortex benefits the design of artificial intelligence algorithms. First, it provides clear working principles for brain-inspired visual models that imitate specific visual cortical areas. Second, semantic features and concept formulations can be introduced into visual models to enhance their robustness. Third, lateral and feedback connections enable visual models to associate with primitive memory and discover more discriminative features. Next, we list a few examples.

Brain-inspired visual models and algorithms
Brain-inspired visual models can be divided into two categories: cellular-level models and neural pathway-level models, as shown in Table 1.
Cellular-level models work by simulating the coding mechanisms of visual neurons. Riesenhuber and Poggio [49] proposed the hierarchical-max (HMAX) V1 cell model, which simulates the information mapping process of simple and complex cells in the V1 region of the visual cortex. Azzopardi et al. [50] proposed a push-pull receptive-field combination model with the merits of an improved signal-to-noise ratio and contour enhancement, based on the characteristics of simple cells in the V1 region. McIntosh et al. [47] established a convolutional neural network-structured model that reflects the internal structure and function of the retinal sensory circuit. Klindt et al. [48] proposed a convolutional neural network (CNN) architecture for neural system identification that contains a sparse readout layer separating space and time.
Neural pathway-level models are based on the structures of visual neural pathways. Serre et al. [51] proposed the HMAX model, which mimics the ventral pathway of the primate cortex and is capable of learning from only a few samples. Dura-Bernal et al. [52] developed a Bayesian network with an architecture similar to HMAX and used loopy belief propagation to simulate selectivity and invariance operations. Subsequent studies [53,54] extended and developed the HMAX model. Rolls [55] proposed VisNet, a feature hierarchy model that builds invariant representations of visual input through temporal and spatial statistics using self-organized learning.
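The simple/complex-cell alternation that HMAX builds on can be caricatured in a few lines: an S layer performs template matching, and a C layer applies local max pooling, which yields tolerance to small shifts. The 1-D signal, the "edge" template, and all sizes below are illustrative assumptions, not parameters from [49] or [51].

```python
# Minimal caricature of the HMAX simple/complex-cell alternation.
# S layer: template matching (selectivity); C layer: max pooling (invariance).
# All templates and sizes are made up for illustration.

def s_layer(signal, template):
    """Simple cells: sliding-window dot product with a fixed template."""
    t = len(template)
    return [
        sum(signal[i + k] * template[k] for k in range(t))
        for i in range(len(signal) - t + 1)
    ]

def c_layer(responses, pool=2):
    """Complex cells: local max pooling over neighbouring S units,
    giving tolerance to small positional shifts of the stimulus."""
    return [
        max(responses[i:i + pool])
        for i in range(0, len(responses) - pool + 1, pool)
    ]

edge = [1, -1]                     # a toy "oriented edge" template
signal = [0, 0, 5, 0, 0, 5, 0, 0]  # two edges at different positions
c = c_layer(s_layer(signal, edge))
print(c)  # -> [0, 5, 5]: strong edge responses survive pooling
```

Stacking such S/C pairs is exactly how HMAX-style hierarchies trade spatial precision for invariance as one ascends the ventral pathway.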
The existing brain-inspired visual models above still have deficits: they lack integral modeling of the visual pathway, modeling of high-level mechanisms of the visual cortex (e.g., semantics, memory, and conception), and robustness to ambiguity and occlusion. Next, we present some studies that address these issues.
Yin et al. [24] proposed an integrated dynamic visual cognition model based on several mechanisms of the human visual system: 1) semantic extraction via communications between the hippocampus and the medial prefrontal cortex [56−58] , 2) structural learning by neurons in V4 [59] , and 3) selective attention involving areas like the frontal eye fields [60] . Experiments on four datasets show that the proposed model achieves higher accuracy than existing visual recognition methods and manifests robustness to semantically ambiguous samples.
Table 1  Overview of brain-inspired visual models

Cellular-level models          Retina       [47,48]
                               Cortex       [49,50]
Neural pathway-level models    Perception   [51−55]
                               Cognition    [21,23,24]

Qiao et al. [21] introduced the mechanisms of memory and association into HMAX. Memory is divided into similarity discrimination and recall matching stages based on episodic and semantic memories [61] using semantic and situational features [62,63]. Experiments show that the new model can produce semantic descriptors for recognition and achieve a higher accuracy than HMAX. Qiao et al. further introduced active attention adjustment based on [21]. The recognition stage uses the local occlusion information of the object, and the classification stage is actively adjusted according to the initial cognitive information. Experiments proved the robustness of the proposed model to occlusion.

Brain-inspired decision making based on emotional mechanism
Studies on the decision-making tasks of robots have shown increasing significance. Some related works are classified in Table 2. Although it is difficult to give a non-controversial definition of robotic decision-making, we describe its purpose as working out solutions to reach a goal based on sensory cognition results, providing targets and guidance to the robotic control module, such as which movement to make next and when to make it. Robotic tasks such as action selection and path planning can be considered robotic decision-making. Research on this topic has undoubtedly helped robots improve their performance in flexible robotic movement [75], human-robot interaction and cooperation [76], multi-agent collaboration [77], etc. In recent years, reinforcement learning has become a powerful tool for solving Markov decision process (MDP)-modeled robotic decision-making problems [78]. These learning-based methods have achieved excellent performance in robotic knowledge acquisition and skill mastery. However, traditional reinforcement-learning-motivated robots are still unsatisfactory in some human-like aspects, such as learning speed, generalization ability, flexibility, and the capacity to adapt to rapid environmental changes [19].

Table 2  Brain mechanisms referenced in emotion-based decision-making

Interaction between amygdala and orbitofrontal cortex (OFC)   [70,71]
Hormone regulation                                            [72]
Neural circuit regulation                                     [73,74]

In contrast, humans can quickly build causality between perceived states and actions based on limited experience under brain control, which shows excellent learning efficiency and generalization. Meanwhile, emotion, one of the advanced functions of the brain, enables humans to balance their decision-making mode when the environment or conditions change rapidly, which is an important reason why humans can react flexibly, adapt to rapid changes, and achieve excellent decision-making.
Therefore, we are convinced that brain-inspired methods with emotional mechanisms can significantly improve human-like capabilities in robotic decision-making tasks.
The fundamental problem with such methods is how to integrate emotional mechanisms into the decision-making framework. One class of studies focuses on approaches to building artificial emotion, expecting robots to benefit from emotional learning and memory. Scheutz summarized twelve possible functions of emotion in artificial agents, which is instructive for this problem [64]. Moerland et al. [65] investigated several methods for artificial emotion modeling, including reward adjustment, state modification, action selection, and meta-learning. For example, in [66], emotion was integrated as an internal reward for a learning algorithm that encodes value information from the perceived states. In [67], the degree of fear was considered part of the state space, serving as motivation to guide actions. Other studies have integrated emotion to balance the exploration/exploitation trade-off [68]. The above methods succeeded in exploiting emotion to improve the performance of robots in decision-making tasks. However, the implementation of emotion in these works is heuristic, and the mechanism of artificial emotion differs from that of the brain.
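One integration style surveyed above, emotion as an internal reward added to the environmental reward (cf. [66]), can be sketched in tabular Q-learning. The novelty-based valence function and all names below are illustrative assumptions, not the cited paper's formulation.

```python
# Sketch: emotion enters Q-learning as an internal reward term.
# The "valence" here is a toy curiosity signal; real models differ.

def emotional_valence(visits, state):
    """Toy valence signal: novel (rarely visited) states feel positive."""
    return 1.0 / (1.0 + visits.get(state, 0))

def q_update(q, visits, s, a, r_ext, s_next, actions,
             alpha=0.1, gamma=0.9, beta=0.5):
    """One Q-learning step on an emotionally shaped reward
    r = r_ext + beta * valence(s)."""
    r = r_ext + beta * emotional_valence(visits, s)  # external + internal
    visits[s] = visits.get(s, 0) + 1
    best_next = max(q.get((s_next, a2), 0.0) for a2 in actions)
    old = q.get((s, a), 0.0)
    q[(s, a)] = old + alpha * (r + gamma * best_next - old)
    return q[(s, a)]

q, visits = {}, {}
# First visit to "s0": the full novelty bonus tops up the 1.0 reward.
print(round(q_update(q, visits, "s0", "a0", 1.0, "s1", ["a0", "a1"]), 3))
```

As the agent revisits a state, the internal reward decays, so the shaping automatically fades and learning is again dominated by the external task reward.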
Meanwhile, another class of studies follows the emotional modulation mechanism in the human brain, which makes the decision-making methods benefit from biological plausibility. For example, Taylor and Fragopanagos [69] integrated the functions of emotion-related brain regions into an attentional control model, which brings up selective attention through emotion regulation. Another successful brain emotion learning (BEL) model proposed by Balkenius and Morén [70] attempts to mimic the interaction between the amygdala and OFC to explore the emotion learning process of mammals, where the rapid emotional response originates from the amygdala model and is then modulated by feedback signals from the OFC model. The proposed model has been successfully applied to robot control tasks [71] and is comparable to linear quadratic regulator control in terms of performance.
A series of recent works has paid attention to the two pathways of the emotion-modulated decision-making process in the human brain: hormones and neural circuits. This is inspired by the view that emotion can modulate the learning process by adjusting the levels of several neurotransmitters [79]. Huang et al. [72] mimicked the corresponding process in an algorithm that modulates meta-parameters, including the learning rate, reward prediction error, and randomness of action selection, by emotional valence. The proposed framework was used to upgrade the Hebbian and Oja learning rules and applied to the operation of a musculoskeletal arm, achieving higher precision and faster learning. A second study [73] considered the neural circuit between the amygdala and striatum in the human brain, which is responsible for the shift between goal-oriented and habitual decision-making modes. It proposed a biologically plausible computational amygdala model that generates emotional signals to switch the learning type between model-based and model-free. The computational emotion model was also used for another decision-making task for mobile robots in sparse-reward environments [74]. Moreover, the proposed amygdala model exhibits properties that fit fear-conditioning experiments in neuroscience, showing that, with simulated mechanisms and biologically plausible models, brain-inspired methods can reflect some properties of the human brain, which may in turn benefit studies in cognitive neuroscience and promote the development of related research.
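The meta-parameter modulation idea attributed to [72] can be sketched as emotional valence scaling the learning rate and the randomness of action selection. The specific mappings and constants below are illustrative assumptions, not the paper's equations.

```python
# Sketch: emotional valence modulates RL meta-parameters.
# Positive valence -> learn faster, act more greedily;
# negative valence -> learn more cautiously, explore more.
# Mapping functions and constants are made up for illustration.

import math
import random

def modulated_params(valence, base_lr=0.1, base_temp=1.0):
    """Return (learning rate, softmax temperature) under valence in [-1, 1]."""
    lr = base_lr * (1.0 + max(valence, 0.0))
    temp = base_temp * (1.0 + max(-valence, 0.0))
    return lr, temp

def boltzmann_action(values, temp, rng=random.random):
    """Softmax (Boltzmann) action selection; higher temp -> more random."""
    exps = [math.exp(v / temp) for v in values]
    r, acc = rng() * sum(exps), 0.0
    for i, e in enumerate(exps):
        acc += e
        if r < acc:
            return i
    return len(values) - 1

print(modulated_params(1.0))    # -> (0.2, 1.0)
print(modulated_params(-1.0))   # -> (0.1, 2.0)
```

The same pattern extends to the other meta-parameters named in the text, such as scaling the reward prediction error before it enters the update rule.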

Brain-inspired motion control

Methods based on control theory and artificial intelligence
The highly complex dynamic characteristics of musculoskeletal robots lead to many problems, such as the difficulty of solving the inverse dynamics of the system and the huge demand for computing resources. Several methods have been proposed for the control of musculoskeletal robots. Depending on whether explicit models of musculoskeletal robots are established during the solution process, these methods can be divided into two categories, model-based and model-free methods, as shown in Table 3.

Table 3  Overview of motion control methods of musculoskeletal robots

Methods based on control theory and artificial intelligence
    Model-based methods                        [80−86]
    Model-free methods                         [87−94]
Brain-inspired methods
    Muscle-synergies-inspired methods          [95,96]
    Cortex-inspired methods                    [97,98]
    Hierarchical-mechanism-inspired methods    [99,100]
    Cerebellum-inspired methods                [101,102]

Many model-based control methods for musculoskeletal robots have been proposed by establishing kinematic and dynamic models of musculoskeletal systems. First, static and dynamic optimizations were used to study musculoskeletal robots. Thelen et al. [80] proposed a computed muscle control method. This method brings the calculation results of the forward model into the signal solution process to optimize the error at each step, which improves the efficiency of model optimization. Second, proportional-derivative control [81], iterative learning control [82], and adaptive control [83] have made some achievements in the control of musculoskeletal robots. Tahara and Kion [84] designed a proportional-integral-type iterative learning control algorithm for the task space. This algorithm collects the time-series errors of the position and velocity of the end-effector to form a dataset and feeds the result of multiplying the error by the gain term into the next iteration, which avoids the real-time inverse dynamics solution process for redundant systems. Jäntsch et al. [85] proposed a scalable joint-space control scheme based on computed torque control. This method first uses a multilayer perceptron to establish the mapping from muscle force to joint torque and then uses a proportional-integral-derivative controller to calculate the joint torque required to complete the motion task. Kawaharazuka et al. [86] proposed an antagonist inhibition control method that avoids unnecessary muscle tension or tendon slack caused by model errors, so that the musculoskeletal robot can safely perform wide-range motion for a long time.
The proposed method was verified using the Kengoro hardware platform. However, model-based methods rely on accurate models of complex musculoskeletal systems, which are usually not readily available. Therefore, these algorithms are primarily applied to simplified musculoskeletal systems and simple motor tasks. Correspondingly, some model-free methods do not depend on the characteristics of the models and are also used to realize the control of musculoskeletal robots. First, supervised learning can predict muscle signals, activating muscle forces to produce movements. Nakada et al. [87] used 20 deep neural networks to simulate the neuromuscular motor and visual sensory subsystems. Based on the supervised training of a large amount of synthetic data, a full-body musculoskeletal robot containing 352 muscles can efficiently learn visuomotor control of its eyes, head, and limbs. Rane et al. [88] used convolutional neural networks to learn the mapping from the movement space to the muscle space, which accurately predicted the muscle force and the musculoskeletal model. Second, reinforcement learning also provides new inspiration to solve this problem. Deep reinforcement learning methods, such as deep deterministic policy gradient, proximal policy optimization, and trust region policy optimization, have been successfully used to control the lower limb musculoskeletal system to run as long as possible while overcoming obstacles, such as slippery floors and muscle weakness [89] . The natural policy gradient algorithm was used to implement dexterous manipulation tasks for musculoskeletal fingers [90] , such as tip-reaching, key-turning, and pen-twirl. The actor-critic algorithm was introduced to realize the motion control of a three-joint musculoskeletal arm [91,92] . The deep deterministic policy gradient and hindsight experience replay algorithms were combined to control the musculoskeletal arm to perform motor tasks [93,94] , such as playing taichi. 
Although these algorithms perform well on musculoskeletal systems, they incur enormous computational overhead. Meanwhile, model-free methods regard musculoskeletal systems as black-box environments, lacking specific analysis of system structures and functional characteristics. Additionally, they do not consider further enhancing the motor learning ability of robots based on biological neural mechanisms.
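The iterative-learning idea behind the task-space scheme of [84] discussed above can be sketched in a few lines: each trial reuses the previous trial's input plus gain-weighted errors, so no inverse dynamics is ever solved online. The gains and the one-dimensional static "plant" below are illustrative assumptions, not the cited controller.

```python
# Toy PI-type iterative learning control (ILC): the next trial's input is
# the previous input plus gain-weighted position/velocity errors from the
# last trial. Gains and the scalar plant are made up for illustration.

def ilc_update(u_prev, pos_err, vel_err, kp=0.5, kd=0.1):
    """Next-iteration input from the previous one plus weighted errors."""
    return [u + kp * ep + kd * ev
            for u, ep, ev in zip(u_prev, pos_err, vel_err)]

# Trial-and-correct loop: the "plant" maps input u to output y = 2u; the
# controller never inverts the plant, it only reuses last trial's error.
u, target = [0.0], 1.0
for trial in range(20):
    y = 2.0 * u[0]
    u = ilc_update(u, [target - y], [0.0])
print(round(u[0], 6))   # settles at 0.5, where 2u matches the target
```

The appeal for redundant musculoskeletal systems is exactly this structure: convergence comes from repeated trials rather than from a real-time inverse-dynamics solution.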

Brain-inspired control methods
Humans can realize all types of high-precision, flexible movements and manipulations in complex environments, significantly benefiting from the control of the central nervous system. According to neuroscience research, the motor cortex, cerebellum, spinal circuits, and related neural mechanisms are essential for generating precise motor commands. First, some neuroscientists have proposed that the motor cortex can be described as a dynamical system that embeds muscle-like commands in the population response of neurons [103,104]. For some motions, such as fast ballistic movements, the motor cortex forms the initial state of the dynamical system through a motor preparation stage, which largely determines the subsequent evolution of neural activity and behavior [105]. Second, in some motions, the cerebellum can generate correction values for motor commands as a feedback controller, making movement more accurate and adaptable. Furthermore, muscle synergies are found in muscle activity and can be regarded as a specific type of motor primitive [106].
Some methods inspired by these findings have been proposed and have improved the performance of musculoskeletal robots. Chen and Qiao [95] proposed a novel muscle-synergy-based neuromuscular control method. In this method, a computational model of time-varying phasic and tonic muscle synergies is constructed, which is modulated by a radial basis function neural network according to reaching directions and distances. The introduction of synergy effectively reduces the control dimension and enhances the speed and accuracy of motion learning. The concept of muscle synergy has also been applied to the manipulation tasks of musculoskeletal robotic systems. In [96], muscle synergies were first extracted as movement primitives. Subsequently, using a strategy based on an attractive region in the environment, the combination coefficients of the primitives are generated by an iterative learning controller, and muscle excitations are further computed to drive the musculoskeletal robotic system to realize human-like manipulation. Additionally, based on the dynamic encoding hypothesis in the cortex, a recurrent neural network (RNN) with a cortex-like consistent population response was designed and applied to generate motor commands [97]. Combined with a reinforcement learning algorithm, motion learning and multi-task learning were implemented using the RNN, and motion precision and multi-task learning performance were improved compared with traditional methods. Inspired by the hypothesis of motor preparation in the cortex, a motion-learning framework based on an RNN modulated by initial states was proposed [98]. The initial state defines the evolution of the RNN and the motion of the musculoskeletal robotic system.
With the modulation of the initial states, the initial learned states can be explicitly expressed as the knowledge of movements and can be utilized to construct a proper initial state corresponding to a new movement target, significantly improving the generalization efficiency for new movements. To accelerate solution space exploration and reduce the difficulty of learning, considering that the learning goal of humans changes stepwise with the progress of learning, Zhou et al. [99] proposed a phased target learning framework with hierarchical task architecture that provides different targets to learners at varying levels. This realized a tracking task based on the musculoskeletal arm. Inspired by the motion modulation mechanism of cortex-basal ganglia circuits, a hierarchical motion-learning framework decoupling between the target position and motion control was proposed [100] . Introducing a hierarchical mechanism enhances the adaptability of musculoskeletal robot systems for uncertain tasks. Inspired by the prediction and correction mechanism of the cerebellum in motor control, Zhang et al. [101] proposed a cerebellum-inspired feedforward control model. The cerebellum prediction network predicts the motion result of the musculoskeletal robot under the control of the original muscle signal. Subsequently, the cerebellum correction network outputs the correction value of the original signal based on the prediction error, effectively improving the motion precision of the robot. Additionally, inspired by habitual planning theory and the information propagation mechanism from the cerebellum to the spinal cord, Qiao et al. [102] proposed a method to calculate the muscle control signal based on motion primitives. Some primitives obtained by optimization can form muscle control commands corresponding to new motion targets through a combination, which improves the fast response and learning ability of robots.
The above methods for musculoskeletal robotic control are classified and listed in Table 3.
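The muscle-synergy idea underlying [95,96] can be sketched as generating muscle excitations from a weighted combination of a few fixed primitives, so that control happens in a low-dimensional weight space instead of per-muscle space. The synergy matrix and muscle count below are invented for illustration.

```python
# Sketch: muscle excitations as a weighted sum of fixed synergy primitives.
# Controlling 2 weights instead of 6 muscles is the dimensionality win.
# The synergy values are illustrative, not extracted from data.

def muscle_excitations(synergies, weights):
    """Combine synergies (each a per-muscle activation pattern) with
    weights, clamping each muscle's excitation to [0, 1]."""
    n_muscles = len(synergies[0])
    return [
        min(1.0, max(0.0, sum(w * syn[m] for w, syn in zip(weights, synergies))))
        for m in range(n_muscles)
    ]

# Two illustrative synergies over six muscles (values in [0, 1]).
synergies = [
    [0.9, 0.7, 0.1, 0.0, 0.2, 0.0],   # a "reach forward" pattern
    [0.0, 0.1, 0.8, 0.9, 0.0, 0.3],   # a "retract" pattern
]
print(muscle_excitations(synergies, [1.0, 0.0]))  # pure first primitive
print(muscle_excitations(synergies, [0.5, 0.5]))  # blended command
```

In the cited works, the primitives are extracted from data or optimization and the weights are produced by a learned or iterative controller; this sketch only shows the combination step that makes the control space low-dimensional.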

Bio-inspired structures and dynamics
Compared to traditional joint-link robotic systems, the drive structure of musculoskeletal systems has several advantages worth learning from. Inspired by the performance of complex human operations, muscle dynamics and redundant muscle distributions are regarded as critical points and have been widely researched [107−109]. Therefore, bio-inspired musculoskeletal robotics is a promising direction for next-generation robots.

Anti-interference analysis of musculoskeletal dynamics
In contrast to the approximately linear motor-driven systems in most robotic systems, the actuators of the musculoskeletal system are highly nonlinear and redundant. These properties provide strong anti-interference ability against neural noise and environmental disturbances. Since Hill et al. [110−115] proposed contractile muscle dynamics to describe the relationship between muscular activation and muscle force, numerous muscle models have been studied to better fit the features of muscle dynamics. In addition, research has been conducted on artificial muscles in robotics. Tandem springs and dampers are combined to mimic the structures and dynamics of Hill-type muscle models in theoretical analyses [116−120]. Pneumatic and cable-driven muscles are the two main types of artificial muscles in application research and have brought flexibility and robustness to robotics. However, theoretical analyses of the benefits of muscle dynamics are scarce.
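As a hedged sketch of the Hill-type muscle law mentioned above, the function below combines activation with force-length and force-velocity factors plus a passive elastic term. The curve shapes and constants are illustrative assumptions, not taken from any of the cited muscle models.

```python
# Toy Hill-type muscle law: F = a * F_max * f_l(l) * f_v(v) + F_passive(l).
# Bell-shaped force-length curve, hyperbolic force-velocity curve, and a
# quadratic passive term; all shapes/constants are made up for illustration.

import math

def hill_force(a, l_norm, v_norm, f_max=1000.0):
    """Muscle force (N) from activation a in [0, 1], normalized fiber
    length l_norm (1.0 = optimal), normalized shortening velocity v_norm."""
    f_l = math.exp(-((l_norm - 1.0) ** 2) / 0.1)          # force-length
    f_v = (1.0 - v_norm) / (1.0 + 4.0 * v_norm) if v_norm >= 0 else 1.0
    f_passive = 0.05 * f_max * max(0.0, l_norm - 1.0) ** 2  # passive stretch
    return a * f_max * f_l * f_v + f_passive

# Full activation, optimal length, isometric contraction:
print(hill_force(1.0, 1.0, 0.0))   # -> 1000.0
```

The nonlinearity visible here (force dropping off with shortening velocity and off-optimal length) is exactly the kind of indirect, state-dependent driving mechanism whose anti-interference benefits are analyzed in the work discussed next.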
Inspired by the fact that humans realize high-precision and robust operations despite substantial neural noise, Wu et al. [121] established sophisticated musculoskeletal dynamics to analyze the potential advantages of the nonlinear muscular driving mechanism for such performance. Through mathematical derivation, this work identified three crucial sources of the activation signal's anti-interference ability in the humanoid musculoskeletal system: the nonlinear and indirect driving mechanism reduces the effect of input error; the inter-system feedback caused by the coupled structure can regulate muscle force to improve control accuracy; and the redundant muscle space provides a broader solution space for the activation signal. Comparison experiments between musculoskeletal and joint-link robots were then designed and conducted to verify the analytical results. This study provides a theoretical reference for the application of bio-inspired musculoskeletal robotic systems.

Bio-structure inspired motion control
Organisms have an inherent talent for completing stable, rapid, and diverse free movements with their complex body structures. This extraordinary movement ability relies on the cerebral cortex for complex computation and benefits from cooperative mechanisms among physiological structures formed over long-term evolution. For example, Henneman's size principle indicates that during movement, motor units with a low innervation ratio but strong fatigue resistance are recruited first, while motor units with a high innervation ratio and high explosiveness are recruited later. This coordination mechanism provides a physiological basis for the stability and rapidity of movement [122]. The knee-jerk reflex is another typical example: when the length of a muscle changes due to an unexpected disturbance, alpha motor neurons in the spinal cord quickly produce a regulatory signal that contracts the muscle back to its original length. This mechanism guarantees limb stability and postural balance under disturbances [123].
The equilibrium point hypothesis is one of the classical theories on how organisms realize motion control by leveraging the coordination mechanism of physiological structures. During research on the synergistic regulation of muscles by spinal cord interneurons, neuroscientists found that applying constant micro-electrical stimulation to specific areas of the spinal cord activates motion-related forces in the extremities. The distribution of the extremity force in motion space converges to an equilibrium point, forming a so-called convergent force field [124]. Neurophysiological experiments in animal models, such as frogs and monkeys, have further verified the equilibrium point hypothesis, providing an important viewpoint for explaining the efficient and stable motor control ability of organisms [125−127]. Recently, this mechanism has also inspired the study of motion control strategies for musculoskeletal robots [128,129]. Inspired by two types of natural constraints, convergent force fields [124] and attractive regions in the environment [130], Zhong et al. [128] proposed a structure-transforming optimization algorithm for constructing a constraint force field (CFF) in a musculoskeletal robot. By optimizing the arrangement of the muscles, a CFF with the target position as the equilibrium center can be constructed in the task space of the musculoskeletal robot. In the constructed CFF, the robot can reach the target position with high accuracy using constant control signals, effectively reducing the reliance on sensor feedback for the control signal. Meanwhile, the system also exhibits strong robustness to noise disturbances in the control signal. It therefore provides important inspiration for designing a new type of variable-structure musculoskeletal robotic system that achieves high-precision manipulation despite the limited precision of structures and sensors.
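The convergence property of such a field can be illustrated with a one-dimensional point mass: a constant field F = −k(x − x_eq) pulls the end point to the equilibrium even when the control signal is noisy. The gains, noise level, and dynamics below are arbitrary illustrative choices, not the CFF construction of [128].

```python
import random

def simulate_cff(x0, x_eq, k=5.0, damping=2.0, noise=0.05, dt=0.01, steps=2000):
    """Unit point mass in a convergent force field F = -k (x - x_eq).

    The field geometry itself acts as a constant 'control': the end point
    is pulled to the equilibrium despite additive signal noise.
    """
    x, v = x0, 0.0
    rng = random.Random(0)                       # fixed seed for repeatability
    for _ in range(steps):
        f = -k * (x - x_eq) - damping * v + rng.gauss(0.0, noise)
        v += f * dt                              # semi-implicit Euler, unit mass
        x += v * dt
    return x
```

Regardless of the starting point, the mass settles near `x_eq` with no feedback on the control signal, which mirrors how a robot inside a CFF reaches the target under constant activations.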

Simulation platform for bio-inspired musculoskeletal system
Research on brain-inspired intelligent robotics requires a comprehensive platform that can mimic the mechanisms and behavior of humans. Achieving complex tasks often requires integrating visual, decision-making, and motion control models within the same system, which is challenging for researchers specializing in a single area. For studies focusing on motor control and decision-making, it is advisable to study these mechanisms on a musculoskeletal system that is highly redundant, flexible, and capable of variable stiffness.
A flexible virtual platform is an efficient way to implement these ideas. Researchers can first use the virtual musculoskeletal system to train and optimize brain-inspired models, which can then be transferred to the software and hardware platforms introduced in Section 3.1.1. The algorithms are flexibly organized in terms of vision, decision, and motion, so researchers can develop end-to-end models from vision to motion and integrate their own algorithms with existing ones to build a complete model for more complex tasks. Virtual platforms also enable researchers to design and develop a series of standard tasks for evaluating brain-inspired robotic algorithms. With standard tasks on the platform, researchers gain a better benchmark for comparing different algorithms and identifying directions for further improvement.
We developed such a platform as a web-based system that users can access remotely. As shown in Fig. 1, the virtual platform is divided into three modules: integration back-end, front-end demonstration, and task definition. Most current musculoskeletal platforms focus on the dynamics module; however, defining the task and providing comprehensive output are also important. In this subsection, we first introduce the three modules and their implementations, and then provide some examples of applications.

Integration back-end module
The integration back-end module is based on the dynamics engine of the musculoskeletal system but also integrates the vision and decision parts. Several current platforms can serve as dynamics engines, including OpenSim [131] and MyoSuite [90]. In brain-inspired robotics, researchers need an engine that balances biological plausibility and execution efficiency, and preferences differ from project to project. In our virtual simulation platform, we developed a unified application programming interface (API) that enables researchers to switch easily between back-end engines. We also developed tools to transfer models between platforms so that researchers can translate their custom models from one engine to another. The vision and decision parts are also integrated into the back-end module; more details are discussed in the following sections.
We provide a series of API interfaces in the integrated back-end module. The basic structure of the API follows that of OpenSim, including the model/agent, simulation, and visualization. The model defines the forms and parameters of the virtual musculoskeletal system. The simulation is tied to a dynamics engine: it takes activation signals from the algorithms and generates the corresponding movement. The visualization module generates a 3D visualization of the system. Although the structure follows OpenSim, the platform wraps different back-end engines with the same API, so engines can be switched with a single line of code.

Fig. 1 System framework of the simulation platform, including integration back-end, front-end demonstration, and task definition modules. Users set up a task and the corresponding environment and agent in the task definition module, then combine their own models with models from the model zoo. The back-end module integrates vision, decision, and motion models and interacts with the dynamics engine via the unified API. All simulations and corresponding results are displayed in the front-end module.
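A minimal sketch of how such a unified API might wrap interchangeable engines is a registry pattern: each adapter exposes the same `load_model`/`step` surface, and switching engines is one call. `DummyEngine` and all method names here are hypothetical placeholders, not the platform's actual interface; real adapters would wrap OpenSim or MyoSuite.

```python
_ENGINES = {}

def register_engine(name):
    """Decorator registering a back-end adapter under a short name."""
    def deco(cls):
        _ENGINES[name] = cls
        return cls
    return deco

def make_engine(name, **kwargs):
    """Switching the dynamics engine is a single line of code."""
    return _ENGINES[name](**kwargs)

@register_engine("dummy")
class DummyEngine:
    """Stand-in adapter with the unified surface (illustrative only)."""
    def __init__(self, n_muscles=15):
        self.n_muscles = n_muscles
        self.t = 0.0
    def load_model(self, path):
        self.model_path = path            # real adapters parse the model file
    def step(self, activations, dt=0.01):
        assert len(activations) == self.n_muscles
        self.t += dt                      # real adapters advance the dynamics
        return {"time": self.t}
```

With this pattern, `make_engine("opensim")` versus `make_engine("myosuite")` would be the only change needed in user code.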
To better integrate vision, decision, and motion algorithms, we developed an integration function that adaptively composes them. The function is generated automatically according to the task definition and algorithm parameters. For example, if the task is a pure motion control task, the function invokes only the motion control algorithm, and the vision and decision algorithms are ignored. If the task is a hybrid task that requires both vision and decision, the function connects the vision, decision, and motion models to complete it. In this way, the platform is compatible with end-to-end, vision, decision, and motion models.
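The composition logic might look like the following sketch, where `vision`, `decision`, and `motion` are arbitrary callables and the task-type names are hypothetical, not the platform's actual identifiers.

```python
def build_pipeline(task_type, vision=None, decision=None, motion=None):
    """Compose whichever models the task definition requires (illustrative)."""
    if task_type == "motion":
        # Pure motion control: vision and decision models are ignored.
        return lambda state: motion(state)
    if task_type == "hybrid":
        def pipeline(image, state):
            target = vision(image)            # image -> target estimate
            plan = decision(target, state)    # target + state -> motion plan
            return motion(plan)               # plan -> activation signals
        return pipeline
    raise ValueError(f"unknown task type: {task_type}")
```

An end-to-end model fits the same scheme as a single callable covering all three stages.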

Front-end demonstration module
Front-end and task definition modules are often overlooked by current platforms. However, such modules are essential for evaluating brain-inspired robotic algorithms. Most front-end display modules on current platforms simply show the simulation environment, not the results of the algorithms; to evaluate the results, one has to write one's own demonstration module or use another tool, such as TensorBoard. To combine these demands in one platform, we integrated the simulation and algorithm results. Our front-end module is a web-based interface that enables researchers to visualize the results of brain-inspired algorithms. It shows the animation from the simulation engine, the trends of the model parameters, and the numerical training and testing results simultaneously, providing researchers with a more intuitive illustration. For example, one can watch the movement of a musculoskeletal upper arm while simultaneously observing changes in muscle force and kinetics.
To implement the front-end module, we designed a flexible framework that combines the simulation results with signals from the algorithms. The framework is based on Vue.js, a JavaScript framework for building user interfaces, and provides a simple but powerful way to organize the results. In the current version, the simulation results are displayed as a video from a viewpoint assigned by the researchers; a future version will be more interactive, rendering the 3D visualization in a WebGL canvas. The framework also synchronizes all outputs on the same timeline, so researchers can replay all the results together and observe how the activation signal triggered the movement.

Task definition module
The task definition module allows researchers to define their tasks and experimental environments. The module provides many predefined experimental settings so that comparisons between algorithms are carried out on a series of standard tasks. It also enables researchers to design their own tasks and environments and share them with other users on the platform. Details of the task definition module are introduced in the application section.
In the task definition module, we translate a task into a set of parameters, including the environment settings, the musculoskeletal models, and how the entire model is organized. The parameters are stored in a JavaScript object notation (JSON) file and can be easily shared with other users. We provide a series of preset, standardized parameters for comparison. After the parameters are defined, the platform builds a custom environment from a template for each task in a separate Docker container, and researchers can use this environment to train their models. The platform also generates the integration function mentioned above to connect the vision, decision, and motion models.
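A task specification of this kind might look like the following sketch; every field name and value below is a hypothetical illustration, not the platform's actual schema.

```python
import json

# Hypothetical task specification for a reaching experiment.
task_spec = {
    "name": "standard_reaching",
    "environment": {"engine": "opensim", "dt": 0.01},
    "musculoskeletal_model": {"file": "upper_arm_15_muscles.osim", "muscles": 15},
    "composition": {"vision": None, "decision": None, "motion": "muscle_synergy"},
    "targets": {"count": 8, "radius_m": 0.15},
}

# Save to a shareable JSON file...
with open("task.json", "w") as f:
    json.dump(task_spec, f, indent=2)

# ...and reload it, e.g. on another user's machine.
with open("task.json") as f:
    loaded = json.load(f)
```

Because the file is plain JSON, the same spec can seed both the Docker environment template and the integration function.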

Application
As discussed above, the platform can be used not only for single types of models, such as vision or motion alone, but also for integrated models. In this section, we introduce two applications of the virtual platform: the standard reaching task and the vision-based assembly task. Both tasks were executed on an upper-arm model with 15 muscles.

1) Standard reaching task: The reaching task is a standard experimental paradigm in biomechanics and neuroscience [132,133]. Starting from an initial hand position, the subject is given a set of targets and tries to reach one of them on each trial, and researchers study the entire movement. In this experiment, we provide standard task settings: the targets appear on a circle around the starting position of the hand, and researchers can define the starting position, the target positions, and the number of targets.
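Generating the target set for this paradigm reduces to placing points evenly on a circle around the starting position, as in this small sketch (the radius and count are user-defined parameters):

```python
import math

def circle_targets(center, radius, n):
    """Return n targets evenly spaced on a circle around the start position."""
    cx, cy = center
    return [(cx + radius * math.cos(2 * math.pi * i / n),
             cy + radius * math.sin(2 * math.pi * i / n))
            for i in range(n)]
```
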
The muscle-synergy-based neuromuscular control method [97] was applied in this experiment. As introduced in Section 2.3.2, based on muscle-synergy mechanisms, this method generates a series of activation signals for all muscles until the target is reached. In the platform, the algorithm takes the positions of the hand and targets and outputs the signals to the back-end module. The back-end module then generates the movement and calculates the corresponding errors, and all related results are displayed by the front-end module. Researchers can thus focus on developing and tuning the algorithm and are freed from low-level system concerns.
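At its core, a synergy-based controller expresses each muscle's activation as a nonnegative combination of a few synergy vectors. The following sketch illustrates only that combination step, with hand-picked numbers; it is not the full method of [97].

```python
def synergy_activations(W, h):
    """Combine synergies into per-muscle activations.

    W : n_muscles rows of n_synergies weights (the synergy basis)
    h : n_synergies recruitment coefficients (time-varying)
    Returns activations clipped to the physiological range [0, 1].
    """
    acts = []
    for row in W:
        a = sum(w * max(c, 0.0) for w, c in zip(row, h))  # nonnegative mix
        acts.append(min(max(a, 0.0), 1.0))                # clip to [0, 1]
    return acts
```

The dimensionality reduction is the point: the controller searches over a handful of coefficients `h` instead of 15 independent muscle signals.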
2) Vision-based peg-hole assembly task: Vision-based peg-hole assembly is a more complex task. A camera captures the scene, and a series of algorithms then assembles the peg into a hole. This requires the coordination of vision, decision, and motion models, and the experiment provides a good example of the platform's flexibility.
During the experiment, researchers could use ground truth, baseline models, or their own models for different parts. For example, in the early stage, users can use the positions of the peg and hole from the platform directly and use the attractive region in such an environment, a method inspired by human behavior, as the decision-making model to generate a viable assembly strategy. In this way, researchers can focus on developing a brain-inspired motion model. Subsequently, the ground-truth vision output can be easily switched to a brain-inspired visual model. Researchers can also connect decision-making with motion control and directly generate the control signal from the visual model output. Thus, an integrated brain-inspired model will be developed step-by-step. The front-end module simultaneously displays the output of each part.

Musculoskeletal robotic system
To realize flexible motion control and robust manipulation like humans, we construct bio-inspired musculoskeletal robotic software and hardware systems. The software system can be used to test the performance of brain-inspired algorithms in a simulation environment and then provide models and algorithms for hardware platforms based on the simulation results.

Software system
Considering the research purpose and the guidance needed for hardware construction, we first established a simulation system in Matlab and CoppeliaSim. Based on theoretical analysis, we built a realistic muscle dynamics module that preserves the nonlinearity and coupling properties. In addition, a fifteen-muscle, seven-degree-of-freedom (DOF) upper-limb model was established in CoppeliaSim. The distribution of the muscle attachments was designed and optimized according to anatomical results and convex optimization theory. The software platform consists of three parts: structural design, dynamics modeling, and controller design. The framework is illustrated in Fig. 2.
A combination of muscle chips and a motor cable-driven system is used in this robotic system to accurately simulate the relationship between muscle states and muscle forces. Muscle attachments are parameterized as a matrix and can therefore be easily modified. The system dynamics can be derived through inner vector products using the attachment matrix and other size parameters, and forward and inverse dynamics were established in the module. The kinematics from task space to joint space and then to muscle space are implemented as Matlab modules. Considering the large volume of data transmitted between the control algorithms in Matlab and the robotic model in CoppeliaSim, the Socket module with the user datagram protocol (UDP) is used to guarantee timely communication.
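The per-cycle UDP exchange amounts to serializing the activation vector into a datagram and sending it connectionlessly. The sketch below is in Python for illustration only; the platform itself uses Matlab's UDP facilities, and the packing format is an assumption.

```python
import socket
import struct

def pack_activations(acts):
    """Serialize muscle activations as IEEE-754 doubles for one datagram."""
    return struct.pack(f"{len(acts)}d", *acts)

def unpack_activations(data):
    """Inverse of pack_activations; 8 bytes per double."""
    return list(struct.unpack(f"{len(data) // 8}d", data))

def send_activations(sock, addr, acts):
    """One connectionless send per control cycle; no handshake, low latency."""
    sock.sendto(pack_activations(acts), addr)
```

UDP's lack of retransmission is acceptable here because a control loop prefers a fresh datagram next cycle over a delayed retransmit of a stale one.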

This software platform provides a simulation environment for two different control approaches. First, traditional controllers can be designed from the dynamics of the musculoskeletal robot according to control theory. Several force control methods, such as proportional-integral-derivative (PID) control, impedance control, and adaptive dynamic control, have been tested in this system, achieving millimeter-level position errors. Second, the environment can be used to train motion-learning controllers: reinforcement learning algorithms, such as proximal policy optimization (PPO) and deep deterministic policy gradient (DDPG), as well as brain-inspired learning algorithms, have been migrated to it to obtain better performance. Unlike the first system in Section 3.1, this software aims to obtain realistic hardware system parameters and provide guidance for constructing a physical system; many experiments must first be tested in simulation to avoid hardware damage. In particular, the modular components of the system provide alternative models and controllers.
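As a reminder of the simplest baseline among the tested force controllers, here is a textbook PID loop. The gains and the first-order plant in the test are arbitrary illustrative choices, not the robot's dynamics.

```python
class PID:
    """Discrete PID controller with a fixed sample time dt."""
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_err = 0.0

    def update(self, target, measured):
        err = target - measured
        self.integral += err * self.dt                  # accumulated error
        deriv = (err - self.prev_err) / self.dt         # error rate
        self.prev_err = err
        return self.kp * err + self.ki * self.integral + self.kd * deriv
```

The integral term is what removes the steady-state offset that a pure proportional controller would leave on the musculoskeletal plant.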

Hardware system
Based on the simulation results, we constructed a hardware platform with the same muscle distribution and structure, as shown in Fig. 3. The lightweight musculoskeletal arm (LM-Arm) is composed of four joints (shoulder, elbow, forearm, and wrist) with seven DOFs and 15 artificial muscles, plus an internally driven hand, following the biological and anatomical structure of the arm and mechanical design principles, as shown in Fig. 1. High-precision angle sensors were installed at each joint; vision sensors and a high-precision tension sensor for each artificial muscle are likewise indispensable. The skeletons were made of nylon using 3D printing, and key connecting components, such as gears, were made of aluminum alloy using computer numerical control machining. We used a motor cable-driven muscle module to better mimic muscle dynamics and coupling relations. All artificial muscle units were installed in the chest cavity, leaving an arm weight of 2 kg. The lengths of the upper arm, forearm, and hand are 330, 390, and 260 mm, respectively, and the terminal load is approximately 3 kg. This design closely reproduces the musculoskeletal structure of the human arm while reducing the weight of the arm and improving structural accuracy and controllability.
The circuit configuration of the LM-Arm includes four levels. The industrial Ethernet for control automation technology (EtherCAT) protocol connects the physical level with the lower level in the Simulink environment under a 1 ms control cycle, transferring feedback on muscle tension, muscle length, and joint angle from the physical level to the lower level. The UDP protocol, with a 2 ms cycle time, connects the lower-level control to the sensor system and algorithms.
The cable transmission from the chest cavity to the limbs introduces severe friction. The power loss caused by friction is a major source of uncertainty for motion control; therefore, we propose a feed-forward compensation approach that identifies the dynamics model of the hardware platform. Friction is mainly divided into pipeline friction, generated as the cable passes through its casing, and friction generated at the linkage joints. Slack-cable phenomena were also considered in this approach. Both model-based compensation and model-free methods were considered in this study; given the complexity of muscle and cable-driven dynamics, the neural-network fitting method proved more effective.
A feed-forward friction compensator can be obtained by training neural networks on data collected from the uncompensated hardware. With these methods, system performance is markedly improved, as shown in Fig. 4. Based on this robust and relatively accurate hardware platform, brain-inspired algorithms for motion control and decision-making can be verified and then applied to the new robotic system.
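The idea of fitting the friction map with a network can be sketched with a tiny pure-Python multilayer perceptron. The architecture, learning rate, and the synthetic Coulomb-plus-viscous friction data in the test are illustrative assumptions, not the compensator actually used on the LM-Arm.

```python
import math
import random

class TinyMLP:
    """One-hidden-layer tanh perceptron trained by plain gradient descent."""
    def __init__(self, n_in=2, n_hidden=8, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        # Hidden activations are cached for the backward pass.
        self.h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                  for row, b in zip(self.w1, self.b1)]
        return sum(w * h for w, h in zip(self.w2, self.h)) + self.b2

    def train_step(self, x, y, lr=0.05):
        err = self.forward(x) - y                 # d(0.5*err^2)/d(pred)
        for j in range(len(self.w2)):
            grad_h = err * self.w2[j] * (1 - self.h[j] ** 2)  # tanh backprop
            self.w2[j] -= lr * err * self.h[j]
            for i in range(len(x)):
                self.w1[j][i] -= lr * grad_h * x[i]
            self.b1[j] -= lr * grad_h
        self.b2 -= lr * err
        return err ** 2
```

Once trained on (cable velocity, load) → friction-torque pairs logged from the uncompensated hardware, the network's prediction is simply added to the feed-forward command each control cycle.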

Conclusions
The human advantages in dexterous operation are apparent. Although state-of-the-art robots can accurately complete millimeter- and even micrometer-scale assembly missions, many complex and repetitive operation and assembly tasks in the computer, communication, and consumer electronics (3C) field still need to be handled by humans. Determining the pivotal advantages that let humans complete such tasks, and the neural mechanisms behind them, can significantly promote the development of advanced robotics. Control sparsity and robustness are two crucial advantages identified in neuroscience. Brain-inspired motion-learning algorithms can use sparse rewards to realize generalized control-policy learning; with this method, robots can accomplish a series of manipulations after simple training, and continuous learning ability equips them to adapt to new environments by adjusting a few parameters. Furthermore, system robustness, which comes from redundancy and anti-interference, can improve system reliability. Beyond neuron redundancy in neural systems, muscle redundancy expands the range of the feasible control-signal space. In addition, the special muscle actuator provides nonlinear dynamics and coupled feedback modulation, which reduce the effects of disturbances from the control input and the environment.
In the future, next-generation robots could be developed with numerous brain-inspired algorithms and novel musculoskeletal structures. Such robots could safely complete human-machine collaboration, benefitting from lightweight musculoskeletal body design and brain-inspired cognitive and control algorithms. In addition, organic structural design and hardware construction should be reinforced and emphasized. We also hope that this generation of robots can provide inspiration and reference for brain-computer interface control. The two systems presented here will be polished, and additional functional modules will be added in future studies; a growing body of research can then be simulated and verified on these platforms.

Open Access
This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.
The images or other third party material in this article are included in the article′s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article′s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Fig. 4 Effectiveness of the friction compensator. (a) Joint trajectory tracking (target vs. measured angles 1−4) with inverse-dynamics feed-forward control and no compensator; (b) the same tracking task with the neural network friction compensator added. The compensator evidently improves the performance of hardware control.