1 Introduction

Legged robots offer significant advantages over wheeled or tracked robots, as these advantages help traverse challenging terrain and unstructured environments. The ability to perform dynamic locomotion over unstructured terrain in unpredictable environments is extremely essential in real-world deployments. However, control of dynamic legged locomotion remains a challenging problem due to the difficulty in dynamical stabilization and the presence of unknown external disturbances. To achieve robust planning and control of dynamic locomotion, it is critical and necessary to adjust the motion plan according to the current robot state.

Fig. 1
figure 1

Planning and control framework. Body velocity commands are sent to the motion planning module, which computes desired body pose, foot positions, and ground reaction forces. The whole-body control module generates desired generalized accelerations and external forces by solving a hierarchical quadratic programming problem. Then the joint torque, velocity and position commands of each joint are computed from the acceleration and force

In recent years, some progresses have been achieved in dynamic legged locomotion with bipedal [1,2,3,4,5] and quadrupedal [6,7,8,9,10,11,12,13,14,15] robots. These solutions can be classified into optimization-based approaches [1, 2, 7, 10, 11] and learning-based approaches [4, 12, 13]. The optimization-based approach has been widely used, but it often needs to solve computationally expensive nonlinear optimization and suffers convergence issues, which makes them challenging to be implemented in real-time. The learning-based approach builds a neural network directly mapping sensor measurements to desired motor control signals but often requires high-quality training data and can not be adjusted partially. Compared with learning-based approaches that lack safety guarantees, the optimization-based approach is more suitable for the motion control of robots because it has the potential to take full-body dynamics and all physical constraints into consideration.

The optimization-based approach can be divided into trajectory optimization approaches [3, 16, 17] and predictive control approaches [7, 10, 11] according to whether they can be run online. The trajectory optimization approach solves the locomotion problem offline over the long-time horizon of a specific motion while optimizing over a large number of variables consisting of contact sequences, contact timings, contact locations and the whole-body trajectory. These variables can be obtained automatically while taking into account physical constraints such as full-body dynamics, joint motion limits, unilateral contact constraints, and friction cone constraints. Conversely, the predictive control approach emphasizes solving the locomotion problem online over the short-time horizon and correcting motions to cope with model mismatches and unpredictable disturbances.

In the last few years, the trajectory optimization approach has been widely used in legged locomotion to generate complex trajectories with whole-body dynamics to explore more possible behaviors, solving the locomotion problem over a long-time horizon at once [3, 16,17,18,19]. For example, a method has been developed for whole-body trajectory generation of multi-limbed robotic systems with unilateral constraints introduced by contact with the environment, which can eliminate the requirement for a priori contact mode ordering and satisfy all dynamic and contact constraints [16]. Although many impressive results are presented, the trajectory optimization approach is too slow to be applied online on the real robot because of the high computational cost of nonlinear optimization problems.

Instead, some works in the predictive control approach focus on the online motion generation and execution on the real robot [6, 10, 11, 20]. For instance, The locomotion problem is formulated as a convex optimization by simplifying the robot dynamics and expressing the robot’s orientation as Euler angles. While realizing the online planning of robot motions, some limitations are also introduced, such as the base cannot have a large range of roll and pitch motions, and the joint torque limits cannot be considered. A representation-free model predictive control framework is proposed to control various 3-D motions of a quadrupedal robot with the rotation matrix representation [11]. Since the rotation matrix is the natural representation of the special orthogonal group SO(3), representing orientation with the rotation matrix instead of Euler angles and quaternions can avoid the singularity and ambiguity issues [21, 22]. Although the orientation limitation is removed, the joint torque limit still can not be imposed. The use of simplified models and the omission of important constraints may make the generated trajectories unrealizable in the real robot.

To combine the advantages of the trajectory optimization approach and predictive control approach, there are some methods [7, 23, 24] trying to solve the locomotion problem online with differential dynamic programming (DDP), considering the whole-body dynamics and other physical constraints. These DDP-like methods only optimize a local control policy rather than control-state trajectories by iteratively solving a low-order Taylor approximation of the nonlinear locomotion problem, which makes them computationally efficient, but also limits their applicability as state constraints cannot be easily incorporated.

Most of the above works use only one model, some use a reduced-order model [6, 10, 11, 25], and some use a full-order model [7, 16, 23, 24, 26]. Actually, the whole locomotion control problem can be decomposed into a motion planning module and a motion tracking module [27,28,29,30]. The latter serves as a trajectory stabilizer calculating joint commands to track the task-space and joint-space trajectory references generated by the motion planner compliantly. The decomposition offers the possibility to use different models in motion generation and execution.

Thanks to the decomposition, this paper proposes a real-time planning and control framework shown in Fig. 1 for robust, dynamic quadrupedal locomotion containing two main modules: motion planning and whole-body control. The motion planner generates task-space references such as base motions, center-of-mass (CoM) motions, feet contact positions and forces, as well as limb motion reference trajectories. The tracking controller can execute various tasks and accomplish multiple control objectives, accounting for whole-body dynamics, manipulating contact forces to control the floating base, and transforming task space references into joint-space commands. Although the gait scheduler and the state estimator are also included in the proposed framework, they will not be explained in detail here since they are out of the scope of this paper. Actually, many gait generation approaches [31, 32] and body pose estimation methods [33,34,35,36] can be used to schedule gait and estimate the base state of quadrupedal robots.

Fig. 2
figure 2

A block diagram illustrates the interaction between the body motion reference, foot placement planning, and contact force optimization submodules

The contributions of this research can be summarized below:

  1. 1.

    A real-time planning and control framework is developed for quadrupedal robots capable of performing robust and dynamic locomotion, mainly consisting of a motion planner and a whole-body controller.

  2. 2.

    The 3-D single rigid body model (3D-SRBM) achieves a balance of expressivity and simplicity in the planning phase, allowing to conduct the planning and tracking of base orientation on SO(3) manifold to realize a wider body rotation range while allowing the planner to run in a receding horizon fashion to cope with disturbances.

  3. 3.

    A trajectory interpolator based on the Bézier curve is integrated with a heuristic-based foothold planner, which provides continuous and smooth Cartesian velocity and position of foot-ends simultaneously.

  4. 4.

    The developed whole-body controller uses hierarchical quadratic optimization coupled with the full system dynamics and physical constraints, which ensures tasks are prioritized based on importance, and joint commands sent to actuators are physically feasible.

  5. 5.

    The effectiveness of the framework is validated in real-world experiments on the A1 robot.

The rest of this paper is organized as follows. The following section presents the motion planner generating body trajectory commands, desired foot reference trajectories, and contact forces. Section 3 presents the whole-body controller formulating the control problem as a series of quadratic programs. Section 4 describes the control framework’s use on the physical robot. Finally, Sect. 5 gives discussions and concludes the paper.

2 Motion Planning

The motion planner receives operator input and generates body motion reference, foot motion trajectories, and optional reaction force profiles. Planning is achieved via three components, detailed in the following sections and illustrated in Fig. 2. The body motion reference sub-module is responsible for generating desired body velocity and pose that are used in the foot placement planning and contact force optimization sub-modules with the scheduled gait and estimated robot state to plan foot positions and reaction forces.

Fig. 3
figure 3

3D-SRBM. Its CoM coincides with the CoM of the robot generated based on the mass and state of the whole robot. However, its orientation equals the base orientation of the robot, and its inertia is generated from a simplified robot body, where each leg is lumped to a point on the corresponding hip

2.1 3D-SRBM

The 3D-SRBM shown in Fig. 3 especially fits those robots with lightweight legs in which the robot base makes the largest contribution to the total mass of the whole robot. The underlying assumptions are: (1) their negligible limb masses do not contribute significantly to the momentum of the whole robotic system, and (2) the leg configuration does not influence the body inertia. This slightly more complex model approximates the robot as a single rigid body subject to external forces at the contact patches, thus it is still possible to consider the angular momentum in the planning phase even though it ignores limb dynamics.

The state of the 3D-SRBM can be collected into a tuple as

$$\begin{aligned} {\varvec{x}}:= \left( {{\varvec{p}}_{B}}, {\dot{{\varvec{p}}}_{B}}, {\varvec{R}}_{B},{^{B}\varvec{\omega }_{B}} \right) \end{aligned}$$

where \({\varvec{p}}_{B} \in {\mathbb {R}}^{3}\) and \({\dot{{\varvec{p}}}_{B}} \in {\mathbb {R}}^{3}\) are the CoM position and velocity, respectively. \({\varvec{R}}_{B} \in \text {SO(3)} = \left\{ {\varvec{R}} \in {\mathbb {R}}^{3\times 3} \vert {\varvec{R}}^\mathrm{{T}}{\varvec{R}} = {\mathbb {I}}, \text {det}({\varvec{R}}) = +1 \right\}\) is the body orientation evolving on the SO(3) manifold. \(\text {det}(\cdot )\) calculates the determinant of a rotation matrix and \({\mathbb {I}}\) indicates the \(3 \times 3\) identity matrix. \(^{B}\varvec{\omega }_{B} \in {\mathbb {R}}^{3}\) represents the angular velocity of the robot body expressed in the body frame \(\{B\}\).

Equations of motion of the 3D-SRBM are given by

$$\begin{aligned} m\ddot{{\varvec{p}}}_{B}&= \sum _{i=1}^{n}{{\varvec{f}}_i} + m{\varvec{a}}_\mathrm{{g}}, \end{aligned}$$
$$\begin{aligned} \frac{\textrm{d}}{\textrm{d}t}\left( {^{B}{\varvec{I}}} {^{B}\varvec{\omega }_{B}} \right)&= \sum _{i=1}^{n}{{\varvec{R}}^\mathrm{{T}}_{B} {\varvec{r}}_i} \times {\varvec{f}}_i , \end{aligned}$$

where m is the body mass; \({\varvec{f}}_i\) and \({\varvec{a}}_\mathrm{{g}}\) are three dimensional vectors representing the contact force, and gravitational acceleration; \(^{B}{\varvec{I}} \in {\mathbb {R}}^{3 \times 3}\) is the equivalent rotational inertia tensor expressed in the body frame \(\{B\}\); n is the number of contacts; \({\varvec{r}}_i\) is the location of i-th foot \({\varvec{p}}_{i}\) relative to the CoM of the robot\({\varvec{p}}_{B}\), which is equivalent to the moment arm of the contact force; \({\varvec{f}}_{i}\) is the contact force of the i-th foot. By default, and unless otherwise stated, all of the above quantities are relative to the inertial frame \(\{W\}\).

In the dynamics shown in Eqs. (2) and (3), \({\varvec{r}}_i\) is known when the foothold location \({\varvec{p}}_i\) is planned in advance. The contact force \({\varvec{f}}_{i} \in {\mathbb {R}}^{3}\) is chosen as the system input. These contact forces create an external wrench

$$\begin{aligned} \varvec{{\mathcal {F}}} = \begin{bmatrix} {\varvec{f}} \\ {^{B}\varvec{\tau }} \end{bmatrix} = \sum _{i=1}^{n} \begin{bmatrix} {\varvec{1}} \\ {\varvec{R}}^\mathrm{{T}}_{B}\hat{{\varvec{r}}}_{i} \end{bmatrix} {\varvec{f}}_{i}, \end{aligned}$$

where \(i \in \{1,2,3,4\}\) is the foot index for the front left (FL), front right (FR), rear left (RL), and rear right (RR), respectively. \({\varvec{f}}\) and \(^{B}\varvec{\tau }\) are the total force and torque, respectively; the hat operator \(\hat{(\cdot )}\) represents the cross-product operation.

Therefore, The dynamics of the rigid body Eqs. (2) and (3) can be rewritten as

$$\begin{aligned} \dot{{\varvec{x}}} = \begin{bmatrix} \dot{{\varvec{p}}}_{B} \\ \ddot{{\varvec{p}}}_{B} \\ \dot{{\varvec{R}}}_{B} \\ {^{B}\dot{\varvec{\omega }}_{B}} \end{bmatrix} = \begin{bmatrix} \dot{{\varvec{p}}}_{B} \\ \frac{1}{m}{\varvec{f}} + {\varvec{a}}_\mathrm{{g}} \\ {\varvec{R}}_{B} \cdot {^{B}\varvec{\omega }_{B}} \\ {^{B}{\varvec{I}}^{-1}} \left( {^{B}\varvec{\tau }} - {^{B}\hat{\varvec{\omega }}_{B}} {^{B}{\varvec{I}}} {^{B}\varvec{\omega }_{B}} \right) \end{bmatrix}. \end{aligned}$$

2.2 Body Motion Reference

The body pose trajectory evolves on the special Euclidean SE(3) group. Although its motion can be divided as a translational motion in vector space and a rotational motion in special orthogonal group SO(3), it is still impossible to directly apply vector operations to the trajectory due to the special structure of SO(3) manifold. Unlike other representations such as the Euler angles and the unit quaternions, the natural nine parameters representation of the body orientation as an element of SO(3) is unique and non-singular. Therefore, it is more appropriate to use the rotation matrix representation.

The desired velocity \({\varvec{V}}_{B,k} = \left( ^{W}\varvec{\omega }_{B,k}, ^{W}{\varvec{v}}_{B,k}\right)\) and pose \({\varvec{X}}_{B,k} = \left( {\varvec{R}}_{B,k}, {\varvec{p}}_{B,k} \right)\) are updated at every time step k. The above values are calculated with a reference velocity input \({\varvec{V}} = \left( \varvec{\omega }_{k+1}^\mathrm{{cmd}}, {\varvec{v}}_{k+1}^\mathrm{{cmd}} \right)\) expressed in the body coordinate from a complex path planner or a user-operated joystick. The desired body velocity and pose are updated as

$$\begin{aligned} {^{W}{\varvec{v}}}_{B,k+1}&= {\varvec{R}}_{B,k} {\varvec{v}}_{k+1}^\mathrm{{cmd}} , \end{aligned}$$
$$\begin{aligned} {{\varvec{p}}}_{B,k+1}&= {{\varvec{p}}_{B,k}} + {^{W}{\varvec{v}}}_{B,k+1} \Delta t, \end{aligned}$$
$$\begin{aligned} {^{W}\varvec{\omega }}_{B,k+1}&= {{\varvec{R}}_{B,k}} \varvec{\omega }_{k+1}^\mathrm{{cmd}}, \end{aligned}$$
$$\begin{aligned} {{\varvec{R}}}_{B,k+1}&= {{\varvec{R}}_{B,k}} \text {Exp} \left( \varvec{\omega }_{k+1}^\mathrm{{cmd}} \Delta t \right) , \end{aligned}$$

where \(\Delta t\) is the timestep, and the mapping function \(\text {Exp}: {\mathbb {R}}^{3} \rightarrow \text {SO(3)}\) is the retraction for \(\text {SO(3)}\).

2.3 Foot Placement Planning

The desired foothold location is determined by a foot placement planner designed based on the Raibert Heuristic [37] and Capture Point [38]. The main idea is to force the leg’s landing angle to be the same as its leaving angle if the robot moves at the commanded speed. the location of foot i is calculated with the following equation:

$$\begin{aligned} {\varvec{p}}_{i} = {\varvec{p}}_{B} + {\varvec{R}}_{z}\left( \psi \right) {^{B}{\varvec{l}}_{i}} + \frac{T_\mathrm{{st}}}{2} {\varvec{v}}_{i} + K \left( {\varvec{v}}_{i} - {{\varvec{v}}_{i}^\mathrm{{cmd}}}\right) \end{aligned}$$

where \({\varvec{p}}_{B}\) is the body position, \({\varvec{R}}_{z}\left( \psi \right)\) is the body orientation expressed as a rotation matrix, only the rotation along z-axis is considered. \(^{B}{{\varvec{l}}_{i}}\) is the position of hip i relative to the body position \({\varvec{p}}_{B}\), \(T_\mathrm{{st}}\) is the stance time for each foot, \({\varvec{v}}_{i}\) is the velocity of hip i, \({\varvec{v}}_{i}^\mathrm{{cmd}}\) is the desired velocity of hip i which can be calculated by the body linear and angular velocity commands, and K represents a gain, usually taking the value 0.03.

Although the expected foothold location is given by Eq. (10), the foot still needs to lift off and touch down periodically to make the robot move. During the swing phase, when the foot lifts off in the air, the planned velocity and position of the foot should be smooth enough for the whole-body controller to track. The Bézier curve is chosen to generate the swing foot trajectory, for it just needs a set of discrete control points to define a smooth and continuous curve.

Fig. 4
figure 4

A schematic diagram showing how to generate swing foot trajectory with the quadratic Bézier curve. Three black circles represent the start point, control point, and end point, respectively. The green thick line represents the generated swing foot trajectory. The light blue denotes the phase signal

Given the start foot-end position \({\varvec{p}}_{i,k}^\mathrm{{s}}\) recorded when the foot lifts off, the desired foot-end position \({\varvec{p}}_{i,k}^\mathrm{{e}}\) calculated by Eq. (10), the phase signal \(\phi _{i,k}\) from the gait scheduler, and the maximum foot clearance \(h_f\), the quadratic Bézier curve that requires only three control points is naturally suitable for interpolating out the reference velocity and position of foot i in swing-phase at the k-th timestep. Figure 4 illustrates the diagram of generating swing foot trajectory with the quadratic Bézier curve.

The conventional procedure to use the quadratic Bézier curve is to first determine the coordinates of the control point \({\varvec{p}}_{i,k}^\mathrm{\mathrm{{c}}}\). Then it inputs the coordinates of start point \({\varvec{p}}_{i,k}^\mathrm{{s}}\), control point \({\varvec{p}}_{i,k}^\mathrm{{c}}\) and end point \({\varvec{p}}_{i,k}^\mathrm{{e}}\) into standard Bézier curve algorithms. If the projection of control point \({\varvec{p}}_{i,k}^\mathrm{{c}}\) on the xy plane coincides with the middle point \({\varvec{p}}_{i,k}^\mathrm{{m}}\), the conventional procedure can be simplified to firstly obtain foot position and velocity with only the start and end points, and then correct the z-component with the desired foot clearance. These two steps are shown as follows

$$\begin{aligned}&B_{k}(\phi _{i,k}) = \phi _{i,k}^{3} + 3 \phi _{i,k}^{2} \left( 1 - \phi _{i,k}\right) \end{aligned}$$
$$\begin{aligned}&B_{k}^{'}(\phi _{i,k}) = 6 \phi _{i,k} \left( 1 - \phi _{i,k}\right) \end{aligned}$$
$$\begin{aligned}&{\varvec{p}}_{i,k}^\mathrm{{ref}} = {\varvec{p}}_{i,k}^\mathrm{{s}} + B_{k}(\phi _{i,k}) \cdot \left( {\varvec{p}}_{i,k}^\mathrm{{e}} - {\varvec{p}}_{i,k}^\mathrm{{s}}\right) / T_\mathrm{{sw}} \end{aligned}$$
$$\begin{aligned}&{\varvec{v}}_{i,k}^\mathrm{{ref}} = B_{k}^{'}(\phi _{i,k}) \cdot \left( {\varvec{p}}_{i,k}^\mathrm{{e}} - {\varvec{p}}_{i,k}^\mathrm{{s}}\right) / T_\mathrm{{sw}}^{2} \end{aligned}$$

where, \(B_{k}\) is the Bézier curve, \(B_{k}^{'}\) is the derivative of the Bézier curve, \(T_\mathrm{{sw}}\) is the swing time for each foot, \({\varvec{p}}_{i,k}^\mathrm{{ref}}\) and \({\varvec{v}}_{i,k}^\mathrm{{ref}}\) are interpolated foot-end position and velocity. The foot height and vertical velocity are corrected with

$$\begin{aligned}{} & {} \begin{aligned}&{\varvec{p}}_{i,k}^\mathrm{{ref}}\left( 2\right) = {\varvec{p}}_{i,k}^\mathrm{{s}}\left( 2\right) \\&\qquad + {\left\{ \begin{array}{ll} B_{k}\left( 2\phi _{i,k}\right) \cdot \left( {\varvec{p}}_{i,k}^\mathrm{{s}}(2) + h_f - {\varvec{p}}_{i,k}^\mathrm{{s}}(2)\right) \frac{2}{T_\mathrm{{sw}}}, \\ \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \phi _{i,k} \le 0.5 \\ B_{k}\left( 2\phi _{i,k}-1\right) \cdot \left( {\varvec{p}}_{i,k}^\mathrm{{e}}(2) - \left( {\varvec{p}}_{i,k}^\mathrm{{s}}(2) + h_f\right) \right) \frac{2}{T_\mathrm{{sw}}}, \\ \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \phi _{i,k} > 0.5 \end{array}\right. } \end{aligned} \end{aligned}$$
$$\begin{aligned}{} & {} \begin{aligned}&{\varvec{v}}_{i,k}^\mathrm{{ref}}\left( 2\right) \\&\quad = {\left\{ \begin{array}{ll} B_{k}^{'}(2\phi _{i,k}) \cdot \left( {\varvec{p}}_{i,k}^\mathrm{{s}}(2) + h_f - {\varvec{p}}_{i,k}^\mathrm{{s}}(2)\right) \frac{4}{T_\mathrm{{sw}}^{2}}, \\ \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \phi _{i,k} \le 0.5 \\ B_{k}^{'}(2\phi _{i,k}-1) \cdot \left( {\varvec{p}}_{i,k}^\mathrm{{e}}(2) - \left( {\varvec{p}}_{i,k}^\mathrm{{s}}(2) + h_f\right) \right) \frac{4}{T_\mathrm{{sw}}^{2}}, \\ \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \phi _{i,k} > 0.5 \end{array}\right. }. \end{aligned} \end{aligned}$$

2.4 Contact Force Optimization

The desired contact forces are calculated by an optimization problem making the robot follow the given body trajectory updated by equations from (6) to (9). To make the optimization convex, some approximations [6] can be made. To be specific, the system dynamics Eq. (5) and the contact friction cone constraint should be linearized.

The body orientation can be parametrized as ZYX Euler angles \(\varvec{\Theta } = \begin{bmatrix} \phi&\theta&\psi \end{bmatrix}^\mathrm{{T}}\), where \(\phi\), \(\theta\) and \(\psi\) are the roll, pitch and yaw angles, respectively. Therefore, the transformation from the body frame to the inertial frame can be decomposed as

$$\begin{aligned} {{\varvec{R}}_{B}} = {\varvec{R}}_{z}(\psi ) {\varvec{R}}_{y}(\theta ) {\varvec{R}}_{x}(\phi ), \end{aligned}$$

where \({\varvec{R}}_{x}(\phi )\), \({\varvec{R}}_{y}(\theta )\) and \({\varvec{R}}_{z}(\psi )\) are rotations of \(\phi\), \(\theta\) and \(\psi\) about the X-, Y-, and Z-axis, respectively.

With this parametrization, the relationship between the body angular velocity and the rate of change of Euler angles can be expressed as

$$\begin{aligned} {^{W}\varvec{\omega }_{B}} = \begin{bmatrix} \cos (\theta )\cos (\psi ) &{} -\sin (\psi ) &{} 0 \\ \cos (\theta )\sin (\psi ) &{} \cos (\psi ) &{} 0 \\ 0 &{} 0 &{} 1 \end{bmatrix} \begin{bmatrix} {\dot{\phi }} \\ {\dot{\theta }} \\ {\dot{\psi }} \end{bmatrix}. \end{aligned}$$

If roll and pitch angles are small and the body is not erected or inverted, the rate of Euler angles is

$$\begin{aligned} \begin{aligned} \dot{\varvec{\Theta }}&\approx \begin{bmatrix} \cos (\psi ) &{} \sin (\psi ) &{} 0 \\ -\sin (\psi ) &{} \cos (\psi ) &{} 0 \\ 0 &{} 0 &{} 1 \end{bmatrix} {^{W}\varvec{\omega }_{B}} \\&= {\varvec{R}}_{z}^\mathrm{{T}}(\psi ) {^{W}\varvec{\omega }_{B}} = {^{B}\varvec{\omega }_{B}}. \end{aligned} \end{aligned}$$

It is worth noting that the rotation order is critical, with other order, the above approximation will not hold.

If the angular velocity is small, the term \(\varvec{\omega } \times \left( {\varvec{I}}\varvec{\omega }\right)\) will not contribute significantly to the body orientation dynamics. Then, the orientation dynamics can be approximated with

$$\begin{aligned} \frac{\textrm{d}}{\textrm{d}t}\left( {\varvec{I}} \varvec{\omega } \right) = {\varvec{I}}\dot{\varvec{\omega }} + \varvec{\omega } \times \left( {\varvec{I}} \varvec{\omega } \right) \approx {\varvec{I}}\dot{\varvec{\omega }}. \end{aligned}$$

With small roll and pitch angles assumption, the inertia tensor and the orientation dynamics become

$$\begin{aligned}{} & {} {^{W}{\varvec{I}}} = {{\varvec{R}}_{B}} {^{B}{\varvec{I}}} {{\varvec{R}}_{B}^\mathrm{{T}}} = {\varvec{R}}_{z}(\psi ) {^{B}{\varvec{I}}} {\varvec{R}}_{z}^\mathrm{{T}}(\psi ), \end{aligned}$$
$$\begin{aligned}{} & {} \begin{aligned} {^{W}{\varvec{I}}} {^{W}\dot{\varvec{\omega }}_{B}}&= {\varvec{R}}_{z}(\psi ) {^{B}{\varvec{I}}} {\varvec{R}}_{z}^\mathrm{{T}}(\psi ) {\varvec{R}}_{z}(\psi ) {^{B}\dot{\varvec{\omega }}_{B}} \\&= \sum _{i=1}^{n}{{\varvec{r}}_i} \times {{\varvec{f}}_i} \end{aligned}. \end{aligned}$$

After combining Eqs. (2), (19) and (20), the linearized robot dynamics can be expressed as

$$\begin{aligned} \ddot{{\varvec{q}}}_{B} = {\varvec{M}} {\varvec{f}} + \bar{{\varvec{g}}} \end{aligned}$$


$$\begin{aligned} \begin{aligned} \ddot{{\varvec{q}}}_{B}&= \begin{bmatrix} {\ddot{{\varvec{p}}}^\mathrm{{T}}_{B}}&{^{B}\dot{\varvec{\omega }}^\mathrm{{T}}_{B}} \end{bmatrix}^\mathrm{{T}}, \\ {\varvec{M}}&= \begin{bmatrix} {\varvec{1}}_{3}/m &{} \cdots &{} {\varvec{1}}_{3}/m \\ {^{B}{\varvec{I}}^{-1}} {\varvec{R}}_{z}^\mathrm{{T}}(\psi ) \hat{{\varvec{r}}}_{1} &{} \cdots &{} {^{B}{\varvec{I}}^{-1}} {\varvec{R}}_{z}^\mathrm{{T}}(\psi ) \hat{{\varvec{r}}}_{n} \end{bmatrix}, \\ {\varvec{f}}&= \begin{bmatrix} {{\varvec{f}}_1}^\mathrm{{T}}&\cdots&{{\varvec{f}}_n}^\mathrm{{T}} \end{bmatrix}^\mathrm{{T}}, \\ \bar{{\varvec{g}}}&= \begin{bmatrix} {\varvec{a}}_\mathrm{{g}}^\mathrm{{T}}&{\varvec{0}}_{3}^\mathrm{{T}} \end{bmatrix}^\mathrm{{T}}. \end{aligned} \end{aligned}$$

When the foot is in stance phase, a friction cone constraint must be imposed on the contact force \({\varvec{f}} \in {\mathbb {R}}^{3}\) as follows

$$\begin{aligned} \sqrt{f_{x}^{2} + f_{y}^{2}} \leqslant \mu f_{z}, \quad f_{z} \geqslant 0. \end{aligned}$$

Since the friction cone constraint poses some computational challenges, a better option is to consider its conservative pyramid approximation

$$\begin{aligned} |f_{x} |\leqslant \mu f_{z}, \quad |f_{y} |\leqslant \mu f_{z}, \quad f_{z} \geqslant 0. \end{aligned}$$

Given a desired body pose \({\varvec{q}}_{B}^\mathrm{{des}} = \begin{bmatrix} {\varvec{p}}_{B}^\mathrm{{T}}&\varvec{\Theta }^\mathrm{{T}} \end{bmatrix}^\mathrm{{T}}\) and velocity \(\dot{{\varvec{q}}}_{B}^\mathrm{{des}} = \begin{bmatrix} \dot{{\varvec{p}}}_{B}^\mathrm{{T}}&\dot{\varvec{\Theta }}^\mathrm{{T}} \end{bmatrix}^\mathrm{{T}}\), we use PD control to compute the desired acceleration

$$\begin{aligned} \ddot{{\varvec{q}}}_{B}^\mathrm{{des}} = K_\mathrm{{p}}\left( {\varvec{q}}_{B}^\mathrm{{des}} - {\varvec{q}}_{B}\right) + K_\mathrm{{d}}\left( \dot{{\varvec{q}}}_{B}^\mathrm{{des}} - \dot{{\varvec{q}}}_{B}\right) \end{aligned}$$

where \(K_\mathrm{{p}}\) is position gain, and \(K_\mathrm{{d}}\) is velocity gain. \({\varvec{q}}_{B}\) and \(\dot{{\varvec{q}}}_{B}\) are the current body pose and velocity. With equations from (23) to (24) and Eq. (27), a Quadratic Program (QP) can be constructed to find contact forces, which includes an acceleration error cost, contact selection constraints and linearized friction cone constraints.

$$\begin{aligned} \begin{aligned} \underset{{\varvec{f}}}{\text {min}} \quad&\frac{1}{2} \Vert {\varvec{M}} {\varvec{f}} + \bar{{\varvec{g}}} - \ddot{{\varvec{q}}}_{B}^\mathrm{{des}} \Vert _{{\textbf{Q}}} + \frac{1}{2} \Vert {\varvec{f}}\Vert _{{\textbf{R}}} \\ \mathrm{s.t.} \quad&{\varvec{H}}{\varvec{f}} \leqslant 0 \quad \text {(friction pyramid)} \\&{\varvec{D}}{\varvec{f}} = 0 \quad \text {(contact selection)} \end{aligned} \end{aligned}$$

where \({\textbf{Q}}\) and \({\textbf{R}}\) represent diagonal weight matrices, and \({\varvec{H}}\) is the friction pyramid matrix corresponding to friction cone constraints Eq. (26), and \({\varvec{D}}\) is the contact selection matrix for swing feet.

Fig. 5
figure 5

A block diagram illustrates the interaction between the task formulation, hierarchical optimization, and hybrid joint controller submodules

3 Whole-Body Control

After the motion planner generates reference motions and reaction forces, the whole-body controller computes joint torque, velocity and position commands. Motion tracking is achieved through three submodules and the interaction between them is illustrated in Fig. 5. First, the task formulation sub-module formulates task-space tasks as linear functions of generalized accelerations and external forces and puts them in a task hierarchy. Then, the hierarchy is solved by the hierarchical optimization sub-module to get the optimal generalized accelerations and external forces. Finally, the desired joint torques is calculated from the optimal accelerations and forces in the hybrid joint controller sub-module.

3.1 System Underactuation

The quadruped robot shown in Fig. 6 belongs to the multi-body dynamic system, and its equations of motion is relatively complicated, written as

$$\begin{aligned} {\varvec{M}}({\varvec{q}}) \dot{{\varvec{v}}} + {\varvec{h}}({\varvec{q}}, {\varvec{v}}) = {\varvec{S}}^\mathrm{{T}} \varvec{\tau } + {\varvec{J}}_{c}^\mathrm{{T}}({\varvec{q}}) \varvec{\lambda }, \end{aligned}$$

where \({\varvec{q}} \in {\mathbb {R}}^{n_q}\) and \({\varvec{v}} \in {\mathbb {R}}^{n_v}\) are generalized coordinates and velocities, and for those robotic systems with floating-bases, \(n_q\) is different with \(n_v\). \({\varvec{M}}({\varvec{q}}) \in {\mathbb {R}}^{n_v \times n_v}\) stands for the generalized inertia matrix of the robot, and \({\varvec{h}}({\varvec{q}}, {\varvec{v}}) \in {\mathbb {R}}^{n_v}\) is the nonlinear term (including Coriolis, centrifugal and gravity forces). \({\varvec{S}} = \begin{bmatrix} {\varvec{0}}_{n_{j} \times (n_{v}-n_{j})}&{\varvec{I}}_{n_{j} \times n_{j}} \end{bmatrix}\) is the selection matrix, representing the system under-actuation that the floating-base is not directly actuated by joint torques \(\tau \in {\mathbb {R}}^{n_{j}}\). \({\varvec{J}}_{c} = \begin{bmatrix} {\varvec{J}}_{c_1}^\mathrm{{T}}&\ldots&{\varvec{J}}_{c_{n_c}}^\mathrm{{T}} \end{bmatrix} ^\mathrm{{T}} \in {\mathbb {R}}^{3n_c \times n_v}\) is the Jacobian matrix mapping contact forces \(\varvec{\lambda } \in {\mathbb {R}}^{3n_c}\) to the joint-space torques.

Fig. 6
figure 6

Full body dynamic model. Blue represents task-space motion and force references, red indicates the full-body dynamics, and brown represents joint commands

The equations of motion Eq. (29) of a physical system describes the relationship between the accelerations \(\dot{{\varvec{v}}}\), the external contact forces \(\varvec{\lambda }\) and the joint torques \(\varvec{\tau }\). For a quadruped robot, its equations of motion can be decomposed into underactuated and actuated parts

$$\begin{aligned} {\varvec{M}}_\mathrm{{u}}({\varvec{q}})\dot{{\varvec{v}}} + {\varvec{h}}_\mathrm{{u}}({\varvec{q}}, {\varvec{v}})&= {\varvec{J}}_{c,\textrm{u}}^\mathrm{{T}} \varvec{\lambda } , \end{aligned}$$
$$\begin{aligned} {\varvec{M}}_\mathrm{{a}}({\varvec{q}})\dot{{\varvec{v}}} + {\varvec{h}}_\mathrm{{a}}({\varvec{q}}, {\varvec{v}})&= \varvec{\tau } + {\varvec{J}}_{c,\textrm{a}}^\mathrm{{T}} \varvec{\lambda } , \end{aligned}$$

where Eq. (30) excluding joint torques is the first 6 equations of Eq. (29) related to the floating base, and Eq. (31) is the rest of the equations. Equation (30) expresses the relationship between the centroidal momentum of the whole robot and external contact forces. The joint torques \(\varvec{\tau }\) only appear in Eq. (31) and is linearly dependent on \(\dot{{\varvec{v}}}\), so \(\varvec{\lambda }\) can be expressed as

$$\begin{aligned} \varvec{\tau } = {\varvec{M}}_\mathrm{{a}}({\varvec{q}})\dot{{\varvec{v}}} + {\varvec{h}}_\mathrm{{a}}({\varvec{q}}, {\varvec{v}}) - {\varvec{J}}_{c,\textrm{a}}^\mathrm{{T}} \varvec{\lambda }. \end{aligned}$$

For any combination of generalized accelerations \(\dot{{\varvec{v}}}\) and external contact forces \(\varvec{\lambda }\), there always exists a solution for joint torques \(\varvec{\tau }\). All occurrences of the joint torques can be replaced with Eq. (32), so it is enough to use Eq. (30) instead of Eq. (29) as the dynamic constraint during the optimization. The optimization variable \({\varvec{z}}\) can be chosen as

$$\begin{aligned} {\varvec{z}} = \begin{bmatrix} \dot{{\varvec{v}}}^\mathrm{{T}}&\varvec{\lambda }^\mathrm{{T}} \end{bmatrix}^\mathrm{{T}} \in {\mathbb {R}}^{n_{v} + 3n_{c}}. \end{aligned}$$

This decomposition largely reduces the number of optimization variables. As a result, it reduces solution time and allows the controller to run at a frequency of 500 Hz in real test.

3.2 Task Formulation

All robot tasks are often modelled in the form of a general task \({\textbf{T}}\) expressed as constraints on the optimization variable \({\varvec{z}}\), including linear equality and/or inequality:

$$\begin{aligned} {\textbf{T}}: {\left\{ \begin{array}{ll} {\varvec{A}}{\varvec{z}}-{\varvec{b}} = {\varvec{s}}_\mathrm{{eq}} \\ {\varvec{C}}{\varvec{z}}-{\varvec{d}} \leqslant {\varvec{s}}_\mathrm{{in}} \end{array}\right. } \end{aligned}$$

where \({\varvec{A}} \in {\mathbb {R}}^{m \times (n_{v} + 3n_{c})}\) is the equality constraint matrix, \({\varvec{b}} \in {\mathbb {R}}^{m}\) is the equality constraint vector, \({\varvec{C}} \in {\mathbb {R}}^{k \times (n_{v} + 3n_{c})}\) is the inequality constraint matrix, and \({\varvec{d}} \in {\mathbb {R}}^{k}\) is the inequality constraint vector. \({\varvec{s}}_\mathrm{{eq}} \in {\mathbb {R}}^{m}\) and \({\varvec{s}}_\mathrm{{in}} \in {\mathbb {R}}^{k}\) are slack variables. It should be noted that, for a specific task, the above equality constraint and inequality constraint may exist together or independently.

(1) Dynamic consistency To ensure physical consistency, the underactuated part Eq. (30) can be expressed in the general task form Eq. (34) with

$$\begin{aligned} \begin{aligned} {\varvec{A}}_\mathrm{{dyn}}&= \begin{bmatrix} {\varvec{M}}_\mathrm{{u}}({\varvec{q}})&{\varvec{J}}_{c,\textrm{u}}^\mathrm{{T}} \end{bmatrix}, \\ {\varvec{b}}_\mathrm{{dyn}}&= {\varvec{h}}_\mathrm{{u}}({\varvec{q}}, {\varvec{v}}), \end{aligned} \end{aligned}$$

which serves as the dynamic consistency constraint.

(2) Torque saturation limits To ensure the solved joint torques are physically consistent, the joint torque limitations should be considered. The torque limitation constraint can be formulated in the general task form Eq. (34) with

$$\begin{aligned} \begin{aligned} {\varvec{C}}_\mathrm{{tau}}&= \begin{bmatrix} -{\varvec{M}}_\mathrm{{a}}({\varvec{q}}) &{} {\varvec{J}}_{c,\textrm{a}}^\mathrm{{T}} \\ {\varvec{M}}_\mathrm{{a}}({\varvec{q}}) &{} -{\varvec{J}}_{c,\textrm{a}}^\mathrm{{T}} \end{bmatrix}, \\ {\varvec{d}}_\mathrm{{tau}}&= \begin{bmatrix} {\varvec{h}}_\mathrm{{a}}({\varvec{q}}, {\varvec{v}}) - \varvec{\tau }^\mathrm{{min}} \\ -{\varvec{h}}_\mathrm{{a}}({\varvec{q}}, \dot{{\varvec{q}}}) + \varvec{\tau }^\mathrm{{max}} \end{bmatrix}. \end{aligned} \end{aligned}$$

where \(\varvec{\tau }^\mathrm{{min}}\) and \(\varvec{\tau }^\mathrm{{max}}\) are the lower and upper torque limits, respectively.

(3) Contact force limits When the point feet are in contact with the ground, friction cone constraints on the resulting reaction forces \(\varvec{\lambda } \in {\mathbb {R}}^{3n_c}\) should be enforced. All contact force constraints for active contacts have been derived from Eq. (25) to Eq. (26), so the contact force limits constraint is expressed as

$$\begin{aligned} \begin{aligned} {\varvec{C}}_\mathrm{{cf}}&= \begin{bmatrix} {\varvec{0}}&{\varvec{H}} \end{bmatrix}, \\ {\varvec{d}}_\mathrm{{cf}}&= {\varvec{0}}. \end{aligned} \end{aligned}$$

(4) Contact motion task When the point feet are in stance phase, three constraint equations are introduced \({\varvec{p}}_{i}(t) = \textrm{const}\). Then, the constraint is differentiated twice to generate the following:

$$\begin{aligned} {\dot{{\varvec{p}}}_{i}} = {{\varvec{J}}_{\textrm{c}_{i}}}{\varvec{v}} = {\varvec{0}}, \quad {\ddot{{\varvec{p}}}_{i}} = {{\varvec{J}}_{\textrm{c}_{i}}}\dot{{\varvec{v}}} + {\dot{{\varvec{J}}}_{\textrm{c}_{i}}}{\varvec{v}} = {\varvec{0}}, \end{aligned}$$

where \({\varvec{J}}_{\textrm{c}_{i}}\) is the corresponding end-effector Jacobian. By stacking the above constraints for all active contacts together, the contact motion task can be formulated as

$$\begin{aligned} \begin{aligned} {\varvec{A}}_\mathrm{{cm}}&= \begin{bmatrix} {{\varvec{J}}_\textrm{c}}&{\varvec{0}} \end{bmatrix}, \\ {\varvec{b}}_\mathrm{{cm}}&= -{\dot{{\varvec{J}}}_\textrm{c}}{\varvec{v}}. \end{aligned} \end{aligned}$$

(5) Contact force regularization task Given reference contact forces \(\varvec{\lambda }^\mathrm{{ref}}\) generated by Eq. (28), we can also directly penalize the optimization variable \({\varvec{z}}\) from reference values, with

$$\begin{aligned} \begin{aligned} {\varvec{A}}_\mathrm{{cfr}}&= \begin{bmatrix} {\varvec{0}}&{\varvec{I}} \end{bmatrix}, \\ {\varvec{b}}_\mathrm{{cfr}}&= \varvec{\lambda }^\mathrm{{ref}}. \end{aligned} \end{aligned}$$

This contact force regularization task is helpful for directly controlling specific contact forces. It also provides regularization on the optimization variable \({\varvec{z}}\) to solve the QP problem faster.

(6) Cartesian space position control task For the body with position \({\varvec{p}} \in {\mathbb {R}}^{3}\) and linear velocity \(\dot{{\varvec{p}}} \in {\mathbb {R}}^{3}\), the relationship between linear velocity in Cartesian space and the generalized velocity is

$$\begin{aligned} \dot{{\varvec{p}}} = {\varvec{J}}_\mathrm{{pos}} \left( {\varvec{q}}\right) {\varvec{v}}, \end{aligned}$$

which can be differentiated as

$$\begin{aligned} \ddot{{\varvec{p}}} = {\varvec{J}}_\mathrm{{pos}}\left( {\varvec{q}}\right) \dot{{\varvec{v}}} + \dot{{\varvec{J}}}_\mathrm{{pos}}\left( {\varvec{q}},{\varvec{v}}\right) {\varvec{v}}. \end{aligned}$$

The deviation from the desired Cartesian linear acceleration can be penalized with

$$\begin{aligned} \begin{aligned} {\varvec{A}}_\mathrm{{pos}}&= \begin{bmatrix} {\varvec{J}}_\mathrm{{pos}}\left( {\varvec{q}}\right)&{\varvec{0}} \end{bmatrix}, \\ {\varvec{b}}_\mathrm{{pos}}&= \ddot{{\varvec{p}}}^\mathrm{{ref}} + {\varvec{K}}_\mathrm{{p}}^\mathrm{{pos}} ({\varvec{p}}^\mathrm{{ref}} - {\varvec{p}}) \\&\quad + {\varvec{K}}_\mathrm{{d}}^\mathrm{{pos}} (\dot{{\varvec{p}}}^\mathrm{{ref}} - \dot{{\varvec{p}}}) - \dot{{\varvec{J}}}_\mathrm{{pos}}\left( {\varvec{q}},{\varvec{v}}\right) {\varvec{v}}. \end{aligned} \end{aligned}$$

where \({\varvec{K}}_\mathrm{{p}}^\mathrm{{pos}}\) and \({\varvec{K}}_\mathrm{{d}}^\mathrm{{pos}}\) are proportional gain and derivative gain, usually taking diagonal positive definite matrices. \({\varvec{p}}^\mathrm{{ref}}\), \(\dot{{\varvec{p}}}^\mathrm{{ref}}\) and \(\ddot{{\varvec{p}}}^\mathrm{{ref}}\) are specified by a high level motion planner, \({\varvec{p}}\) and \(\dot{{\varvec{p}}}\) are computed with forward kinematics using the estimated state. Many tasks, such as CoM position control task, Swing foot position control task, are specified with this form.

(7) Cartesian space orientation control task For the body with orientation \({\varvec{R}} \in \text {SO(3)}\) and angular velocity \(\varvec{\omega } \in {\mathbb {R}}^{3}\), the relationship between angular velocity in Cartesian space and the generalized velocity is

$$\begin{aligned} \varvec{\omega } = {\varvec{J}}_\mathrm{{ori}} \left( {\varvec{q}}\right) {\varvec{v}}, \end{aligned}$$

which can be differentiated as

$$\begin{aligned} \dot{\varvec{\omega }} = {\varvec{J}}_\mathrm{{ori}}\left( {\varvec{q}}\right) \dot{{\varvec{v}}} + \dot{{\varvec{J}}}_\mathrm{{ori}}\left( {\varvec{q}},{\varvec{v}}\right) {\varvec{v}}. \end{aligned}$$

The deviation from the desired Cartesian angular acceleration can be penalized with

$$\begin{aligned} \begin{aligned} {\varvec{A}}_\mathrm{{ori}}&= \begin{bmatrix} {\varvec{J}}_\mathrm{{ori}}\left( {\varvec{q}}\right)&{\varvec{0}} \end{bmatrix}, \\ {\varvec{b}}_\mathrm{{ori}}&= \dot{\varvec{\omega }}^\mathrm{{ref}} + {\varvec{K}}_\mathrm{{p}}^\mathrm{{ori}} {\varvec{R}} \text {Log} \left( {\varvec{R}}^\mathrm{{T}}{\varvec{R}}^\mathrm{{ref}}\right) \\&\quad + {\varvec{K}}_\mathrm{{d}}^\mathrm{{ori}} (\varvec{\omega }^\mathrm{{ref}} - \varvec{\omega }) - \dot{{\varvec{J}}}_\mathrm{{ori}}\left( {\varvec{q}},{\varvec{v}}\right) {\varvec{v}}. \end{aligned} \end{aligned}$$

where \({\varvec{K}}_\textrm{p}^\mathrm{{ori}}\) and \({\varvec{K}}_\textrm{d}^\mathrm{{ori}}\) proportional gain and derivative gain, usually taking diagonal positive definite matrices. \({\varvec{R}}^\mathrm{{ref}}\), \(\varvec{\omega }^\mathrm{{ref}}\) and \(\dot{\varvec{\omega }}^\mathrm{{ref}}\) are specified by a high level motion planner, \({\varvec{R}}\) and \(\varvec{\omega }\) are computed by forward kinematics with the estimated state, and \(\text {Log}: \text {SO(3)} \rightarrow {\mathbb {R}}^{3}\) is the logarithm map. Some tasks such as Base orientation control task are specified with this form.

(8) Minimum motion task In order to determine all joint states and avoid internal drift, the research also introduces a task spanning the configuration space. For example, the deviation can be penalised from the nominal configuration and velocity with

$$\begin{aligned} \begin{aligned} {\varvec{A}}_\mathrm{{nom}}&= \begin{bmatrix} \begin{bmatrix} {\varvec{0}}&{\varvec{I}} \end{bmatrix}&{\varvec{0}} \end{bmatrix}, \\ {\varvec{b}}_\mathrm{{nom}}&= {\varvec{K}}_\mathrm{{p}}^\mathrm{{nom}}\left( {\varvec{q}}_{n_{\tau }}^\mathrm{{nom}} - {\varvec{q}}_{n_{\tau }}\right) \\&\quad + {\varvec{K}}_\mathrm{{d}}^\mathrm{{nom}}\left( {\varvec{v}}_{n_{\tau }}^\mathrm{{nom}} - {\varvec{v}}_{n_{\tau }}\right) . \end{aligned} \end{aligned}$$

where \({\varvec{q}}_{n_{\tau }}^\mathrm{{nom}}\) and \({\varvec{v}}_{n_{\tau }}^\mathrm{{nom}}\) are the nominal generalized positions and velocities of actuated joints, \({\varvec{K}}_\mathrm{{p}}^\mathrm{{nom}}\) and \({\varvec{K}}_\mathrm{{d}}^\mathrm{{nom}}\) are proportional and derivative gains. It is worth mentioning that these tasks are generally assigned the lowest priority.

3.3 Hierarchical Optimization

The general task Eq. (34) can be formulated as solving a QP problem:

$$\begin{aligned} \begin{aligned} \underset{{\varvec{z}}, {\varvec{s}}_\mathrm{{eq}}, {\varvec{s}}_\mathrm{{in}}}{\text {min}} \quad&\frac{1}{2} \Vert {\varvec{s}}_\mathrm{{eq}} \Vert ^{2} + \frac{1}{2} \Vert {\varvec{s}}_\mathrm{{in}} \Vert ^{2}\\ \mathrm{s.t.} \quad&{\varvec{A}}{\varvec{z}}-{\varvec{b}} = {\varvec{s}}_\mathrm{{eq}} \\&{\varvec{C}}{\varvec{z}}-{\varvec{d}} \leqslant {\varvec{s}}_\mathrm{{in}} \end{aligned}. \end{aligned}$$

The equivalent QP problem is

$$\begin{aligned} \begin{aligned}&\underset{{\varvec{Z}}}{\text {min}}{} & {} \frac{1}{2}{\varvec{Z}}^\mathrm{{T}} {\varvec{P}} {\varvec{Z}} + {\varvec{c}}^\mathrm{{T}} {\varvec{Z}} \\&s.t.{} & {} \bar{{\varvec{C}}} {\varvec{Z}} \leqslant \bar{{\varvec{d}}} \end{aligned}, \end{aligned}$$

where \({\varvec{Z}}=\begin{bmatrix} {\varvec{z}}^\mathrm{{T}}&{\varvec{s}}_\mathrm{{in}}^\mathrm{{T}} \end{bmatrix}^\mathrm{{T}}\) is the equivalent optimization variable, the Hessian matrix \({\varvec{P}}\) and the gradient vector \({\varvec{c}}\) can be calculated from \({\varvec{P}}=\bar{{\varvec{A}}}^\mathrm{{T}}\bar{{\varvec{A}}}\) and \({\varvec{c}}=-\bar{{\varvec{A}}}^\mathrm{{T}}\bar{{\varvec{b}}}\). The expressions of the augmented equality constraint matrix \(\bar{{\varvec{A}}}\), augmented equality constraint vector \(\bar{{\varvec{b}}}\), augmented inequality constraint matrix \(\bar{{\varvec{C}}}\), augmented inequality constraint vector \(\bar{{\varvec{d}}}\) are

$$\begin{aligned} \bar{{\varvec{A}}} = \begin{bmatrix} {\varvec{A}} &{} {\varvec{0}} \\ {\varvec{0}} &{} {\varvec{I}} \end{bmatrix}, \bar{{\varvec{b}}} = \begin{bmatrix} -{\varvec{b}} \\ {\varvec{0}} \end{bmatrix}, \bar{{\varvec{C}}} = \begin{bmatrix} {\varvec{C}}&-{\varvec{I}} \end{bmatrix}, \bar{{\varvec{d}}} = {\varvec{d}}. \end{aligned}$$

For a set of tasks with priorities \({\textbf{T}}_{1}, \ldots , {\textbf{T}}_{n}\), they can be solved under strict prioritization. For the task with priority p, it is defined as:

$$\begin{aligned} {\textbf{T}}_{p}: {\left\{ \begin{array}{ll} {\varvec{A}}_{p} {\varvec{z}}-{\varvec{b}}_{p} = {\varvec{s}}_{\text {eq},p} \\ {\varvec{C}}_{p} {\varvec{z}}-{\varvec{d}}_{p} \leqslant {\varvec{s}}_{\text {in},p} \end{array}\right. }. \end{aligned}$$

The null space projector of this task is defined as:

$$\begin{aligned} {\varvec{N}}_{p} = \text {Null}({\varvec{A}}_{p}) = {\varvec{I}} - {\varvec{A}}_{p}^{\dagger }{\varvec{A}}_{p}, \end{aligned}$$

where \(\{\cdot \}^{\dagger }\) is the singular value decomposition (SVD) based pseudo-inverse. The augmented task Jacobian \(\bar{{\varvec{A}}}_{p}\) is defined as:

$$\begin{aligned} \bar{{\varvec{A}}}_{p} = \begin{bmatrix} {\varvec{A}}^\mathrm{{T}}_{1}&{\varvec{A}}^\mathrm{{T}}_{2}&\cdots&{\varvec{A}}^\mathrm{{T}}_{p} \end{bmatrix}^\mathrm{{T}}. \end{aligned}$$

The null space projector of the augmented task Jacobian \(\bar{{\varvec{A}}}_{p}\) is

$$\begin{aligned} \bar{{\varvec{N}}}_{p} = \text {Null}(\bar{{\varvec{A}}}_{p}) = {\varvec{I}} - \bar{{\varvec{A}}}_{p}^{\dagger }\bar{{\varvec{A}}}_{p}. \end{aligned}$$

An efficient recursive implementation is

$$\begin{aligned} \bar{{\varvec{N}}}_{p} = \bar{{\varvec{N}}}_{p-1} \text {Null}({\varvec{A}}_{p} \bar{{\varvec{N}}}_{p-1}). \end{aligned}$$

Considering tasks \({\textbf{T}}_{1}, {\textbf{T}}_{2}, \ldots , {\textbf{T}}_{p+1}\), the augmented optional solution is

$$\begin{aligned} \bar{{\varvec{z}}}_{p+1}^{*} = \bar{{\varvec{N}}}_{p} {\varvec{z}}_{p+1}^{*} + \bar{{\varvec{z}}}_{p}^{*}, \quad p=1,\ldots , n-1, \end{aligned}$$

where \(\bar{{\varvec{N}}_{1}} = {\varvec{N}}_{1}\) and \(\bar{{\varvec{z}}}_{1} = {\varvec{z}}_{1}\). \({\varvec{z}}_{p+1}^{*}\) is solved from the optimization problem:

$$\begin{aligned} \begin{aligned} \min _{{\varvec{z}}, {\varvec{s}}_\mathrm{{in}}} \quad&\frac{1}{2} \Vert {\varvec{A}}_{p+1}(\bar{{\varvec{N}}}_{p} {\varvec{z}}+\bar{{\varvec{z}}}_{p}^{*}) - {\varvec{b}}_{p+1} \Vert ^{2} + \frac{1}{2} \Vert {\varvec{s}}_\mathrm{{in}} \Vert ^{2} \\ \mathrm{s.t.} \quad&{\varvec{C}}_{p+1} (\bar{{\varvec{N}}}_{p} {\varvec{z}}+\bar{{\varvec{z}}}_{p}^{*}) - {\varvec{d}}_{p+1} \leqslant {\varvec{s}}_\mathrm{{in}} \\&{\varvec{C}}_{p} (\bar{{\varvec{N}}}_{p} {\varvec{z}}+\bar{{\varvec{z}}}_{p}^{*}) - {\varvec{d}}_{p} \leqslant {\varvec{s}}_{\text {in},p}^{*} \\&\qquad \qquad \ldots \\&{\varvec{C}}_{2} (\bar{{\varvec{N}}}_{p} {\varvec{z}}+\bar{{\varvec{z}}}_{p}^{*}) - {\varvec{d}}_{2} \leqslant {\varvec{s}}_{\text {in},2}^{*} \\&{\varvec{C}}_{1} (\bar{{\varvec{N}}}_{p} {\varvec{z}}+\bar{{\varvec{z}}}_{p}^{*}) - {\varvec{d}}_{1} \leqslant {\varvec{s}}_{\text {in},1}^{*} \\ \end{aligned}. \end{aligned}$$

After collecting \({\varvec{z}}\), \({\varvec{s}}_\mathrm{{in}}\) together \({\varvec{Z}}=\begin{bmatrix} {\varvec{z}}^\mathrm{{T}}&{\varvec{s}}_\mathrm{{in}}^\mathrm{{T}} \end{bmatrix}^\mathrm{{T}}\) and stacking inequality constraints to make the notation shorter, similar to the process from Eq. (48) to Eq. (49), the QP problem Eq. (56) becomes:

$$\begin{aligned} \begin{aligned} \min _{{\varvec{Z}}} \quad&\frac{1}{2}{\varvec{Z}}^\mathrm{{T}} {\varvec{P}} {\varvec{Z}} + {\varvec{c}}^\mathrm{{T}} {\varvec{Z}} \\ \mathrm{s.t.} \quad&\begin{bmatrix} {\varvec{C}}_{p+1} \bar{{\varvec{N}}}_{p} &{} -{\varvec{I}} \\ \bar{{\varvec{C}}_{p}} \bar{{\varvec{N}}}_{p} &{} {\varvec{0}} \end{bmatrix} {\varvec{Z}} \leqslant \begin{bmatrix} {\varvec{d}}_{p+1} - {\varvec{C}}_{p+1}\bar{{\varvec{z}}}_{p}^{*} \\ \bar{{\varvec{d}}_{p}} - \bar{{\varvec{C}}_{p}}\bar{{\varvec{z}}}_{p}^{*} + \bar{{\varvec{s}}}_{\text {in},p}^{*} \end{bmatrix} \end{aligned}, \end{aligned}$$

where the Hessian matrix \({\varvec{P}}\) and the gradient \({\varvec{c}}\) can be calculated from \({\varvec{P}}=\bar{{\varvec{A}}}^\mathrm{{T}}\bar{{\varvec{A}}}\) and \({\varvec{c}}=-\bar{{\varvec{A}}}^\mathrm{{T}}{\varvec{b}}\) with:

$$\begin{aligned} \bar{{\varvec{A}}} = \begin{bmatrix} {\varvec{A}}_{p+1} \bar{{\varvec{N}}}_{p} &{} {\varvec{0}} \\ {\varvec{0}} &{} {\varvec{I}} \end{bmatrix}, \bar{{\varvec{b}}} = \begin{bmatrix} -({\varvec{A}}_{p+1}\bar{{\varvec{z}}}_{p}^{*} - {\varvec{b}}_{p+1}) \\ {\varvec{0}} \end{bmatrix}. \end{aligned}$$

3.4 Hybrid Joint Controller

After solving a set of QP problems, the optimal generalized accelerations \(\dot{{\varvec{v}}}^{*}\) can be obtained from Eqs. (56) and (33), and the optimal joint torques \(\varvec{\tau }^{*}\) can be calculated with Eq. (32). The desired joint velocity \({\varvec{v}}_{j}^\mathrm{{des}}\) and position \({\varvec{q}}_{j}^\mathrm{{des}}\) can be computed by adding the joint position and velocity portions to the measured joint velocity \({\varvec{v}}_{j}\) and position \({\varvec{q}}_{j}\) as

$$\begin{aligned} {\varvec{v}}_{j}^\mathrm{{des}}&= {\varvec{v}}_{j} + \dot{{\varvec{v}}}^{*} \Delta t, \end{aligned}$$
$$\begin{aligned} {\varvec{q}}_{j}^\mathrm{{des}}&= {\varvec{q}}_{j} + {\varvec{v}}_{j} \Delta t + \frac{1}{2} \dot{{\varvec{v}}}^{*} \Delta t^2. \end{aligned}$$

The joint action \(\left( {\varvec{q}}_{j}^\mathrm{{des}}, K_{\text {p},j}, {\varvec{v}}_{j}^\mathrm{{des}}, K_{\text {d},j}, \varvec{\tau }^{*}\right)\) is sent the hybrid joint controller to compute motor torque commands as

$$\begin{aligned} \varvec{\tau } = K_{\text {p},j}\left( {\varvec{q}}_{j}^\mathrm{{des}} - {\varvec{q}}_{j}\right) + K_{\text {d},j}\left( \dot{{\varvec{q}}}_{j}^\mathrm{{des}} - \dot{{\varvec{q}}}_{j}\right) + \varvec{\tau }^{*}. \end{aligned}$$

4 Experiment

The proposed planning and control framework is validated through experiments on a quadrupedal robot A1 developed by Unitree Robotics. The robot has 18 degrees of freedom, a height of 0.4 m, and a weight of 14 kg. Four experiments were performed, namely body twisting, sharp turning, multi-gaited locomotion, and disturbance response. A video recording of the experiments is supported by the video submission.

The whole framework including gait generator, motion planner, whole-body controller and state estimator is executed on a dedicated computer with a frequency of 500 Hz. For modelling of kinematics and dynamics, the open-source library Pinocchio [39] is used, which is a C++ implementation of the modern rigid body algorithms for multibody systems [40]. To solve each QP problem numerically, a numerical optimization package OSQP [41] and its C++ interface OSQP-Eigen are used.

All experiments use the same parameters, feedback gains, and task arrangements. The equivalent body property in Eq. (23) and feedback gains in Eq. (27) used in the motion planning module are collected in Tables 1 and 2, respectively. The task hierarchy used in the whole-body control module is summarized in Table 3. The hierarchy first maintains the stability of the robot and then plans movements to achieve task requirements. In detail, satisfying the dynamic constraints and hardware limitations has the highest priority. Then, satisfying the task enforces kinematic contact constraints as a second priority. After that, tasks tracking task-space motion and force references have the third priority. Last, the task of avoiding internal drift and keeping a better-looking posture has the lowest priority.

Table 1 Body Properties
Table 2 Feedback gains
Table 3 Task hierarchy (priority 0 is the highest)

4.1 Body Twisting

To examine the body orientation tracking performance of the proposed framework, the body twisting experiment is conducted. Figure 7 presents video snapshots of the robot performing a body twisting motion. The robot is commanded to rotate its body in x-, y- and z-directions respectively while standing with all feet and its CoM position unchanged. The operator sends angular velocity commands, and then the motion planner calculates the reference body velocity and pose trajectories, which are tracked by a whole-body controller with equations from (6) to (9).

Fig. 7
figure 7

Snapshots of the robot twisting its body. Pictures a and b show the robot twisting around the x-axis, pictures c and d show the robot twisting around the y-axis, and pictures e and f show the robot twisting around the z-axis

Fig. 8
figure 8

Base orientation and angular velocity tracking performance. Graphs a, c and e are orientation tracking in the x-, y-, and z-direction. Graphs b, d and f are angular velocity tracking and these velocities are expressed in the body coordinate. The reference is generated by the motion planner, and the measured data is from the state estimator

Fig. 9
figure 9

Snapshots of the A1 robot making sharp turns while running. Pictures a, b, and c show the robot turning clockwise, and pictures d, e, and f show the robot turning counterclockwise

Figure 8 shows the body orientation reference tracking data. Comparing Fig. 8a, c, e, it can be found that tracking errors in the x-direction are larger than those in the other two directions due to the asymmetry of the robot in the three directions. Also, the robot’s internal frictional resistance in the x-direction is greater, making it more difficult to control. From Fig. 8b, d, f, it can be seen that the angular velocity tracking is more accurate, even when the angular velocity changes rapidly from -2 rad/s to +2 rad/s within 1 s.

The current maximum angle tracking error is about 0.12 rad, which causes by the robot prioritising the stability and limiting the Base orientation control task. It can be improved by adjusting the weight of the Base orientation control task in Table 3 and feedback gains \({\varvec{K}}_\mathrm{{p}}^\mathrm{{ori}}\) and \({\varvec{K}}_\mathrm{{d}}^\mathrm{{ori}}\) in Eq. (46). However, we believe that prioritising the stability of the robot itself is more important in a mission. It is not cost-effective to sacrifice stability and safety for better performance. Also, although the base orientations are presented as Euler angles for intuitive understanding, they are presented as rotation matrices in the planning and control processes, which won’t cause singularity and ambiguity issues.

4.2 Sharp Turning

Fig. 10
figure 10

Sharp turning experiment data. a, b Show the performance of angular velocity and angle in the yaw direction. c shows the actual CoM and foot trajectories

A sharp turning experiment demonstrates the framework’s ability to make the robot turn at a high angular velocity, which is essential when the robot traverses confined and crowded environments. Figure 9 presents video snapshots of the robot performing a sharp turning motion. In the experiment, the robot executes manoeuvres including moving forward, performing an S-Turn, and then moving straight again. During the whole process, the robot runs at a constant forward velocity of 0.8 m/s in the trot-run gait.

Figure 10 presents the sharp turning experiment data. Figure 10a shows a comparison between the commanded yaw velocity and the actual yaw velocity, Fig. 10b shows the reference yaw angle generated by the motion planner and the actual yaw angle extracted by the state estimator. As can be seen from these two sub-pictures, the reference is nicely tracked and the robot can still withstand an instantaneous angular velocity of up to 1.8 rad/s when running fast. Figure 10c shows the top view of the motion in the ground plane, including the xy positions of CoM and all feet extracted from the state estimator.

Fig. 11
figure 11

Snapshots of the robot running with multiple gaits. From top to bottom, the speed of each gait increases respectively

Fig. 12
figure 12

Graph showing the gait type, contact state, velocity and height of the real robot during the multi-gaited locomotion experiment. In the top plot, a coloured region indicates when the foot is in stance phase, while a white space shows that it is in the swing phase. The middle and bottom plots compare the reference values generated by the motion planner and the actual values extracted from the state estimator

Fig. 13
figure 13

Trajectory of the front left foot. The left subplot shows the whole trajectory during the experiment, and the right subplot is a zoomed-in view showing the trajectory of the foot within a gait cycle. \(h_f\) indicated the commanded height of the foot. The red line indicates the reference trajectory planned by the motion planner, and the black line indicates the actual trajectory estimated by a state estimator

Fig. 14
figure 14

Disturbance response experiment. Picture ac shows first trial, and picture df shows second trial. Lateral disturbances are inserted by strongly kicking the robot twice from its left during trotting. In order to keep its balance, the robot step to the side. After the robot regain balance, it moves back to its initial position eventually

Fig. 15
figure 15

CoM velocity and position. The gray area shows the trajectories at the time around a disturbance, where the disturbance force results in a sudden change in the CoM velocity. The red line shows the reference velocity and position, while the black line shows the actual velocity and position

Fig. 16
figure 16

The front-left foot’s contact force and position in the y direction, during the first kicking and its recovery period. The gray area indicates the process of kicking, and the orange area shows the process of walking back to its initial position. The red line shows the reference contact force and foot position from the motion planner, while the cyan line shows the reference force from the whole-body controller, and the black line shows actual position from the state estimator

4.3 Multi-gaited Locomotion

To demonstrate that the planning and control framework can stabilize basic locomotion gaits, the multi-gaited locomotion experiment is presented. A set of gaits including static-walk, trot-walk, trot and trot-run, are tested in this experiment. Figure 11 shows the snapshots of all gaits during the experiment. The description of the gaits are shown below:

  • Static-Walk only one leg is in the swing, and the robot is statically stable at all times.

  • Trot-Walk the robot lifts diagonal legs at the same time at low speed, and exist time all four feet make contact with the ground.

  • Trot the robot lifts diagonal legs at the same time at a higher speed, and only two legs make contact with the ground at any time.

  • Trot-Run the robot lifts diagonal legs at the same time, and exist time when all four feet are in the air.

The difference between trot-walk, trot and trot-run gaits is the longer moment of suspension between each beat.

From the top and middle subplots in Fig. 12, different gaits are selected as the speed changes. This gait selection is consistent with the gait-changing behaviour of quadruped animals. From the bottom subplot in Fig. 12, the height trajectory shows a more dramatic height change as the speed of the gait increase because the longer aerial phase of all feet makes it harder to maintain the height of the body. For the same reason, in the trot-run gait, the aerial phase becomes the longest, where exists time none of the feet is making contact with the ground. This makes the robot unable to maintain body height and causes a lower mean value in the graph.

Figure 13 shows the trajectory of the front left foot in xz plane. From the left subplot, the step length increases and then decreases as its speed increases and decreases, which is enforced by the rule illustrated in Eq. (10). The right subplot shows the process of lifting foot height illustrated in Eqs. (15) and (16).

This experiment shows that the proposed framework is independent of gait type, and can be applied to different gaits. It also shows the stability of the framework between gait changes.

4.4 Disturbance Response

To evaluate the disturbance rejection capability of the control framework, robot A1 gets kicked from the side when it is trotting in place. Figure 14 presents the snapshots of this experiment. It can be seen that the robot can still remain balanced after strong kicks.

Figure 15 shows the changes in CoM velocity and position over time in the presence of external disturbance. These disturbance forces cause a sudden change in the CoM velocity, especially in the y direction, and the increase in y direction velocity results in a side-stepping motion. Since the CoM position control task penalizes deviations from the desired and actual CoM positions, the robot will walk back to its original standing position.

Figure 16 presents the y reaction forces and the y motion for a specific foot (front-left). Two recovery strategies are used when the robot reacts to the unpredictable external disturbance resulting from the first kicking. As shown in Fig. 16a, in the stance phase, the foot adjusts the ground reaction forces to resist the disturbance while satisfying in the constraint of the friction cone. This is realized by optimizing the desired contact forces in the motion planning module illustrated in Sect. 2.4, and tracking these forces via the Contact force regularization task in the whole-body control module. As shown in Fig. 16b, in the swing phase, it reacts to the external disturbance through a side-stepping motion which is realized by adjusting the foothold position according to the actual CoM velocity illustrated in Sect. 2.3. The difference between the reaction force generated by the motion planner and the whole body controller is due to the modelling of the robot being different, where the motion planner uses the 3D-SRBM and the whole body controller uses full-body dynamic model. For the same reason, the foot position planned by the motion planner is different from the one planned by the whole-body controller.

5 Conclusion

A real-time planning and control framework is proposed and deployed in a real robot, which enables the robot to perform various dynamic locomotion and be robust to external disturbance. The existing methods using the same model in the planning and control phases, where simplified models can not ensure physical consistency and full-order models make computationally expensive. To take advantage of both methods, this research employs both the 3D-SRBM and the full-body dynamic model to build a more comprehensive framework suitable for different tasks. The proposed control framework does not have a limitation in gait types and gait transitions. The experiment result proved the practicability of quadruped robots with the proposed framework in open environments and under disturbances. The above advantages are very beneficial to deploy quadruped robots in complex real environment to complete tasks. The immediate next step will be focused on extending the framework to generate base and foot trajectories that guide the motion of the robot over obstacles with a sequence of suitable footholds. To fully explore coupling effects between the base and limbs and provide sufficient flexibility to generate more complex whole-body motions, future work will replace the 3D-SRBM used in the motion planner with a mixed model combining the centroidal dynamics model and the whole-body kinematics model.