With our method, task optimization of robot-assisted dressing (TOORAD), we seek to find a sequence of actions for both the assistive mobile manipulator and the person receiving assistance that is likely to result in successful dressing. Key features of TOORAD are its simulation of cloth-human physics, its simulation of human and robot kinematics, its representations of robot dexterity, its planning of multiple actions for a task, and its planning of actions for both the robot and person using an optimization-based approach. TOORAD is an offline process, suitable for situations in which the garment and person can be modeled beforehand, in which we would like to personalize the actions for the person's capabilities, and in which we would like to configure the robot such that the task succeeds despite variations between models and reality.
We first define the problem addressed by TOORAD and important assumptions it makes, followed by the details of the optimization. We provide examples from our implementation of TOORAD used in our evaluation, which is for the task of pulling the two sleeves of a hospital gown onto a person’s arms.
Problem definition
TOORAD aims to find, within the space of the sequence of all actions that the person being assisted and the assistive robot can perform, \(\mathcal {U}\), “What is a sequence of actions that will result in successful dressing?” Notably, TOORAD considers what humans are capable of doing, rather than what humans typically do. This choice follows from the notion that humans are currently more adaptable than robots.
With this approach, we avoid the challenges in modeling actions a person might take in given circumstances. Instead, we model the person's physical and kinematic capabilities. By taking this approach, we can take advantage of the robot's strengths and capabilities that may differ from those of a human assistant. For example, we can equip the robot with sensors, such as a capacitive distance sensor, that can estimate the distance to the person's arm in a way a human caregiver may find difficult. Additionally, instead of basing the robot actions on those from human caregivers, TOORAD explores strategies for dressing in a simulated environment that might be specific to robotic assistance. Some solutions can be difficult for caregivers or robotics experts to identify without computer assistance, due to the high number of degrees of freedom, the existence of solutions distinct from human practice, and the complexity of human disabilities, which can vary greatly between individuals. The non-anthropomorphic kinematics of the robot presents challenges for approaches based on learning from demonstration or other forms of user control.
We can formulate the problem as optimizing the sequence of actions for the human and robot, \({{\varvec{A}}}_{\text {h}, \text {r}}\), such that
$$\begin{aligned} \begin{aligned} \mathop {\mathrm{arg}\,\mathrm{max}}\limits _{{{\varvec{A}}}_{\text {h}, \text {r}}\in \mathcal {U}}&R_{\text {d}}({{\varvec{A}}}_{\text {h}, \text {r}}) \end{aligned} \end{aligned}$$
(1)
where \(\mathcal {U}\) is the domain of feasible robot and human actions and \(R_{\text {d}}\) is an objective function for dressing that we define in Eq. 3. We show the problem formulation in Table 1.
Table 1 The problem formulation for TOORAD as a single large optimization

A challenge to solving this problem is that the space of all possible human and robot actions is large. To achieve tractability, we apply constraints on the action space that structure how interactions between the person and robot will take place. We constrain this space by limiting the robot's end effector to linear trajectories and alternating between the actions of the person and robot. Additionally, the person and robot base hold still as the robot moves the garment onto the person's body. This constrained search space, \(\mathcal {U}^\text {c}\), simplifies the actions of the robot and the person. These simplified actions are also straightforward to communicate to people who will be using the robot for assistance.
We limit the sequence of actions to this constrained space, such that \({{\varvec{A}}}_{\text {h}, \text {r}}\in \mathcal {U}^\text {c}\), and modify the optimization as shown in Eq. 2. \({{\varvec{A}}}_{\text {h}, \text {r}}\) can then be defined as a sequence of paired actions by the human and robot,
$$\begin{aligned} \begin{aligned} {{\varvec{A}}}_{\text {h}, \text {r}} = \big \{ \{\varvec{a}_{\text {h},1},\varvec{a}_{\text {r},1}\}, \ldots , \{\varvec{a}_{\text {h},N},\varvec{a}_{\text {r},N}\} \big \}, \end{aligned} \end{aligned}$$
(4)
where \(\varvec{a}_{\text {h},i}\) and \(\varvec{a}_{\text {r},i}\) are actions for the human and robot, respectively, for subtask i. We call each pair of actions a subtask, and the pairs are performed in sequence. N is the length of the sequence of actions, the number of subtasks for the dressing task. Within each subtask, the person acts first, then the robot.
TOORAD does not automatically segment the task into subtasks. Instead, we provide a small set of candidate subtask sequences to the optimization. For example, a candidate sequence we used in our evaluation in Sect. 5.2 was to first dress the entire left arm and then dress the entire right arm. Other candidate sequences used included splitting each arm into two subtasks, first dressing the forearm and then dressing the upper arm. Each subtask defines a kinematic trajectory for the robot’s end effector, which holds the garment.
We narrow our definitions of the human and robot actions, based on our limited action space. The assumptions we use to define these actions are described in Sect. 3.2. We model the human's actions as only the pose he/she holds, ignoring the movement required to attain that pose. Therefore, we define the human's actions \(\varvec{a}_{\text {h},i}\) as
$$\begin{aligned} \begin{aligned}&\varvec{a}_{\text {h},i} = {\varvec{c}}_{\text {h},i}, \end{aligned} \end{aligned}$$
(5)
where \({\varvec{c}}_{\text {h},i}\) is a configuration of the human body that the person holds while the robot performs its action for the subtask, i. In our evaluation, the robot pulls sleeves onto a person’s arms with each subtask using \({\varvec{c}}_{\text {h},i} \in \mathbb {R}^4\) (3 DoF at the shoulder, 1 DoF at the elbow) for the relevant arm.
We model the robot’s actions for a subtask i as the pose of the robot’s base, \({\varvec{c}}_{\text {r},i}\), and the trajectory of the robot’s end effector, \({\varvec{t}}_{\text {r},i}\). We ignore the movement of the robot to achieve the base pose. Therefore, we define the robot’s actions, \(\varvec{a}_{\text {r},i}\), as
$$\begin{aligned} \begin{aligned}&\varvec{a}_{\text {r},i} = \{{\varvec{c}}_{\text {r},i}, {\varvec{t}}_{\text {r},i} \}. \end{aligned} \end{aligned}$$
(6)
We hence refer to \({\varvec{c}}_{\text {r},i}\) as the robot configuration, which is a more general term for robot base pose as it may include other degrees of freedom such as the robot’s spine height. In our evaluation, we used \({\varvec{c}}_{\text {r},i} \in \mathbb {R}^4\) (3 DoF for the robot base pose, 1 DoF for the robot spine height). The robot action is taken in sequence: first attaining the robot configuration and then executing the end effector trajectory.
We can substitute these definitions of the human and robot actions into Eq. 4, to obtain
$$\begin{aligned} \begin{aligned} {{\varvec{A}}}_{\text {h}, \text {r}} = \big \{ \{{\varvec{c}}_{h,1},{\varvec{c}}_{r,1},{\varvec{t}}_{r,1}\}, \ldots , \{{\varvec{c}}_{h,N},{\varvec{c}}_{r,N},{\varvec{t}}_{r,N}\} \big \}. \end{aligned} \end{aligned}$$
(7)
As before, the actions within each subtask occur in order: first the person attains his or her configuration, then the robot attains its configuration, and then the robot executes its end effector trajectory.
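The action-sequence representation in Eq. 7 can be sketched as a small data structure. This is our own illustrative encoding, not code from the paper; the field names and example values are assumptions, with the human configuration in \(\mathbb {R}^4\) (shoulder plus elbow) and the robot configuration in \(\mathbb {R}^4\) (base pose plus spine height) as used in the evaluation.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class SubtaskActions:
    # One subtask: the person holds c_h,i, the robot attains c_r,i,
    # then the robot executes the end-effector trajectory t_r,i.
    human_config: Tuple[float, ...]   # c_h,i: e.g. 3 DoF shoulder + 1 DoF elbow
    robot_config: Tuple[float, ...]   # c_r,i: e.g. base x, y, yaw + spine height
    ee_trajectory: List[Tuple[float, float, float]]  # t_r,i: waypoints in meters

@dataclass
class DressingPlan:
    # The full sequence A_h,r of Eq. 7.
    subtasks: List[SubtaskActions] = field(default_factory=list)

    def __len__(self):
        return len(self.subtasks)  # N, the number of subtasks

# Toy single-subtask plan with placeholder numbers.
plan = DressingPlan([
    SubtaskActions((0.2, 0.1, 0.0, 1.2), (0.5, -0.3, 1.57, 0.25),
                   [(0.1, 0.0, 0.1), (-0.03, 0.0, 0.1)]),
])
```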
Table 2 The optimization performed by TOORAD is split into three levels

The goal of TOORAD is to find \({{\varvec{A}}}_{\text {h}, \text {r}}\), as defined in Eq. 7, that maximizes the value of the dressing objective function, \(R_{\text {d}}\), as defined in Eq. 3. We discuss how we formulate the optimization architecture to maximize \(R_{\text {d}}\) in Sect. 3.3. We define and discuss the details of \(R_{\text {d}}\) in Sects. 3.6, 3.7, and 3.8.
Assumptions
We make five key simplifying assumptions in addressing the problem described above. This list is not exhaustive, but highlights assumptions important to understanding the work. First, we assume that the robot is able to grasp and re-grasp the cloth, which we achieve during experiments through human intervention and special tools. We do not explore dexterous manipulation and grasping of cloth by the robot, as this problem has its own challenges and is being addressed by others (Fahantidis et al. 1997; Osawa et al. 2007; Maitin-Shepard et al. 2010; Miller et al. 2012; Bersch et al. 2011; Berenson 2013). Second, we assume that the participant receiving assistance is collaborative. That is, the participant will move his/her body to the extent of his/her ability in support of the task. Third, we assume the robot can estimate the pose of the participant's body prior to dressing. Fourth, we assume that the person and robot can achieve the desired configurations with negligible effect on the task. Fifth, we assume that the human is able to hold his or her pose for the entire duration of the subtask.
Optimization architecture
We organize the optimization into three levels, shown in Table 2. We use layers of optimization (Sergeyev and Grishagin 2001; Anandalingam and Friesz 1992; Migdalas et al. 2013) to reduce computational cost. This structure reduces computation time compared to a single-layer, joint optimization of all parameters by decreasing the number of times the higher levels of optimization must be run and by more rapidly pruning areas of the search space. This architecture takes advantage of dependencies between the human configuration, robot configuration, and trajectory that we intend to optimize. The robot configuration and the trajectory of the robot's end effector depend on the configuration of the human. Both human and robot configurations and the robot trajectory depend on the subtask being optimized. The top-level optimization is used to plan the optimal sequence of subtasks. The mid-level optimization is used to plan the optimal human configuration. The lower-level optimization is used to plan the optimal robot configuration.
An implication of this structure is that the nested optimizations heavily influence the computational requirements. Instead of optimizing the trajectory of the robot’s end effector in a low-level optimization, we use previously selected trajectory policies for each subtask. The policy returns a kinematic trajectory of the robot’s end effector given the human configuration. Trajectory optimization for dressing is complicated, as it involves complex physical interactions between the garment and the person’s body, which can be computationally expensive to simulate. We trade accuracy in exchange for a policy that can rapidly be applied to many human configurations; the trajectory policy only gives a coarse approximation of the optimal trajectory.
Optimization algorithms
We provide TOORAD a small set of candidate subtask sequences to optimize, and we use an exhaustive search to consider all options in the set. In this case, exhaustive search is reasonable because the search space is small, with only a few candidate subtask sequences that are chosen manually. It is an open question how to better create candidate subtasks and search through them. The potential to develop more efficient computational methods for this type of problem is an interesting direction for future research but may not be strictly necessary to benefit people with disabilities.
The space of the human and robot optimization objective functions can be highly nonlinear and challenging to search, and the parameter space is large. These objective functions do not have an analytical gradient and estimating their gradients can be computationally expensive. Joint limits, model collisions, and cloth physics are constraints in the objective function that are highly non-convex and non-differentiable. We perform the optimizations of the human and robot configurations for each subtask using covariance matrix adaptation evolution strategy (CMA-ES) (CMA 2018; Hansen 2006), which works well for derivative-free, simulation-based, local optimization. We have observed that CMA-ES often performs better for local optimization than global optimization, and that starting with a good initialization often improves its performance.
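The derivative-free, population-based search the paper performs with CMA-ES can be illustrated with a stripped-down evolution-strategy loop. This toy version keeps only sampling, truncation selection, and a geometric step-size decay; a real implementation would use Hansen's `cma` package, which additionally adapts a full covariance matrix. The quadratic objective stands in for the simulation-based dressing objectives, which have no analytical gradient.

```python
import numpy as np

def simple_es(f, x0, sigma=0.5, popsize=16, iters=80, decay=0.93, seed=0):
    """Minimize f by sampling around a mean and recombining the best quarter.
    A toy stand-in for CMA-ES: no covariance adaptation, fixed decay schedule."""
    rng = np.random.default_rng(seed)
    mean = np.asarray(x0, dtype=float)
    for _ in range(iters):
        # Sample a population around the current mean.
        samples = mean + sigma * rng.standard_normal((popsize, mean.size))
        scores = np.array([f(x) for x in samples])
        # Truncation selection: recombine the best quarter into the new mean.
        mean = samples[np.argsort(scores)[:popsize // 4]].mean(axis=0)
        sigma *= decay  # crude step-size schedule (CMA-ES adapts this online)
    return mean

# Stand-in objective: squared distance from a "good" 4-DoF human configuration.
target = np.array([0.3, -0.1, 0.4, 1.2])
best = simple_es(lambda x: np.sum((x - target) ** 2), np.zeros(4))
```

As the text notes, starting from a good initialization matters: this kind of local search converges to the nearest basin of attraction.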
Selecting candidate trajectory policies
We manually defined trajectory policies for each dressing subtask. A rectangular tool holding the sleeve pulls the sleeve first along the forearm, then along the upper arm, and finally moves to the top of the shoulder. Each linear trajectory is defined with respect to a coordinate frame at the base of the link being dressed, with its X-axis along the axis of the link and the Y-axis parallel to the ground plane. Figure 2 shows these axes overlaid on a diagram of the arm. The waypoints, written as (x, y, z) in meters, were (0.1, 0.0, 0.1), \((-\,0.03, 0.0, 0.1)\), \((-\,0.05, 0.0, 0.1)\), and (0.0, 0.0, 0.1), with respect to the hand, forearm, upper arm, and shoulder, respectively. The policy is fixed for each subtask. Policies for different subtasks are created using these waypoints. For example, dressing the whole arm would consist of moving through all four waypoints and dressing the forearm would consist of moving through the hand and forearm waypoints.
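The waypoint policy above can be sketched directly as code: the waypoints and their link frames are the paper's, while the function for selecting a subtask's prefix of waypoints is our own illustrative wrapper.

```python
# The manually defined waypoints, each expressed in the frame at the base of
# the named link (X along the link axis, Y parallel to the ground), in meters.
ARM_WAYPOINTS = [
    ("hand",      (0.10, 0.0, 0.1)),
    ("forearm",   (-0.03, 0.0, 0.1)),
    ("upper_arm", (-0.05, 0.0, 0.1)),
    ("shoulder",  (0.00, 0.0, 0.1)),
]

def trajectory_for_subtask(links):
    """Return the (frame, waypoint) pairs for the links a subtask dresses,
    preserving the hand-to-shoulder order."""
    return [(name, wp) for name, wp in ARM_WAYPOINTS if name in links]

# Dressing the whole arm uses all four waypoints; dressing only the forearm
# uses the hand and forearm waypoints.
whole_arm = trajectory_for_subtask({"hand", "forearm", "upper_arm", "shoulder"})
forearm_only = trajectory_for_subtask({"hand", "forearm"})
```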
Using a simulator of cloth-person physics, we verified that this policy succeeds in simulation for many configurations of the arm. For example, Fig. 3 shows the results in simulation of attempting different trajectories for pulling a sleeve onto a person’s forearm for different angles of the forearm. The figure also shows what a successful outcome looks like in the simulator as well as what it looks like when the sleeve catches on the arm. The center of mass of the successful trials for angles between \(30^{\circ }\) and \(-\,30^{\circ }\) is consistently near 10 cm above the axis of the forearm. This result supports our selected policy. When the forearm deviates further from horizontal, deformation of the sleeve’s opening becomes problematic.
The policy \(\pi _i\) outputs a kinematic trajectory of the robot's end effector, \({\varvec{t}}_{\text {r},i}\), for pulling the sleeve onto the person's body for subtask i for any human configuration in \(\mathcal {H}_i\), the set of human configurations, \({\varvec{c}}_{\text {h},i}\), for which the trajectory policy succeeds. We write the trajectory as
$$\begin{aligned} \begin{aligned} {\varvec{t}}_{\text {r},i} = \pi _i({\varvec{c}}_{\text {h},i})&\forall {\varvec{c}}_{\text {h},i} \in \mathcal {H}_i. \end{aligned} \end{aligned}$$
(11)
TOORAD uses a fixed-radius neighbors model to quickly estimate whether a human configuration, \({\varvec{c}}_{\text {h},i}\), lies within \(\mathcal {H}_i\).
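The fixed-radius neighbors test can be sketched as follows: a query configuration counts as inside \(\mathcal {H}_i\) if it lies within some radius (in joint space) of any configuration that succeeded in simulation. The class name, sample configurations, and radius here are illustrative assumptions, not values from the paper.

```python
import numpy as np

class FixedRadiusMembership:
    """Approximate membership in H_i via fixed-radius nearest neighbors."""

    def __init__(self, successful_configs, radius):
        # Configurations where the trajectory policy succeeded in simulation.
        self.points = np.asarray(successful_configs, dtype=float)
        self.radius = radius

    def contains(self, config):
        # Inside H_i if within `radius` of any successful configuration.
        d = np.linalg.norm(self.points - np.asarray(config, dtype=float), axis=1)
        return bool(np.any(d <= self.radius))

# Toy 4-DoF arm configurations (3 shoulder + 1 elbow angles, radians).
H_i = FixedRadiusMembership([[0.0, 0.0, 0.0, 1.0],
                             [0.2, 0.0, 0.1, 1.2]], radius=0.15)
```

For large sample sets, a k-d tree (e.g. `scipy.spatial.cKDTree.query_ball_point`) would replace the brute-force distance computation.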
Determining the human configuration space of the policy
Using both geometric and physics simulation, TOORAD estimates the space, \(\mathcal {H}_i\), of human configurations, \({\varvec{c}}_{\text {h},i}\), where the trajectory policy succeeds. It verifies that the following criteria hold true:
- The person's body is not in self-collision.
- The robot's tool does not collide with the person.
- The trajectory is successful in the cloth-person physics simulator.
If all criteria are satisfied, then we consider \({\varvec{c}}_{\text {h},i} \in \mathcal {H}_i\). Otherwise, \({\varvec{c}}_{\text {h},i} \notin \mathcal {H}_i\). The simulations used to determine the human configuration space where the policy is successful are shown in Fig. 4. The space \(\mathcal {H}_i\) is estimated once for each subtask using a generic human model based on a 50th-percentile male from Tilley (2002) and a generic cylindrical sleeve of similar dimensions to a hospital gown sleeve.
Constraints
The optimization of the human and robot configurations uses personalized models of the person, wheelchair (when applicable), and garment. Using a geometric simulator and a cloth-person physics simulator along with these models, TOORAD enforces six constraints in its optimization. These constraints are:
- Stretching limits of the garment.
- The person's range of motion.
- The person's body is not in self-collision.
- The robot does not collide with the person, wheelchair, or garment.
- The garment does not experience interference from the wheelchair.
- \({\varvec{c}}_{\text {h},i}\in \mathcal {H}_i\).
As the dressing task proceeds and subtasks are completed, TOORAD adds stretching constraints. These constraints are based on measurements of the real garment. For example, the maximum stretch between the two shoulders of the hospital gown we used in our evaluation was 0.5 m. Once the person’s left arm is in the sleeve of the gown, we assumed the left sleeve would remain in place. Therefore, for the subtasks that pull on the right sleeve, the right sleeve can move at most 0.5 m from the top of the person’s left shoulder. These stretching constraints model dependencies between subtasks.
We modeled a person’s range of motion by placing limits on the range of joint angles on a joint-by-joint basis using a 3-axis Euler-angle-joint at the shoulder and a single axis joint at the elbow. Some disabilities were easier to model with other constraints. For example, a range of motion constraint we used for one participant was that the participant’s upper arm could not be raised above parallel to the ground. We personalized TOORAD’s modeled constraints for the participants’ motor impairments described in Sect. 5.2.
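The range-of-motion model above can be sketched as a per-joint limit check plus optional extra predicates for impairments that are easier to express directly. The limit values, function names, and the exact encoding of "upper arm not above parallel to the ground" are our own illustrative assumptions.

```python
import math

# Placeholder joint limits (radians) for the 3-axis Euler-angle shoulder
# joint followed by the single-axis elbow joint.
JOINT_LIMITS = [(-1.5, 3.1), (-0.5, 2.8), (-1.4, 1.4), (0.0, 2.4)]

def within_rom(config, extra_constraints=()):
    """True if every joint angle is within its limits and every extra
    impairment-specific predicate is satisfied."""
    in_limits = all(lo <= q <= hi for q, (lo, hi) in zip(config, JOINT_LIMITS))
    return in_limits and all(c(config) for c in extra_constraints)

# Example impairment: the upper arm may not be raised above parallel to the
# ground, encoded here (as an assumption) as shoulder elevation <= pi/2.
no_raise = lambda c: c[0] <= math.pi / 2

ok = within_rom([0.3, 0.2, 0.0, 1.0], (no_raise,))    # within limits and predicate
bad = within_rom([2.0, 0.2, 0.0, 1.0], (no_raise,))   # violates the predicate
```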
Subtask optimization
For the candidate sequences of subtasks we provide, TOORAD performs the optimization in Eq. 8. The optimization is given a set \(\varvec{\varPi }\) of sequences of trajectory policies \(\varvec{\pi }=\{\pi _1,\ldots ,\pi _N\}\). The top-level optimization uses an exhaustive search to plan the best sequence of trajectory policies. We define \(C_{\text {n}}\) as a cost for each additional subtask,
$$\begin{aligned} \begin{aligned}&C_{\text {n}}(N) = N. \end{aligned} \end{aligned}$$
(12)
The function F is the objective function for the mid-level optimization of the human configuration shown in Eq. 9. We manually selected \(\psi = 0.1\) to be small, \(\sim 100 \times \) smaller than the values of F for human configurations that result in successful dressing.
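The top-level exhaustive search can be sketched as follows. Since Eq. 8 is not reproduced in this excerpt, we assume as a simplification that each candidate sequence is scored as the sum of the mid-level objective F over its subtasks minus the subtask-count penalty \(\psi \, C_{\text {n}}(N)\); `evaluate_subtask` and the toy scores stand in for the nested human/robot optimization.

```python
PSI = 0.1  # the paper's manually selected, deliberately small weight

def C_n(n):
    return n  # Eq. 12: unit cost per additional subtask

def best_sequence(candidates, evaluate_subtask):
    """Exhaustive search over the small set of candidate subtask sequences."""
    best, best_score = None, float("-inf")
    for seq in candidates:
        score = sum(evaluate_subtask(s) for s in seq) - PSI * C_n(len(seq))
        if score > best_score:
            best, best_score = seq, score
    return best, best_score

# Candidate sequences like those in the evaluation: whole arms, or each arm
# split into forearm and upper-arm subtasks.
candidates = [("left_arm", "right_arm"),
              ("left_forearm", "left_upper_arm",
               "right_forearm", "right_upper_arm")]

# Toy per-subtask values standing in for the mid-level objective F.
F_scores = {"left_arm": 8.0, "right_arm": 7.5, "left_forearm": 4.5,
            "left_upper_arm": 4.0, "right_forearm": 4.2,
            "right_upper_arm": 3.9}

seq, score = best_sequence(candidates, F_scores.get)
```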
Human optimization
Our formulation of F allows TOORAD to simultaneously consider the comfort of the person, the capabilities of the person, the kinematics of the robot, and the physics involved in manipulating garments onto a person's body. We formulate F in Eq. 9 as a linear combination of a few terms that we will define.
\(C_{t}\) is the cost on torque due to gravity experienced at the person's shoulder in the configuration \({\varvec{c}}_{\text {h},i}\), \(C_{s}\) is the cost on exceeding a soft constraint on stretching the garment, and \(R_{\text {r}}\) is the score of how well the robot can execute the trajectory (see Sect. 3.8). \(C_{t}\) is normalized to 0–1. \(C_{s}\) has units of distance with a maximum value of 0.04 m. \(R_{\text {r}}\) is unitless and ranges from 0 to 11. With these ranges, we manually selected 0.5, 5.0, and 1.0 for \(\zeta \), \(\eta \), and \(\gamma \), respectively. These gains emphasize the functionality of the configuration, with more emphasis placed on the value of \(R_{\text {r}}\) while also considering the other costs.
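Under the stated term ranges and gains, the linear combination can be sketched as below. The gains are the paper's; the sign convention (costs subtracted, reward added) is our assumption from the text, since Eq. 9 itself is not reproduced in this excerpt.

```python
# The paper's manually selected gains for C_t, C_s, and R_r respectively.
ZETA, ETA, GAMMA = 0.5, 5.0, 1.0

def F(c_torque, c_stretch, r_robot):
    """Mid-level objective: assumed form reward-minus-weighted-costs.
    c_torque in [0, 1]; c_stretch in meters, [0, 0.04]; r_robot in [0, 11]."""
    return GAMMA * r_robot - ZETA * c_torque - ETA * c_stretch

# A toy evaluation with placeholder term values.
score = F(c_torque=0.4, c_stretch=0.02, r_robot=9.0)
```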
Cost on torque
To encourage human arm configurations that are more comfortable for the person, we have included a cost on the torque at the person's shoulder due to gravity. This cost ignores torques due to external forces, such as from the garment, robot, or wheelchair. We divide the torque at the current configuration by the maximum possible torque to create a normalized torque cost. The maximum torque occurs when the arm is straight and parallel to the ground. This torque cost is independent of the weight and scale of the person's arm. We use the body mass values for an average male. The cost on torque follows as
$$\begin{aligned} \begin{aligned}&C_{t}({\varvec{c}}_{\text {h},i}) = \frac{\text {torque}({\varvec{c}}_{\text {h},i})}{\text {maximum possible torque}} \end{aligned} \end{aligned}$$
(13)
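Eq. 13 can be sketched for a planar simplification of the arm: gravity torque from the upper-arm and forearm masses about the shoulder, divided by the torque of the straight, horizontal arm. The segment masses and lengths are illustrative placeholder values, not the paper's average-male tables, and the planar sagittal model is our own simplification.

```python
import math

M_UPPER, M_FORE = 2.0, 1.5     # segment masses (kg), placeholder values
L_UPPER, L_FORE = 0.30, 0.28   # segment lengths (m), placeholder values
G = 9.81

def shoulder_torque(shoulder_elev, elbow_flex):
    """Gravity torque at the shoulder for a planar two-link arm.
    Angles in radians: 0 = segment horizontal; elbow_flex = 0 = straight arm."""
    # Moment arms are horizontal distances of segment centers from the shoulder.
    upper_arm_cm = 0.5 * L_UPPER * math.cos(shoulder_elev)
    elbow_x = L_UPPER * math.cos(shoulder_elev)
    fore_cm = elbow_x + 0.5 * L_FORE * math.cos(shoulder_elev + elbow_flex)
    return G * (M_UPPER * upper_arm_cm + M_FORE * fore_cm)

# Maximum possible torque: arm straight and parallel to the ground.
MAX_TORQUE = shoulder_torque(0.0, 0.0)

def C_t(shoulder_elev, elbow_flex):
    """Normalized torque cost (Eq. 13), in [0, 1] for a lowered-or-level arm."""
    return shoulder_torque(shoulder_elev, elbow_flex) / MAX_TORQUE
```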
Cost on stretching the garment
In addition to the hard constraint on stretching the garment described in Sect. 3.5, we added a soft boundary cost to encourage human configurations for which the garment is stretched slightly less. For some small distance before the hard constraint, the objective function receives a penalty defined as
$$\begin{aligned} \begin{aligned}&C_{s}({\varvec{c}}_{\text {h},i}, \pi _i({\varvec{c}}_{\text {h},i})) = d_{\text {exceeded}}, \end{aligned} \end{aligned}$$
(14)
where \(d_{\text {exceeded}}\) is the maximum amount by which the soft boundary has been exceeded for all points along the kinematic trajectory of the robot’s end effector. This value comes from the kinematics simulator based on the human configuration and the end effector trajectory from the trajectory policy. For our hospital gown, based on its physical limits, we set a hard constraint on the maximum distance between the two shoulders of the gown to 0.5 m. We set a soft constraint of 0.46 m, resulting in a maximum value for \(C_{s}\) of 0.04.
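The soft stretching cost can be sketched directly from Eq. 14 and the stated limits: the penalty is the worst exceedance of the soft limit over the trajectory's waypoints, which the hard constraint caps at 0.5 − 0.46 = 0.04 m. The limits are the paper's; the per-waypoint stretch values in the example are illustrative.

```python
SOFT_LIMIT, HARD_LIMIT = 0.46, 0.50  # shoulder-to-shoulder stretch limits (m)

def C_s(stretch_per_waypoint):
    """Maximum exceedance of the soft limit over all trajectory waypoints.
    Returns 0 when no waypoint stretches the garment past the soft limit."""
    return max(0.0, max(stretch_per_waypoint) - SOFT_LIMIT)

# Toy trajectory: the worst waypoint exceeds the soft limit by 0.02 m.
penalty = C_s([0.40, 0.45, 0.48])
```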
Robot optimization
The reward function \(R_{\text {r}}\) estimates how well the robot can execute the trajectory for the given human configuration. TOORAD calculates it using the lower-level optimization of the robot configuration, \({\varvec{c}}_{\text {r},i}\), shown in Eq. 10. This optimization is based on Kapusta and Kemp (2019) with modifications relevant to dressing tasks. The objective function uses two measures that we have developed to estimate how well the robot will be able to perform the dressing task: \(R_{\text {reach}}\), which is based on TC-reachability, and \(R_{\text {manip}}\), which is based on TC-manipulability (from Kapusta and Kemp (2019)). These two terms are related to task-centric (TC) robot dexterity.
Robot dexterity measures
TC-reachability and TC-manipulability are measures that consider the average values of reachability and manipulability across all goals in a task, without order. Because dressing requires the robot to move through the goals in its trajectory in a specific order, we have added a graph-based search to the calculation of these terms to approximate this order and structure. Goal poses for the robot are determined by sampling the trajectory of the robot's end effector. For each goal pose, TOORAD uses OpenRAVE's ikfast (OpenRAVE 2018) to find a sample of collision-free IK solutions. These IK solutions are found for the PR2's 7-DoF arm by creating an analytical IK solver for discretized values of one of its joints. We used the robot's forearm roll joint, discretized at 0.1-radian intervals. An IK sample is obtained for each discretized value of the forearm roll joint.
From these IK solutions for each goal pose, TOORAD creates a layered directed acyclic graph where a layer is created for each goal pose and each node is a joint configuration of the robot arm. Nodes in a layer are connected to nodes in the next layer if the maximum difference between each robot joint is less than a threshold (we used \(40^{\circ }\)). This threshold is used to predict that a path exists between the two joint configurations without the time cost of running a motion planner. To facilitate use of a standard graph search algorithm, we add a start node connected to each node in the first layer (IK solutions for the first goal pose in the trajectory) and an end node connected to each node in the last layer. Once it creates the graph, TOORAD performs a uniform cost search (Dijkstra 1959) to find a path with maximal JLWKI.
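The layered-graph construction and search can be sketched as follows: one layer of IK solutions per goal pose, edges between consecutive layers when no joint differs by more than the threshold, and a uniform-cost (Dijkstra) search that maximizes the summed per-node score (JLWKI in the paper) by minimizing (1 − score) per node, which is valid since JLWKI lies in [0, 1]. For brevity this sketch handles only the full start-to-end path, not the paper's longest-partial-path fallback, and the toy score function is an assumption.

```python
import heapq

JOINT_DIFF_THRESHOLD = 0.7  # ~40 degrees in radians

def connected(q_a, q_b):
    """Edge test: no joint differs by more than the threshold."""
    return max(abs(a - b) for a, b in zip(q_a, q_b)) < JOINT_DIFF_THRESHOLD

def best_path(layers, score):
    """layers: list of lists of joint configurations (one list per goal pose).
    score(q) in [0, 1]. Returns per-layer node indices of the best full path."""
    # Frontier entries: (cost so far, layer index, node index, path).
    frontier = [(0.0, -1, None, [])]  # virtual start node before layer 0
    seen = set()
    while frontier:
        cost, layer, node, path = heapq.heappop(frontier)
        if (layer, node) in seen:
            continue
        seen.add((layer, node))
        if layer == len(layers) - 1:
            return path  # reached the last layer: full path found
        for j, q in enumerate(layers[layer + 1]):
            if layer < 0 or connected(layers[layer][node], q):
                heapq.heappush(frontier,
                               (cost + (1.0 - score(q)), layer + 1, j, path + [j]))
    return None  # no full path (the paper falls back to the longest partial path)

# Toy example: two goal poses, two 2-DoF IK solutions each. Only the first
# solution of layer 1 is reachable from layer 0 within the joint threshold.
layers = [[(0.0, 0.0), (0.5, 0.5)], [(0.1, 0.1), (2.0, 2.0)]]
path = best_path(layers, score=lambda q: 1.0 - 0.1 * q[0])
```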
Joint-limit-weighted Kinematic Isotropy (JLWKI)
JLWKI is presented in Kapusta and Kemp (2019) and is based on kinematic isotropy, \(\varDelta ({{\varvec{q}}})\), from Kim and Khosla (1991), shown in Eq. 15.
$$\begin{aligned} \begin{aligned}&\varDelta ({{\varvec{q}}}) = \frac{\root a \of {\text {det}({\varvec{J}}({{\varvec{q}}}){\varvec{J}}({{\varvec{q}}})^T)}}{\left( \frac{\text {trace}({\varvec{J}}({{\varvec{q}}}){\varvec{J}}({{\varvec{q}}})^T)}{a}\right) } \end{aligned} \end{aligned}$$
(15)
Geometrically, kinematic isotropy is proportional to the volume of the manipulability ellipsoid of the manipulator, which is the volume of Cartesian space moved by the end effector for a unit ball of movement by the arm’s joints. It is based on manipulability from Yoshikawa (1984), with a modification to remove order dependency and scale dependency. This metric can be useful when assessing kinematic dexterity between different configurations of the same robot. The values of kinematic isotropy are always in the range of 0 to 1 so they can be more directly compared across robot platforms.
JLWKI modifies kinematic isotropy to consider joint limits by scaling the manipulator’s Jacobian by an \(n \times n\) diagonal joint-limit-weighting matrix \({\varvec{T}}\), defined in Eq. 16, where n is the number of joints of the manipulator.
$$\begin{aligned} \begin{aligned}&{\varvec{T}}({{\varvec{q}}}) = \begin{bmatrix} t_1 & 0 & 0 \\ 0 & \ddots & 0 \\ 0 & 0 & t_n \end{bmatrix} \end{aligned} \end{aligned}$$
(16)
\(t_i\) in \({\varvec{T}}\) is defined as
$$\begin{aligned} \begin{aligned}&t_i = 1-\phi ^{\kappa } \end{aligned} \end{aligned}$$
(17)
where
$$\begin{aligned} \begin{aligned}&\kappa = \frac{q^{r}_{i} -|q^{r}_{i} - q_{i} + q^{-}_{i}|}{\lambda q^{r}_{i}}+1 \end{aligned} \end{aligned}$$
(18)
and
$$\begin{aligned} \begin{aligned}&q^{r}_{i} = \frac{1}{2}(q^{+}_{i} - q^{-}_{i}). \end{aligned} \end{aligned}$$
(19)
We set \(t_i=1\) for infinite roll joints. The variable \(\phi \) is a scalar that determines the maximum penalty incurred when joint \(q_i\) approaches its maximum and minimum joint limits, \(q^{+}_{i}\) or \(q^{-}_{i}\), and \(\lambda \) determines the shape of the penalty function. We used a value of 0.5 for \(\phi \) and 0.05 for \(\lambda \). This weighting function and the values for \(\phi \) and \(\lambda \) were selected to halve the value of the kinematic isotropy at joint limits, to have little effect in the center of the joint range, to begin exponentially penalizing joint values beyond 75% of the range, and to operate as a function of the percentage of the joint range. JLWKI is then defined as
$$\begin{aligned} \begin{aligned} \text {JLWKI}({{\varvec{q}}}) = \frac{\root a \of {\text {det}({\varvec{J}}({{\varvec{q}}}){\varvec{T}}({{\varvec{q}}}) {\varvec{J}}({{\varvec{q}}})^T)}}{(\frac{1}{a})\text {trace}({\varvec{J}}({{\varvec{q}}}){\varvec{T}}({{\varvec{q}}}){\varvec{J}}({{\varvec{q}}})^T)}. \end{aligned} \end{aligned}$$
(20)
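Eqs. 16-20 can be sketched in numpy as below, using the paper's \(\phi = 0.5\) and \(\lambda = 0.05\). The Jacobian in the example is a toy stand-in, not a PR2 arm Jacobian; a denotes the task-space dimension (rows of J) and n the number of joints.

```python
import numpy as np

PHI, LAM = 0.5, 0.05  # the paper's penalty magnitude and shape parameters

def limit_weights(q, q_min, q_max):
    """Diagonal entries t_i of the joint-limit weighting matrix (Eqs. 17-19).
    ~1 at the center of each joint range, 0.5 at a joint limit."""
    q, q_min, q_max = map(np.asarray, (q, q_min, q_max))
    q_r = 0.5 * (q_max - q_min)                                  # Eq. 19
    kappa = (q_r - np.abs(q_r - q + q_min)) / (LAM * q_r) + 1.0  # Eq. 18
    return 1.0 - PHI ** kappa                                    # Eq. 17

def jlwki(J, q, q_min, q_max):
    """Joint-limit-weighted kinematic isotropy (Eq. 20) for an a x n Jacobian."""
    J = np.asarray(J, dtype=float)
    a = J.shape[0]
    T = np.diag(limit_weights(q, q_min, q_max))  # Eq. 16
    JTJt = J @ T @ J.T
    det = max(np.linalg.det(JTJt), 0.0)          # clamp tiny negative round-off
    return det ** (1.0 / a) / (np.trace(JTJt) / a)
```

With an isotropic Jacobian at the center of the joint ranges the value is 1; anisotropy or proximity to joint limits lowers it.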
Scoring metrics
The graph-based search returns either a path from start to end or the longest path achievable towards the end. The metric \(R_{\text {reach}}\) is the fraction of goal poses along the returned path to which the robot can find a collision-free IK solution from the robot configuration, \({\varvec{c}}_{\text {r},i}\). \(R_{\text {reach}}\) is defined as
$$\begin{aligned} \begin{aligned} R_{\text {reach}}({\varvec{c}}_{\text {h},i}, {\varvec{c}}_{\text {r},i},\pi _i({\varvec{c}}_{\text {h},i})) = \frac{N_p}{N_{\text {total}}}, \end{aligned} \end{aligned}$$
(21)
where \(N_p\) is the number of nodes in the path and \(N_{\text {total}}\) is the total number of goal poses. \(R_{\text {manip}}\) is then defined as
$$\begin{aligned} \begin{aligned} R_{\text {manip}}({\varvec{c}}_{\text {h},i}, {\varvec{c}}_{\text {r},i},\pi _i({\varvec{c}}_{\text {h},i})) = \left( \frac{1}{N_{\text {total}}}\right) \sum \limits _{j=1}^{N_p} \text {JLWKI}(\varvec{p}_j), \end{aligned} \end{aligned}$$
(22)
where \(\varvec{p}_j\) is the IK solution for the jth goal pose.
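The two scoring terms reduce to short functions over the path returned by the graph search. This sketch is our own: `jlwki_of` stands in for evaluating Eq. 20 at each IK solution, and the example path and scores are illustrative.

```python
def r_reach(path, n_total):
    """Eq. 21: fraction of goal poses covered by the returned path."""
    return len(path) / n_total

def r_manip(path, n_total, jlwki_of):
    """Eq. 22: JLWKI summed over the path's IK solutions,
    normalized by the total number of goal poses."""
    return sum(jlwki_of(p) for p in path) / n_total

# Toy example: 3 of 4 goal poses reached, with stand-in JLWKI values.
scores = {"q1": 0.8, "q2": 0.6, "q3": 0.7}
reach = r_reach(["q1", "q2", "q3"], n_total=4)
manip = r_manip(["q1", "q2", "q3"], n_total=4, jlwki_of=scores.get)
```

Note that both terms are normalized by \(N_{\text {total}}\), so an incomplete path is penalized in \(R_{\text {manip}}\) as well as in \(R_{\text {reach}}\).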