1 Introduction

The domain of attraction (DoA) of a stable equilibrium of a nonlinear system is the region of the state space from which every trajectory eventually converges to that equilibrium. In the literature, the DoA is also known as the region of attraction or basin of attraction [1, 33]. Computing the DoA of an equilibrium is of central importance in control applications. However, in most cases, this computation is quite costly. This paper aims to approximate the DoAs of nonlinear systems in real time by introducing a sampling approach.

Several techniques have been proposed in the literature to compute an inner approximation for the DoA [8], which can broadly be classified into Lyapunov-based and non-Lyapunov methods [11]. Lyapunov-based approaches include, for instance, sum of squares (SOS) programming [4], methods that apply both simulation and SOS programming [32], and procedures that use the theory of moments [13]. In these approaches, first, a candidate Lyapunov function is chosen to show asymptotic stability of the system within a small neighborhood of the equilibrium. Next, the largest sublevel set of this Lyapunov function, in which its time derivative is negative definite, is computed as an estimate for the DoA [25]. Non-Lyapunov methods include, for instance, trajectory reversing [11, 24], determining reachable sets of the system [2], and occupation measures [14, 18]. Figure 1 illustrates a broad classification of the existing techniques for estimating the DoA.

Fig. 1

A broad classification of the existing techniques for estimating the DoA. This paper proposes a sampling approach and makes a comparison with optimization-based methods

Although Lyapunov-based techniques have been successfully implemented for estimating the DoAs of various nonlinear systems [8], there are still two main issues with these approaches. The first is that most of the existing methods are limited to polynomial systems [12, 31]. As such, in the case of non-polynomial systems, the equations of motion are first approximated by a Taylor expansion and the DoA is then computed from the approximated polynomial equations. The second is that the available methods are usually computationally costly and time-consuming, which makes them unsuitable for real-time applications [7].

This paper proposes a fast sampling approach for Lyapunov-based techniques to estimate the DoAs of various nonlinear systems. This method is computationally effective and is beneficial for real-time applications. In this procedure, once a candidate Lyapunov function is chosen, a sampling algorithm searches for the largest sublevel set of the Lyapunov function such that its time derivative is negative definite throughout the obtained sublevel set. The proposed sampling approach is applied to approximate the DoAs of several nonlinear systems, which have been already investigated in the literature, to validate its capability in comparison with the existing methods. In addition, we go beyond these examples and implement it to compute the DoAs of the passivity-based learning controller [28] designed for an inverted pendulum.

This paper is organized as follows. Section 2 reviews the process of estimating the DoAs of nonlinear systems using Lyapunov-based techniques. Section 3 describes the sampling approach and provides a comparison between the estimated DoAs computed by the sampling method and by the existing optimization-based methods. Section 4 illustrates the DoAs approximated for the controller learned for an inverted pendulum. Finally, Sect. 5 concludes the paper after a short discussion on the properties of the sampling algorithm.

2 Estimating the domain of attraction using Lyapunov-based methods

Consider the dynamical system

$$\begin{aligned} \dot{x}=f(x,u) \end{aligned}$$
(1)

where \(x\in \mathcal {X}\subseteq {\mathbb {R}}^n\) is the state vector, \(u\in \mathcal {U}\subseteq \mathbb {R}^m\) is the control input, and \(f:\mathcal {X}\times {\mathcal {U}}\rightarrow {\mathbb {R}}^n\) is the system dynamics. For a specific state-feedback controller \(\varPhi (x)\) the closed-loop system is described by

$$\begin{aligned} {\dot{x}}=f(x,\varPhi (x))=f_{c}(x). \end{aligned}$$
(2)

If \(x^{*}\) is a stable equilibrium of the closed-loop system and \(x(t,x_0)\) denotes the solution of (2) at time t for the initial condition \(x_0\), the DoA of controller \(\varPhi \) is defined by the set

$$\begin{aligned} {\mathcal {D}}(\varPhi )=\left\{ x_0\in \mathcal {X}:\lim _{t\rightarrow \infty }x(t,x_0)= x^{*}\right\} . \end{aligned}$$
(3)

An analytical method to approximate the DoA is defined via Lyapunov stability theory as follows [6, 16].

Theorem 1

[16] A closed set \({\mathcal {M}}\subset {\mathbb {R}}^n\), including the origin as an equilibrium, can approximate the DoA for the origin of system (2) if:

  1. \(\mathcal {M}\) is an invariant set for system (2);

  2. A positive definite function V(x) can be found such that \({\dot{V}}(x)\) is negative definite within \(\mathcal {M}\).

If the equilibrium is nonzero, without loss of generality, we can replace the variable x by \(z=x-\bar{x}^{*}\), where \(\bar{x}^{*}\) is the nonzero equilibrium. As such, we can study the stability of the associated zero equilibrium [1]. The conditions of Theorem 1 ensure that the approximated set \(\mathcal {M}\) is certainly contained in the DoA.

The choice of a candidate Lyapunov function is not a trivial task and the DoA approximation relies on the shape of the Lyapunov function’s level sets. A procedure to find an appropriate Lyapunov function has been proposed in [10], where gradient search algorithms are implemented to compute a candidate Lyapunov function. Moreover, using composite polynomial Lyapunov functions [29] and rational Lyapunov functions instead of quadratic ones might lead to better approximations, since these have a richer representation power (see, e.g., [9, 34]). Quadratic Lyapunov functions restrict the estimates to ellipsoids which are quite conservative [30]. A rational Lyapunov function is written in the form

$$\begin{aligned} V(x)=\frac{N(x)}{D(x)}=\frac{\sum \nolimits _{i=2}^\infty R_i(x)}{1+\sum \nolimits _{i=1}^{n-2} Q_i(x)} \end{aligned}$$
(4)

where \(R_i(x)\) and \(Q_i(x)\) are homogeneous polynomials of degree i, which are constructed by solving an optimization problem [34]. The sublevel set \(\mathcal {V}(c)\) of the Lyapunov function V(x) is defined by

$$\begin{aligned} \mathcal {V}(c)=\{x\in {\mathcal {X}}: V(x)\le c\}. \end{aligned}$$
(5)

According to Theorem 1, any sublevel set of a candidate Lyapunov function that establishes local asymptotic stability of the equilibrium can be an estimate for the DoA if the time derivative of the Lyapunov function is negative everywhere within the sublevel set. Since a larger sublevel set provides a more accurate estimate, the problem of approximating the DoA is converted to the problem of finding the largest sublevel set of a given Lyapunov function [15]. To attain the largest estimate for the DoA, one needs to find the maximum value \(c\in {\mathbb {R}}\) for \({\mathcal {V}}(c)\) such that the computed set satisfies the conditions of Theorem 1.

Theorem 2

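[8] The invariant set \({\mathcal {V}}(c_{*})\), which is a sublevel set of the Lyapunov function V(x), is the largest estimate of the DoA for the origin of system (2) if

$$\begin{aligned} c_{*}=\max \; c \quad \text {s.t.}\quad \dot{V}(x)<0 \quad \forall x\in \mathcal {V}(c)\setminus \{0\}. \end{aligned}$$
(6)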

This can be approached as an optimization problem, which has been solved by using SOS programming, methods that apply both simulation and SOS programming, and methods that use the theory of moments. However, these techniques are typically restricted to systems and Lyapunov functions described by polynomial equations. In this paper, we present an alternative approach based on sampling.

3 Sampling method for estimating the domain of attraction

The sampling approach presented in this paper has the same goal as the Lyapunov-based optimization approaches have: Find the largest sublevel set of a candidate Lyapunov function to approximate the DoA. We explicitly evaluate the conditions stated in Theorem 1 for a given Lyapunov function with respect to a randomly chosen state \(x_i\). The level sets associated with the sample \(x_i\) with positive derivative of the Lyapunov function are discarded. We propose two sampling methods, memoryless and with a memory, designed to achieve tighter estimates.

3.1 Memoryless sampling

This method searches for an upper bound of the parameter \(c_{*}\) in (6). At the beginning of the algorithm, the upper bound, denoted \(\hat{c}_{*}\), is initialized at \(\hat{c}_{*}=\infty \). A state \(x_i\) is then randomly chosen within \(\mathcal {X}\) or a user-defined subset of it, and the conditions of Theorem 1 are checked for \(V(x_i)\) and \(\dot{V}(x_i)\). If these conditions are not satisfied and \(V(x_i)<\hat{c}_{*}\), the upper bound is decreased to \(\hat{c}_{*}=V(x_i)\) and the sublevel set \({\mathcal {V}}(\hat{c}_{*})\) is taken as an overestimate of the DoA. As the sampling proceeds for a large number of samples (\(n_{\mathrm {s}}\)) throughout the state space, the value of \(\hat{c}_{*}\) converges to \(c_{*}\) from above and the obtained sublevel set \(\mathcal {V}(\hat{c}_{*})\) approaches \(\mathcal {V}(c_{*})\). Since this procedure only tracks an upper bound of \(c_{*}\), the computed value \(\hat{c}_{*}\) is in general larger than the real value \(c_{*}\); the estimate is therefore not tight, and the condition \(\dot{V}(x)<0\) may be violated in some regions of the obtained sublevel set. In particular, the algorithm may fail to exclude very small regions where \(\dot{V}(x) \ge 0\) from the approximated DoA. However, the empirical evidence from extensive simulations suggests that, in practice, the proposed algorithm converges to the exact level set for a sufficiently large number of samples.

Based on the practical results, we found that this technique is very fast and its result is very close to the reported estimates in the literature for various classes of systems. Moreover, it does not require computer memory to save the results computed since once a new value is computed for \(\hat{c}_{*}\), its current value is replaced by the new value. Algorithm 1 summarizes this method for estimating the DoA of a given stable equilibrium.

figure a

As an example, consider a pendulum described by the following nonlinear dynamic equations

$$\begin{aligned} \left\{ \begin{array}{l} \dot{x}_1= x_2 \\ \dot{x}_2= -\sin (x_1)-0.5 x_2 \end{array}\right. \end{aligned}$$
(7)

where \(x_1\) is the angle of the pendulum measured from the vertical axis and \(x_2\) is the angular velocity. The state vector is defined by \(x=[x_1~x_2]^T\). We use the sampling method with a uniform distribution to approximate the DoA of the stable equilibrium \(x=(0,0)\). To compute a candidate Lyapunov function, the dynamic equations (7) are first linearized around the equilibrium and the candidate Lyapunov function is then computed in the form \(V(x)=x^TPx\), where P is the solution of the Lyapunov equation \(A^TP+PA+Q=0\) with Q the identity matrix and A the state matrix of the linearized system. In this example, the candidate Lyapunov function is obtained as

$$\begin{aligned} V(x)=2.25x_1^2+x_1 x_2+2x_2^2. \end{aligned}$$
(8)
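As a worked check of (8), a sketch assuming the linearization \(\sin (x_1)\approx x_1\) of (7) around the origin: the matrices below satisfy \(A^TP+PA+Q=0\) with \(Q=I\), and differentiating (8) along the trajectories of the nonlinear system (7) gives the expression for \(\dot{V}(x)\) that a sampling procedure has to evaluate at each sample.

$$\begin{aligned} A=\begin{bmatrix} 0 & 1 \\ -1 & -0.5 \end{bmatrix},\quad P=\begin{bmatrix} 2.25 & 0.5 \\ 0.5 & 2 \end{bmatrix},\quad V(x)=x^TPx=2.25x_1^2+x_1x_2+2x_2^2 \end{aligned}$$

$$\begin{aligned} \dot{V}(x)&=(4.5x_1+x_2)\,x_2+(x_1+4x_2)\left( -\sin (x_1)-0.5x_2\right) \\ &=4x_1x_2-x_2^2-(x_1+4x_2)\sin (x_1) \end{aligned}$$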

Figure 2 illustrates the evolution of \(\hat{c}_{*}\) of the sampling approach with \(n_{\mathrm {s}}=500\) samples. The real value \(c_{*}\) for the candidate Lyapunov function (8), calculated by solving the optimization problem (6), is \(c_{*}=9.287\) and the value computed by our method is \(\hat{c}_{*}=9.702\).
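The memoryless procedure can be illustrated with a minimal Python sketch for the pendulum (7) and the candidate function (8). The sampling box \([-4,4]^2\), the seed, and all function names are assumptions made for illustration; they are not taken from Algorithm 1, and the sketch is not the implementation used to produce the reported values.

```python
import math
import random


def V(x1, x2):
    """Candidate Lyapunov function (8)."""
    return 2.25 * x1**2 + x1 * x2 + 2.0 * x2**2


def V_dot(x1, x2):
    """Derivative of (8) along the pendulum dynamics (7): grad V . f."""
    f1 = x2
    f2 = -math.sin(x1) - 0.5 * x2
    return (4.5 * x1 + x2) * f1 + (x1 + 4.0 * x2) * f2


def memoryless_sampling(n_samples, half_width=4.0, seed=0):
    """Return the upper bound c_hat after n_samples uniform samples.

    half_width defines the (assumed) sampling box [-half_width, half_width]^2.
    """
    rng = random.Random(seed)
    c_hat = math.inf  # initial upper bound
    for _ in range(n_samples):
        x1 = rng.uniform(-half_width, half_width)
        x2 = rng.uniform(-half_width, half_width)
        v = V(x1, x2)
        # A sample violating V_dot < 0 rules out every level c >= V(x).
        if v > 0.0 and V_dot(x1, x2) >= 0.0 and v < c_hat:
            c_hat = v
    return c_hat


if __name__ == "__main__":
    print(memoryless_sampling(500))
```

Only a single number is kept between samples, which is why this variant needs no storage; the price is that the returned value can only approach \(c_{*}\) from above.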

Fig. 2

Evolution of \(\hat{c}_{*}\) using the memoryless sampling method for the pendulum example

3.2 Sampling with memory

This method updates both the lower and the upper bounds of \(c_{*}\) denoted \(\underline{\mathrm{c}}_{*}\) and \(\bar{c}_{*}\), respectively. Together, these bounds yield a more accurate estimate for the DoA. At the beginning of the algorithm, the lower bound of \(c_{*}\) is set to \(\underline{\mathrm{c}}_{*}= 0\) and its upper bound to \(\bar{c}_{*}=\infty \). If for a randomly chosen state \(x_i\) we have \(\dot{V}(x_i)<0\) and \( \underline{\mathrm{c}}_{*}<V(x_i)<\bar{c}_{*}\), then the value of \(\underline{\mathrm{c}}_{*}\) is replaced by the value of its associated Lyapunov function, that is \(\underline{\mathrm{c}}_{*} = V(x_i)\). Otherwise, if \({\dot{V}}(x_i) \ge 0\) and \(V(x_i)<\bar{c}_{*}\), then the value of \(\bar{c}_{*}\) is replaced by \(V(x_i)\). As the sampling proceeds, after a large number of samples, the value of \(\underline{\mathrm{c}}_{*}\) increases, but not necessarily monotonically. Eventually it converges to \(c_{*}\) and the largest sublevel set \(\mathcal {V}(\underline{\mathrm{c}}_{*})\) is obtained. Moreover, the value of \(\bar{c}_{*}\) monotonically decreases and converges to \(c_{*}\) from above.

When the conditions of Theorem 1 are satisfied for state \(x_i\), the value of \(V(x_i)\) is stored in an array as a possible estimate for \(c_{*}\). This is required to guarantee that the approximated DoAs computed from the lower bound of \(c_{*}\) always verify the conditions of Theorem 1, and it leads to tighter estimates. The array, denoted \(\mathcal {E}\), contains 0 initially. The length of this array, without counting its initial element, is in the worst case \(n_\mathrm {s}-1\). When \(\dot{V}(x_i)<0\) and \(V(x_i)<\bar{c}_{*}\), the value of \(V(x_i)\) is stored in \(\mathcal {E}\), since \(\mathcal {V}\left( V(x_i)\right) \) is a potential estimate for the DoA. In the case \(\dot{V}(x_i)\ge 0\) and \(V(x_i)<\bar{c}_{*}\), the upper bound is updated; if this results in \(\underline{\mathrm{c}}_{*} \ge \bar{c}_{*}\), the algorithm looks for a new lower bound \(\underline{\mathrm{c}}_{*}\) among the values stored in \(\mathcal {E}\). The new \(\underline{\mathrm{c}}_{*}\) is the maximum value in \(\mathcal {E}\) such that \(\underline{\mathrm{c}}_{*}<\bar{c}_{*}\). Selecting a previously stored lower bound satisfies the condition \(\dot{V}<0\) at the samples observed so far in the obtained sublevel set \(\mathcal {V}(\underline{\mathrm{c}}_{*})\). In the worst-case scenario, \(\underline{\mathrm{c}}_{*}=0\).

Although the sampling algorithm with memory is a conservative method, it may exceptionally overestimate the DoA, for instance, when the region described by \({\dot{V}}(x)<0\) is not simply connected. In such a case, the algorithm may not exclude small holes inside the region in which \({\dot{V}}(x)\ge 0\). A formal guarantee for convergence of this algorithm does not exist yet, but the empirical result attained from extensive simulations and experiments illustrates that the sampling technique converges to the exact level set for a sufficiently large number of samples, in practice. Algorithm 2 describes the sampling method with memory for estimating the DoA.

figure b
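The update rules described in this subsection can be sketched in Python as follows, again for the pendulum (7) with the candidate function (8). The sampling box, the seed, and the names `sampling_with_memory` and `stored` (playing the role of the array \(\mathcal {E}\)) are assumptions for illustration; the sketch follows the textual description and is not a transcription of Algorithm 2.

```python
import math
import random


def V(x1, x2):
    """Candidate Lyapunov function (8)."""
    return 2.25 * x1**2 + x1 * x2 + 2.0 * x2**2


def V_dot(x1, x2):
    """Derivative of (8) along the pendulum dynamics (7)."""
    return (4.5 * x1 + x2) * x2 + (x1 + 4.0 * x2) * (-math.sin(x1) - 0.5 * x2)


def sampling_with_memory(n_samples, half_width=4.0, seed=0):
    """Return (lower, upper) bounds on c* after n_samples uniform samples."""
    rng = random.Random(seed)
    lower, upper = 0.0, math.inf
    stored = [0.0]  # array E: levels at which V_dot < 0 was observed
    for _ in range(n_samples):
        x1 = rng.uniform(-half_width, half_width)
        x2 = rng.uniform(-half_width, half_width)
        v, vd = V(x1, x2), V_dot(x1, x2)
        if v <= 0.0 or v >= upper:
            continue  # the origin, or a level that is already excluded
        if vd < 0.0:
            stored.append(v)  # potential estimate of c*
            if v > lower:
                lower = v
        else:
            upper = v  # V_dot >= 0: tighten the upper bound
            if lower >= upper:
                # Fall back to the largest stored level below the new bound.
                lower = max(e for e in stored if e < upper)
    return lower, upper


if __name__ == "__main__":
    print(sampling_with_memory(500))
```

Because `stored` always contains 0 and every accepted sample has a positive level, the fallback in the last branch always finds a candidate, which mirrors the worst case \(\underline{\mathrm{c}}_{*}=0\).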

We apply this approach with a uniform distribution sampling to approximate the DoA for the equilibrium of the pendulum example. Figure 3 illustrates the values of the lower and upper bounds of \(c_{*}\) throughout the sampling process with 500 samples where \(\underline{\mathrm{c}}_{*}=9.174\).

Fig. 3

Evolution of \(\underline{\mathrm{c}}_{*}\) and \(\bar{c}_{*}\) using the sampling method with memory for the pendulum example

Figure 4 depicts the approximated DoA of the equilibrium. The black ellipsoid represents the DoA estimate with \(\underline{\mathrm{c}}_{*}=9.271\), the dashed blue line, which determines the boundary of the light blue area, represents the region in which \(\dot{V}(x)<0\), and the arrows represent the system trajectories. If the trajectories start inside the DoA estimate, they certainly converge to the origin. The randomly chosen sampling states, which are 500 samples in this example, are represented by red points throughout the state space.

Fig. 4

Approximated DoA for the pendulum example using a uniform distribution for sampling. The black ellipsoid represents the DoA estimate, the dashed blue line (the boundary of the light blue area) represents the region in which \(\dot{V}(x)<0\), the arrows represent the system trajectories, and the red points represent the randomly chosen sampling states. (Color figure online)

3.3 Repeatability of the sampling method

To check the repeatability of the proposed sampling approach, we run multiple instances of the DoA estimation process for the equilibrium of the pendulum example. Figure 5 shows, at each sample, the mean value of \(\underline{\mathrm{c}}_{*}\) and \(\bar{c}_{*}\) (i.e., \((\underline{\mathrm{c}}_{*}+\bar{c}_{*})/2\)) as a black line, its standard deviation as green bars, and the minimum of \(\underline{\mathrm{c}}_{*}\) and the maximum of \(\bar{c}_{*}\) as blue dashed lines, for a simulation in which the sampling method runs 1000 iterations, each with 500 samples. The real value of \(c_{*}=9.287\) is represented by a dotted red line. As sampling proceeds, the mean, minimum and maximum values converge to the real value of \(c_{*}\) and the standard deviation decreases. These results validate the repeatability of the proposed sampling technique for this particular model.

Fig. 5

Evolution of mean value of \(\underline{\mathrm{c}}_{*}\) and \(\bar{c}_{*}\) and its standard deviation, minimum value of \(\underline{\mathrm{c}}_{*}\), and maximum value of \(\bar{c}_{*}\) for the sampling method in the pendulum example. The real value of \(c_{*}\) is represented by the dotted red line. (Color figure online)
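A repeatability experiment of this kind can be sketched in Python as below. The sketch only reports the statistics of the final midpoint \((\underline{\mathrm{c}}_{*}+\bar{c}_{*})/2\) over independent runs rather than its evolution at every sample, and the sampling box, the seed, and the names `one_run` and `repeatability` are assumptions for illustration.

```python
import math
import random
import statistics


def V(x1, x2):
    """Candidate Lyapunov function (8)."""
    return 2.25 * x1**2 + x1 * x2 + 2.0 * x2**2


def V_dot(x1, x2):
    """Derivative of (8) along the pendulum dynamics (7)."""
    return (4.5 * x1 + x2) * x2 + (x1 + 4.0 * x2) * (-math.sin(x1) - 0.5 * x2)


def one_run(n_samples, rng, half_width=4.0):
    """One run of sampling with memory; returns (lower, upper)."""
    lower, upper, stored = 0.0, math.inf, [0.0]
    for _ in range(n_samples):
        x1 = rng.uniform(-half_width, half_width)
        x2 = rng.uniform(-half_width, half_width)
        v, vd = V(x1, x2), V_dot(x1, x2)
        if v <= 0.0 or v >= upper:
            continue
        if vd < 0.0:
            stored.append(v)
            lower = max(lower, v)
        else:
            upper = v
            # Largest stored level that is still below the new upper bound.
            lower = max(e for e in stored if e < upper)
    return lower, upper


def repeatability(n_runs, n_samples, seed=0):
    """Mean and standard deviation of the final midpoint over n_runs runs."""
    rng = random.Random(seed)
    midpoints = []
    for _ in range(n_runs):
        lower, upper = one_run(n_samples, rng)
        if math.isfinite(upper):  # skip runs that never saw V_dot >= 0
            midpoints.append(0.5 * (lower + upper))
    return statistics.mean(midpoints), statistics.stdev(midpoints)


if __name__ == "__main__":
    print(repeatability(n_runs=100, n_samples=500))
```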

3.4 Directed sampling

In the pendulum example, we used a uniform distribution for sampling the state space or a subset of it. However, if the structure of the level sets of the Lyapunov function is known, other distributions can be used to avoid sampling in areas of the state space that are already known not to belong to the DoA. It is desirable to sample inside the largest level set found so far, especially near its boundary.

In general, sampling with an arbitrary distribution is a challenging problem. Two main approaches exist in the literature: rejection sampling and inverse transform sampling [3]. These techniques focus on sampling the relevant locations of the state space at the cost of computational complexity. In situations where evaluating a particular sample is costly (due to a complicated Lyapunov function or system dynamics), the extra cost incurred by sampling from a complex distribution may be negligible.

To test the trade-off between the speed of convergence and the computational cost, we have applied three different sampling approaches to the pendulum example (7). Uniform sampling on a fixed box (a subset of the state space) is compared with uniform sampling mapped through polar coordinates to lie inside the largest valid level set found so far, and with exponential sampling mapped through polar coordinates to lie around the boundary of the largest valid level set found so far. Figure 6 illustrates the sampling points selected by the three types of distributions. The obtained data corroborate the hypothesis that different sampling distributions lead to different convergence rates. Figure 7 presents the convergence statistics for 1000 iterations with 500 samples each. The exponential polar sampling converges the fastest and has the lowest variation between \(\underline{\mathrm{c}}_{*}\) and \(\bar{c}_{*}\) while converging. This can be explained by observing in Fig. 6c that most of the samples are concentrated around the boundary of the level set. For this particular example, the cost of evaluating the Lyapunov function and its time derivative is low, but the computation time increases with the complexity of the sampling algorithm. Table 1 shows the average computation time of each sampling method with 500 samples, implemented in the Mathematica software on an Intel Core i7 2.7 GHz microprocessor.
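For a quadratic function \(V(x)=x^TPx\), sampling relative to a level set \(\mathcal {V}(c)\) through polar coordinates can be sketched in Python as follows. The mapping through the Cholesky factor of P, the two-sided exponential radial law, the rate parameter, and all function names are assumptions chosen for illustration; the exact distributions used for Figs. 6 and 7 are not specified in the text.

```python
import math
import random

# V(x) = x^T P x with P taken from (8)
P = ((2.25, 0.5), (0.5, 2.0))


def V(x1, x2):
    return P[0][0] * x1 * x1 + 2.0 * P[0][1] * x1 * x2 + P[1][1] * x2 * x2


def from_polar(c, r, theta):
    """Map (r, theta) to a state x with V(x) = c * r**2.

    Uses the Cholesky factor P = L L^T and solves L^T x = sqrt(c) * r * (cos, sin).
    """
    l11 = math.sqrt(P[0][0])
    l21 = P[0][1] / l11
    l22 = math.sqrt(P[1][1] - l21**2)
    y1 = math.sqrt(c) * r * math.cos(theta)
    y2 = math.sqrt(c) * r * math.sin(theta)
    x2 = y2 / l22
    x1 = (y1 - l21 * x2) / l11
    return x1, x2


def polar_uniform(c, rng):
    """Uniform sample inside the ellipse V(x) <= c."""
    r = math.sqrt(rng.random())  # sqrt gives uniform area density
    return from_polar(c, r, rng.uniform(0.0, 2.0 * math.pi))


def polar_exponential(c, rng, rate=10.0):
    """Sample concentrated on both sides of the boundary V(x) = c (assumed law)."""
    offset = rng.expovariate(rate) * rng.choice((-1.0, 1.0))
    r = max(0.0, 1.0 + offset)
    return from_polar(c, r, rng.uniform(0.0, 2.0 * math.pi))


if __name__ == "__main__":
    rng = random.Random(0)
    print(polar_uniform(9.0, rng), polar_exponential(9.0, rng))
```

In a directed variant of the sampling method, the level c passed to these samplers would be the current bound on \(c_{*}\); sampling on both sides of the boundary keeps the upper bound able to decrease as well.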

Fig. 6

Approximated DoAs for the pendulum example using (a) a uniform, (b) a polar uniform, and (c) a polar exponential distribution for sampling. In the plots, the black ellipsoid represents the DoA estimate, the dashed blue line represents the boundary of the region in which \(\dot{V}(x)<0\), the arrows represent the system trajectories, and the red points represent the randomly chosen sampling states. (Color figure online)

Fig. 7

Evolution of the mean value of \(\underline{\mathrm{c}}_{*}\) and \(\bar{c}_{*}\), minimum value of \(\underline{\mathrm{c}}_{*}\), and maximum value of \(\bar{c}_{*}\) for the sampling technique implemented for the pendulum example with a uniform, polar uniform, and polar exponential distribution. The sampling method runs 1000 iterations each with 500 samples. The real value of \(c_{*}\) is represented by a dashed black line

3.5 Sampling method versus optimization-based methods

Both the sampling and optimization-based methods require a candidate Lyapunov function for estimating the DoA. Table 2 lists six dynamical systems with quadratic Lyapunov functions selected from the literature. The dynamic equations of the first three examples are polynomial and the equations of the last three are non-polynomial. Examples E3 and E6 are third-order systems and the others are second-order systems. For each system, the maximum possible value of \(c_{*}\) computed by the sampling approach with 1000 samples is compared with the result of optimization-based methods reported in the literature. The estimates attained by the sampling technique are very close to the estimates derived by optimization-based methods. In some cases, such as example E2, the result of the sampling procedure is even more accurate. The last column of Table 2 presents the simulation time for approximating the DoA of each system using the sampling approach, implemented in the MATLAB R2014a software on an Intel Core i7 2.7 GHz microprocessor.

Table 1 Computation time statistics of the sampling methods with various distributions for estimating the DoA of the pendulum example

Similarly, Table 3 illustrates three dynamical systems with rational Lyapunov functions selected from the literature. Example E7 is a second-order polynomial system, E8 is a second-order non-polynomial system, and E9 is a third-order polynomial system. Table 4 presents their corresponding rational Lyapunov functions based on (4). The maximum possible value of \(c_{*}\) obtained by the sampling approach with 1000 samples is compared with the result of optimization-based methods, reported in the literature. The result of this comparison validates the proposed sampling technique particularly for non-polynomial systems. The simulation time for approximating the DoA of each system using the sampling procedure is given in the last column of Table 3. Figure 8 depicts the approximated DoAs obtained by the sampling method for the origins of examples E1–E9.

Table 2 Dynamical systems with quadratic Lyapunov functions
Table 3 Dynamical systems with rational Lyapunov functions
Table 4 Rational Lyapunov functions for the systems of Table 3

Based on the obtained results, it is concluded that the proposed sampling approach is suitable for estimating the DoAs of both polynomial and non-polynomial systems. It is computationally efficient and computes the DoA estimate quickly. Although the sampling method may at times offer less accurate estimates for the DoA, it is very useful for real-time applications. It is also beneficial for control schemes that rely on the controllers’ DoAs, such as online sequential composition approaches [21–23].

4 Experimental results

Consider the inverted pendulum, shown in Fig. 9, which is modeled by the nonlinear differential equation

$$\begin{aligned} J{\ddot{q}}=mgl \sin (q)-\Big (b+\frac{K^2}{R}\Big ){\dot{q}}+\frac{K}{R}u \end{aligned}$$
(9)

where q is the angle of the pendulum measured from the upright position, J is the pendulum inertia, m is the mass, l is the pendulum length, and b is the viscous mechanical friction. Moreover, K is the motor constant, R is the electrical motor resistance, and u is the control input in Volts which is saturated at \(\pm 3\) V. The state vector of the system is defined by \(x=[q~p]^T\) with \(p=J\dot{q}\) the angular momentum. Table 5 presents the values of the physical parameters of the pendulum. These values have been found partly by measuring and partly estimated using nonlinear system identification.
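Written in the state variables \(x=[q~p]^T\), and obtained by directly substituting \(\dot{q}=p/J\) into (9) (a restatement for convenience that ignores the input saturation), the model reads

$$\begin{aligned} \dot{q}=\frac{p}{J},\qquad \dot{p}=mgl\sin (q)-\Big (b+\frac{K^2}{R}\Big )\frac{p}{J}+\frac{K}{R}u. \end{aligned}$$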

The algebraic interconnection and damping assignment actor-critic (A-IDA-AC) algorithm, proposed in [20], is implemented to obtain swing-up and stabilization of the pendulum at the desired upper equilibrium \(x_{\mathrm {d}}=(q_{\mathrm {d}},p)=(0,0)\). The goal of this algorithm is to find a proper control input after a number of learning trials. Monitoring the DoA of the learned controller at every trial provides a stopping criterion to terminate learning once the DoA is large enough to fulfill the control objective. This leads to learning in a short amount of time. The parameterized control policy is given by

$$\begin{aligned} \hat{\pi }(x,\vartheta ) =-\vartheta ^{T}\varPsi (x)\gamma (q-q_{\mathrm {d}})-mgl\sin (q) \end{aligned}$$
(10)

where \(\vartheta \in {\mathbb {R}}^n\) is a parameter vector, \(\varPsi \in {\mathbb {R}}^n\) is a user-defined basis function vector, and \(\gamma \) is a unit conversion factor with the value of one. The parameter vector \(\vartheta \) is updated using the actor-critic reinforcement learning (RL) method by following the procedure described in [20]. Consequently, the saturated control input of the A-IDA-AC algorithm is computed at each time step by

$$\begin{aligned} u_k=\text {sat}\left( \hat{\pi }(x_k,\vartheta _k)+\varDelta u_k\right) \end{aligned}$$
(11)

where \(\varDelta u_k\) is a zero-mean Gaussian noise, as an exploration term.

The desired system Hamiltonian is chosen in the quadratic form

$$\begin{aligned} H_{\mathrm {d}}(x) = \frac{1}{2}\gamma (q-q_{\mathrm {d}})^2+\frac{p^2}{2J}. \end{aligned}$$
(12)

We exploit the desired system Hamiltonian as a candidate Lyapunov function to approximate the DoAs of the learned controllers at each learning trial [17]. Figure 10 illustrates the approximated DoAs of the learned controllers computed by the sampling approach at seven specific trials, where the trial numbers are also given. While learning is in progress, the DoAs of the learned controllers typically grow around the upright equilibrium, but not necessarily monotonically. In this example, after 35 trials, the DoA of the controller is large enough to cover the initial state; hence, the learning process can be terminated early instead of running for all the scheduled trials. As such, the sampling method speeds up the process of learning controllers.
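The stopping criterion can be sketched in Python as a membership test of the initial state in the sublevel set of \(H_{\mathrm {d}}\) estimated at each trial. The inertia value, the initial state \((\pi ,0)\) (pendulum hanging down), and the per-trial level values in the example are hypothetical placeholders, not the parameters of Table 5 or measured results.

```python
import math


def desired_hamiltonian(q, p, q_d=0.0, gamma=1.0, inertia=1.0e-4):
    """Desired Hamiltonian (12); the inertia value is a hypothetical placeholder."""
    return 0.5 * gamma * (q - q_d) ** 2 + p**2 / (2.0 * inertia)


def doa_covers(state, c_lower):
    """True if the state lies in the estimated sublevel set H_d(x) <= c_lower."""
    q, p = state
    return desired_hamiltonian(q, p) <= c_lower


def first_covering_trial(c_per_trial, initial_state):
    """Return the first trial (1-based) whose DoA estimate covers the initial state."""
    for trial, c_lower in enumerate(c_per_trial, start=1):
        if doa_covers(initial_state, c_lower):
            return trial
    return None  # the estimate never covered the initial state


if __name__ == "__main__":
    # Hypothetical sequence of estimated levels, one per learning trial.
    levels = [1.0, 3.0, 2.5, 5.0, 6.0]
    print(first_covering_trial(levels, (math.pi, 0.0)))
```

The per-trial levels need not increase monotonically, so the check is evaluated at every trial rather than extrapolated from earlier ones.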

Fig. 8

Approximated DoAs for the origins of examples E1–E9 described in Tables 2 and 3 using the sampling method

Fig. 9

Inverted pendulum and its schematic representation

Table 5 Physical parameters of the inverted pendulum
Fig. 10

Approximated DoAs of the learned controllers at seven specific trials for the inverted pendulum, where the trial numbers are also indicated

5 Conclusions

This paper has proposed a fast sampling approach for estimating the DoAs of nonlinear systems in real time. The approximated DoAs computed by this technique have been compared with the estimates derived by optimization-based methods. It is concluded that the sampling approach is fast and computationally efficient in comparison with optimization-based methods and that it can be used for real-time applications. Although a formal guarantee for convergence does not exist yet, the empirical evidence from extensive simulations suggests that in practice this approach converges to the exact level set for a sufficiently large number of samples. Moreover, the rate of convergence depends on the distribution function selected for sampling as well as on the explored regions of the state space. Using a more sophisticated distribution function can speed up convergence of the sampling procedure, since it can avoid sampling in areas of the state space that are already known not to belong to the DoA. As such, there is a trade-off between the speed of convergence and the computational cost imposed by the complexity of the sampling distribution function.

In addition, the sampling approach has been applied to approximate the DoAs of a passivity-based learning controller, designed for an inverted pendulum system, at every learning trial. This online approximation can be used as a stopping criterion for the learning process. This allows learning to be terminated as soon as the controller’s DoA is sufficiently large to satisfy the control objective. Thus, the proposed sampling method enables learning in a short amount of time.