1 Introduction

Capsule robots, often abbreviated to capsubots, are a class of capsule-shaped micro-robots able to explore environments that are normally inaccessible to humans [1]. A particularly interesting subset of capsubots are devices propelled by an internal mechanical oscillator. The vibrating mass produces inertial forces that allow the whole capsule to move in the presence of friction, so external moving parts such as wheels, tracks, robotic legs or arms are no longer necessary [1, 2]. Such capsubots are highly attractive from a practical point of view due to their potential in medicine, engineering and other areas [1, 2]. Moreover, their rich dynamics [2], encompassing phenomena such as impacts and dry friction, remains a broad research topic in itself.

A wide range of research concerning the analysis and design of capsubots has already been performed. For instance, Guo et al. [2] presented a mesoscale prototype of a self-propelled vibro-impact capsule system, together with its optimization in terms of the average progression velocity, energy efficiency and power consumption. Huda and Yu [3] developed a control strategy for a capsubot in which the inner mass of the device—two masses placed at the opposite ends of a cylindrical rod—is surrounded by a motor housing with a coil held in a shell. In [4], Liu et al. proposed a vibro-impact capsule containing a harmonically excited internal oscillator impacting a massless plate suspended on a spring. Such an arrangement makes the resultant horizontal force acting on the capsule asymmetric, which in the presence of dry friction enables the system to move forward. More detailed studies on the control function with regard to the progression rate, as well as the optimization of energy consumption, have been presented in [5,6,7]. A comprehensive bifurcation analysis of a vibro-impact system using path-following methods, accompanied by an experimental investigation, can be found in [8]. Liu et al. [9] described a downscaled self-propelled vibro-impact capsule system able to move precisely in a limited space, with a size equal to that of a market-leading gastrointestinal capsule endoscope. The capsule system includes two impact constraints with a linear bearing holding a T-shaped magnet situated between them, which restricts its linear motion. The dynamic analysis of the prototype, as well as the optimization of the progression speed and the minimization of the required propulsive force, are described therein.

Apart from the vibro-impact systems, a different layout of the capsule drive can be utilized. In particular, an interesting modification of the vibro-impact capsule can be obtained by replacing the mass-on-spring oscillator with a pendulum. In this case, propulsion of the capsule is caused by the interactions between friction, the inertial forces produced by the swinging motion of the pendulum and the contact force between the capsule and the underlying surface [10]. Such an arrangement makes the dynamics of the system somewhat more complex, because the contact force depends on the oscillations of the pendulum. Periodic locomotion principles and the nonlinear dynamics of a pendulum-driven capsule system have been investigated in [10,11,12]. It is worth noting that in [12], Liu included a motion-generation strategy in the presence of viscoelasticity. The design and optimization of the parameters of a pre-designed control function profile for the pendulum capsule system have been considered in [12,13,14].

The existing methods of controller design applicable to capsubots use various approaches, including open-loop control, closed-loop feedback linearization and neural networks. In [1], Liu et al. presented three control approaches for capsubots. The first involves open-loop control, whereas the second utilizes closed-loop control with a partial feedback linearization technique based on trajectory tracking. The last one, called simple switch control, is a combination of the previous two: the control profiles learned from the open- and closed-loop control are used to move the capsubot effectively in the desired direction. In [3], Huda and Yu described a two-stage strategy for controlling the cylindrical rod: the first stage generates the desired trajectory, whereas the second performs closed-loop control of the inner mass for the given desired trajectory using a partial feedback linearization approach. An adaptive trajectory tracking control method for a vibro-driven capsule system has been described by Liu et al. [13]. An auxiliary input control variable establishing a non-collocated feedback loop is constructed to cope with the parametric uncertainties, and the proposed approach is compared with the classical one by means of a closed-loop feedback-tracking control system. Improvements to this method have been shown by Liu et al. [15]. This novel approach adds a neural network approximator and a robust compensator to the auxiliary control variable. The proposed design method with multi-layer neural networks and a variable structure strategy, as well as an adaptive tracking control scheme, copes well with uncertainties such as a priori unknown parameters, approximation errors and disturbances [15]. In [16], Zarychta et al. described a novel Fourier series-based method for open-loop optimal control estimation, applicable to discontinuous systems such as capsule drives. In [17], the problem of a capsule robot crossing a circular fold has been discussed by Yan et al.; path-following techniques have been utilized and the COCO software has been used in the numerical studies.

Liao et al. [18] described the speed optimization of the self-propelled capsule robot [9] in a varying frictional environment between the device and its supporting surface, using Six Sigma and Multi-Island genetic algorithms and the Monte Carlo approach for validation. In the literature one can find other interesting examples of multi-objective optimization with Six Sigma as a controller for the genetic algorithm [19, 20] and its reliability analysis with the use of a Monte Carlo algorithm [21, 22].

There is seemingly little research on the application of neural networks to the optimal control of capsule drives or similar systems. In principle, a direct application of the Reinforcement Learning technique [23,24,25] could be used to obtain an approximation of the optimal closed-loop controller; however, such an approach would require considerable time and computational resources [23,24,25]. Therefore, in this work we propose a simpler option. Provided that the optimal open-loop control is determined, a neural network can be used to establish the dependence between the controller’s output and the corresponding states of the controlled system. In such a manner, a closed-loop controller can be obtained whose action reflects the open-loop optimal control.

The aim of this study is to test and evaluate the aforementioned concept. For this purpose, an approximation of the open-loop optimal control of a pendulum capsule drive is obtained by means of the Fourier series-based method described in our previous work [16]. Then, a neural network is used to determine the dependency between the output value of the optimized open-loop controller and the corresponding states of the capsule system, yielding a closed-loop controller. Finally, the performance and robustness of the closed-loop neural controller are compared with the original open-loop one. The results show that the neural controller maintains the efficiency of the original and offers greater robustness against uncertainty in the friction coefficient of the controlled system, which is one of the main limitations of open-loop controllers.

We believe that such a solution can be an interesting option in the design and optimization of controllers for mechanical systems, including discontinuous ones. Moreover, it is expected that the proposed method will facilitate the construction of closed-loop controllers for systems for which an optimal open-loop control is available.

2 Mathematical model

The subject of this research is the pendulum capsule drive. This section presents a brief description of the system along with an approximation of its open-loop optimal control. The information presented below forms the foundation for the new, neural-network-based closed-loop controller of the device. A scheme of the pendulum capsule drive is presented in Fig. 1.

Fig. 1
figure 1

Scheme of the capsule drive system. \(M\)—mass of the capsule, \(m\)—mass of the pendulum, \(l\)—length of the pendulum, \(\theta\)—pendulum angle, \(k\)—spring stiffness, \(c\)—damping coefficient, \({F}_{\theta }\)—external torque acting on the pendulum, \({F}_{x}\)—friction force, \(x\left(t\right), y(t)\)—coordinates of the capsule [16]

In the system under consideration, the propulsion of the capsule is caused by the interactions between the friction \({F}_{x}\), the horizontal inertial force \({R}_{x}\) produced by the swinging motion of the pendulum and the reaction (contact) force \({R}_{y}\) between the capsule and the underlying surface. The dynamics of the system is thoroughly described in [10,11,12]. In order to derive the equations of motion of the capsule, one can either directly apply Newton’s laws of motion or use the Lagrange approach. In the latter, the value of the constraint force \({R}_{y}\) can be determined with the aid of a Lagrange multiplier [26].

A detailed derivation of the equations of motion of the pendulum capsule drive can be found in our previous work [16]; therefore, only a brief explanation is presented here. In dimensional form, the equations of motion of the capsule are as follows:

$$\begin{array}{c}m{l}^{2}\ddot{\theta }\left(t\right)-ml\ddot{x}\left(t\right)\mathrm{cos}\theta \left(t\right)=mgl\mathrm{sin}\theta \left(t\right)-k\theta \left(t\right)-c\dot{\theta }\left(t\right)+{F}_{\theta }\left(t\right)\end{array}$$
(1a)
$$\begin{array}{c}\left(M+m\right)\ddot{x}\left(t\right)-ml\ddot{\theta }\left(t\right)\mathrm{cos}\theta \left(t\right)+ml{\dot{\theta }}^{2}\left(t\right)\mathrm{sin}\theta \left(t\right)=-{F}_{x}(t)\end{array}$$
(1b)
$$\begin{array}{c}{R}_{y}(t)=\left(M+m\right)g-ml\ddot{\theta }\left(t\right)\mathrm{sin}\theta \left(t\right)-ml{\dot{\theta }}^{2}\left(t\right)\mathrm{cos}\theta \left(t\right)\end{array}$$
(1c)

where \(g\) is the gravitational acceleration and all the other symbols are described in the caption below Fig. 1. Then, the following non-dimensional quantities are used:

$$\begin{aligned} & \Omega =\sqrt{\frac{g}{l}} , \tau =\Omega t, \gamma =\frac{M}{m}, z=\frac{x}{l}, \rho =\frac{k}{m{\Omega }^{2}{l}^{2}}, \nu =\frac{c}{m\Omega {l}^{2}},\\ & {f}_{z}=\frac{{F}_{x}}{m{\Omega }^{2}l}, u=\frac{{F}_{\theta }}{m{\Omega }^{2}{l}^{2}}, {r}_{z}=\frac{{R}_{x}}{m{\Omega }^{2}l}, {r}_{y}=\frac{{R}_{y}}{m{\Omega }^{2}l}\end{aligned}$$
(2)

where \(t\) and \(\tau\) correspond to dimensional and dimensionless time, respectively. Relations between derivatives with respect to \(t\) and \(\tau\) are as follows.

$$\begin{aligned}&\dot{x}=\frac{dx}{dt}=\frac{dx}{d\tau }\frac{d\tau }{dt}=\Omega \frac{dx}{d\tau }=\Omega {x}^{\prime}, \ddot{x}=\frac{{d}^{2}x}{d{t}^{2}}\\ &=\frac{d}{dt}\left(\frac{dx}{dt}\right)=\frac{d}{d\tau }\left(\Omega \frac{dx}{d\tau }\right)\frac{d\tau }{dt}={\Omega }^{2}\frac{{d}^{2}x}{d{\tau }^{2}}={\Omega }^{2}{x}^{\prime\prime}\end{aligned}$$
(3)

Using the symbols and notation defined in formulas (2) and (3), the equations of motion of the pendulum capsule drive can be presented in the following dimensionless matrix form.

$$\begin{array}{c}\left[\begin{array}{cc}1& -\mathrm{cos}\theta \left(\tau \right)\\ -\mathrm{cos}\theta \left(\tau \right)& \gamma +1\end{array}\right]\left[\begin{array}{c}{\theta }^{\prime\prime}\left(\tau \right)\\ {z}^{\prime\prime}\left(\tau \right)\end{array}\right]=\left[\begin{array}{c}\mathrm{sin}\theta \left(\tau \right)-\rho \theta \left(\tau \right)-\nu {\theta }^{\prime}\left(\tau \right)+u\left(\tau \right)\\ -{\theta }^{\prime 2}\left(\tau \right)\mathrm{sin}\theta \left(\tau \right)-{f}_{z}\left(\tau \right)\end{array}\right]\end{array}$$
(4)

Further dimensionless quantities are the contact force \({r}_{y}\), the resultant horizontal load due to the pendulum’s motion \({r}_{z}\), and the dimensionless Coulomb friction \({f}_{z}\), which are described by the following equations:

$$\begin{array}{c}{r}_{y}\left(\tau \right)=\left(\gamma +1\right)-{\theta }^{\prime\prime}\left(\tau \right)\mathrm{sin}\theta \left(\tau \right)-{\theta }^{\prime 2}\left(\tau \right)\mathrm{cos}\theta \left(\tau \right)\end{array}$$
(5)
$$\begin{array}{c}{r}_{z}\left(\tau \right)={\theta }^{\prime\prime}\left(\tau \right)\mathrm{cos}\theta \left(\tau \right)-{\theta }^{\prime 2}\left(\tau \right)\mathrm{sin}\theta \left(\tau \right)\end{array}$$
(6)
$${f}_{z}\left(\tau \right)=\left\{\begin{array}{l}\mu {r}_{y}\left(\tau \right)\mathrm{sgn}\left[{z}^{\prime}\left(\tau \right)\right]\leftrightarrow {z}^{\prime}\left(\tau \right)\ne 0\\ \mu {r}_{y}\left(\tau \right)\mathrm{sgn}\left[{r}_{z}\left(\tau \right)\right]\leftrightarrow {z}^{\prime}\left(\tau \right)=0\wedge \left|{r}_{z}\left(\tau \right)\right|\ge \mu {r}_{y}\left(\tau \right)\\ {r}_{z}\left(\tau \right)\leftrightarrow {z}^{\prime}\left(\tau \right)=0\wedge \left|{r}_{z}\left(\tau \right)\right|<\mu {r}_{y}\left(\tau \right)\end{array}\right.$$
(7)

where \(\mu\) is the friction coefficient. Equations (4)–(7) form the complete dimensionless model of the pendulum capsule drive presented in Fig. 1.
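
For illustration, the model (4)–(7) can be integrated numerically as sketched below. This is our own minimal sketch (not the implementation used in [16] or in the repository [35]); the stick/slip velocity tolerance, the event-free integration and the function names are assumptions made for brevity, and `u_func(tau, y)` stands for any control law returning the dimensionless torque.

```python
import numpy as np
from scipy.integrate import solve_ivp

MU, RHO, NU, GAMMA = 0.3, 2.5, 1.0, 10.0          # parameter set used later in this section

def rhs(tau, y, u_func):
    """Right-hand side of the dimensionless model (4)-(7); y = [theta, theta', z, z']."""
    th, th_d, z, z_d = y
    u = u_func(tau, y)
    s1 = np.sin(th) - RHO * th - NU * th_d + u     # right-hand side of the pendulum row of (4)

    # trial "stick" solution: z'' = 0, hence theta'' = s1 from the first row of (4)
    r_y = (GAMMA + 1.0) - s1 * np.sin(th) - th_d**2 * np.cos(th)   # eq. (5)
    r_z = s1 * np.cos(th) - th_d**2 * np.sin(th)                   # eq. (6)

    eps = 1e-6                                     # crude stick/slip velocity tolerance (assumption)
    if abs(z_d) < eps and abs(r_z) < MU * r_y:     # third branch of (7): the capsule sticks
        return [th_d, s1, z_d, 0.0]

    # slip (or slip onset): f_z = mu * r_y * sgn(.), with r_y itself depending on theta'',
    # so the friction term is folded into the 2x2 linear system obtained from (4)-(5)
    s = np.sign(z_d) if abs(z_d) >= eps else np.sign(r_z)
    A = np.array([[1.0, -np.cos(th)],
                  [-np.cos(th) - MU * s * np.sin(th), GAMMA + 1.0]])
    b = np.array([s1,
                  -th_d**2 * np.sin(th) - MU * s * ((GAMMA + 1.0) - th_d**2 * np.cos(th))])
    th_dd, z_dd = np.linalg.solve(A, b)
    return [th_d, th_dd, z_d, z_dd]
```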

Our previous paper [16] describes a numerical method that enables approximation of the optimal control in the form of a finite number of Fourier series terms (8).

$$u\left(\tau \right)=\frac{{a}_{0}}{2}+\sum_{k=1}^{K}{a}_{k}\mathrm{cos}\left(k\omega \tau \right)+\sum_{k=1}^{K}{b}_{k}\mathrm{sin}\left(k\omega \tau \right)$$
(8)

The detailed description of the method is beyond the scope of the current paper. Therefore, only the main idea is briefly explained here.

Using a finite number of the Fourier expansion harmonics (8), any periodic, bounded, piecewise continuous function can be approximated, and the approximation accuracy increases with the number of harmonics \(K\). Therefore, Eq. (8) parametrizes the control function with \(2K+3\) parameters: \({a}_{0},{a}_{1},\dots ,{a}_{K},{b}_{1},\dots ,{b}_{K},\omega ,K\). However, such a parametrization cannot be used directly to transform the optimal control problem into a nonlinear programming problem, i.e., it is not possible to simply optimize the parameters \({a}_{i},{b}_{i}\) of formula (8). The reason is that the control function in all practical applications is bounded, and there is no direct method to transform the restrictions imposed on the control function into limits on the values of the parameters \({a}_{i},{b}_{i}\). The solution to this problem is explained in Fig. 2. One can notice that if the coefficients of subsequent harmonics are stacked in a vector \([{a}_{1},{b}_{1},\dots ,{a}_{K},{b}_{K}]\), then the “span” of the resulting function depends on the length of this vector, whereas its “shape” changes with the direction of the vector. The direction of a vector in \({\mathbb{R}}^{2K}\) can be described by a point on a unit sphere of dimension \((2K-1)\), which is specified by \(2K-1\) spherical coordinates. The location of the control function span within the frame specified by the set of allowable controls can then be described by two additional numbers from the interval \([0,1]\). Such a transformation parametrizes the Fourier expansion (8) in terms of \(2K-1\) spherical coordinates that specify the shape of the optimized function and 2 parameters governing its span, all of which lie in fixed ranges and can therefore be optimized, together with the parameter \(\omega\), which influences the period and can be either optimized or selected beforehand, and the number of harmonics \(K\), which has to be selected a priori. For more details, please refer to [16].

Fig. 2
figure 2

Scheme of the Fourier series based method for optimal control approximation

Using the algorithm described above, the control of the system (4)–(7) has been optimized with respect to the distance covered by the capsule within the dimensionless time interval \(\tau \in \left[0,100\right]\). In the optimization process, the following values of the system parameters have been assumed: \(\mu =0.3, \rho =2.5, \nu =1.0, \gamma =10\). Moreover, it has been required that the control \(u(\tau )\) remain in the allowable range \([-4,4]\). Under these assumptions, taking \(K=5\) harmonics in formula (8), the following parameters of the approximate open-loop optimal control have been obtained:

$$\begin{aligned}& {a}_{0}\approx 1.62506, \omega \approx 1.64722\\ & \left({a}_{1}, {a}_{2}, {a}_{3},{a}_{4},{a}_{5}\right)\approx \left(-3.43222, -1.95285, -0.68182, 0.38493, 0.17389\right)\\ & \left({b}_{1}, {b}_{2}, {b}_{3},{b}_{4},{b}_{5}\right)\approx \left(-0.41690, 0.12411, -0.10468, 0.13722, 0.27902\right)\end{aligned}$$
(9)

The control function (8) with parameters (9) and the resulting trajectory of the capsule’s motion are presented in Sect. 4. These results are the starting point for the current research. In the remaining part of this work, we show that a neural network can learn from the open-loop solution (8) in order to form a closed-loop controller. Moreover, it will be demonstrated that, counterintuitively, such a neural network can outperform the original solution which served as its training set (see Sect. 4).
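
For completeness, the open-loop control (8) with the coefficients (9) can be evaluated directly, as in the short sketch below, which reuses the `rhs` routine sketched earlier in this section. The function name, the zero initial conditions and the fixed maximum integration step are our own assumptions.

```python
import numpy as np
from scipy.integrate import solve_ivp

A0, OMEGA = 1.62506, 1.64722
A_K = np.array([-3.43222, -1.95285, -0.68182, 0.38493, 0.17389])   # a_1 ... a_5
B_K = np.array([-0.41690,  0.12411, -0.10468, 0.13722, 0.27902])   # b_1 ... b_5
k = np.arange(1, 6)

def u_open_loop(tau, y=None):
    """Approximate optimal open-loop control: formula (8) with the parameters (9)."""
    return A0 / 2 + np.sum(A_K * np.cos(k * OMEGA * tau)) + np.sum(B_K * np.sin(k * OMEGA * tau))

# example: simulate the open-loop-controlled capsule over tau in [0, 100]
sol = solve_ivp(lambda t, y: rhs(t, y, u_open_loop), (0.0, 100.0),
                [0.0, 0.0, 0.0, 0.0], max_step=1e-2)
print("distance covered:", sol.y[2, -1])
```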

3 Methods

In the following sections, the use of a Neural Network (NN) in the design of a closed-loop controller is presented. We base this process on the approximate solution of the open-loop optimal control problem, obtained with our Fourier series-based method [16]. The objective of the NN is to return the value of the optimized control for an arbitrary state of the controlled object (i.e., the pendulum capsule drive).

The research is divided into three stages. Within the preliminary research, we evaluate the performance of feedforward artificial neural network predictive models that approximate the dependencies between the optimized open-loop control and the corresponding state variables of the system. For this purpose, a multi-layer perceptron (MLP) has been created in Python; its architecture is described in Sect. 3.2. In this stage, we test the design of the MLP with regard to different activation functions for the hidden and output layers, along with varying numbers of neurons. The process of NN training and the criteria used for the model performance validation are described in Sect. 3.3.

In the second stage, the parameters of the NN predictive models with the top-scoring performance are implemented in the simulated controller of the pendulum capsule drive. In this way, we are able to calculate the distance covered by the capsule system driven by the neural controller.
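
The sketch below illustrates how such a neural controller can be embedded in the simulation; it is an illustrative wrapper of ours around a trained Keras model, not the implementation used in the study (the actual scripts are available in [35]). The network receives the current pendulum angle, pendulum velocity and capsule velocity and returns the control value.

```python
import numpy as np

def make_neural_control(model):
    """Wrap a trained NN model as a state-feedback control u(tau, y) compatible with rhs()."""
    def u_neural(tau, y):
        theta, theta_d, _, z_d = y
        x = np.array([[theta, theta_d, z_d]])        # pendulum angle, pendulum velocity, capsule velocity
        return float(model.predict(x, verbose=0))    # predicted control value
    return u_neural

# usage: sol = solve_ivp(lambda t, y: rhs(t, y, make_neural_control(model)),
#                        (0.0, 100.0), [0.0, 0.0, 0.0, 0.0], max_step=1e-2)
```

Calling the model inside the integration loop is slow; an equivalent and faster option is to extract the trained weights and evaluate formula (10) of Sect. 3.2 directly.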

In the last part, the robustness of both controllers (the optimized open-loop and the neural closed-loop one) with respect to a varying coefficient of friction between the capsule and the underlying surface is examined. The performance of the controllers is tested in different conditions, from a constant friction coefficient to large variations of this parameter.

To facilitate the comprehension of the presented method, all the above-described steps are performed in an Appendix using a much simpler example—a mathematical pendulum. We encourage the reader to analyze this additional material. Moreover, all the scripts created within the aforementioned stages of this study are available in a reference data repository [35].

3.1 The reference dataset

In this research, the reference dataset consists of the trajectory of the pendulum capsule system together with its optimized control, calculated according to the Fourier series-based algorithm presented in Sect. 2.

The reference dataset (see Fig. 3) consists of approximately 10,000 control samples stored in six columns, containing the system state variables \({x}_{0}-{x}_{3}\), the optimized open-loop control \(u\) and the dimensionless time \(\tau\). Each row of the dataset represents one sample of the reference control. Its inputs, called features, are stored as columns and refer to: \({x}_{0}\)—position of the pendulum \(\theta\), \({x}_{1}\)—velocity of the pendulum \(\dot{\theta }\), \({x}_{2}\)—position of the capsule \(z\), and \({x}_{3}\)—velocity of the capsule \({z}^{\prime}\). Since the control has to be independent of the capsule position and of time, the variable \({x}_{2}\) and the dimensionless time \(\tau\) were dropped from the learning process. Therefore, we consider three features, i.e., state variables of the system (see formulas (4)–(7)), and the corresponding control \(u\) as the target variable to be predicted by the NN.

Fig. 3
figure 3

The reference dataset for NN training
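
A minimal sketch of this preparation step is shown below; the file and column names are assumed to match Fig. 3, and the actual scripts are available in [35].

```python
import pandas as pd

# reference dataset: ~10,000 samples of the optimized open-loop control (Fig. 3)
data = pd.read_csv("reference_dataset.csv")      # assumed columns: x0, x1, x2, x3, u, tau

# the control must be independent of the capsule position x2 and of the time tau
X = data[["x0", "x1", "x3"]].to_numpy()          # features: theta, theta', z'
y = data["u"].to_numpy()                         # target: optimized open-loop control
```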

3.2 The Neural Network architecture

The design of the NN architecture demands a few key decisions, for instance selecting the number of layers, the number of neurons and the activation functions. The number of hidden layers depends on the problem being dealt with. In this research we fix one hidden layer, as this configuration is sufficient to approximate an arbitrary continuous function [27, 28], such as the optimal control (the target of our NN prediction). Moreover, we want to provide a simple method that does not require the design of a deep NN model.

Another problem to be solved is the number of neurons. This should be chosen in a way that avoids under- or overfitting. In the first case, with too few neurons to train, the results may not be satisfactory or adequate to what has been expected. On the other hand, with too many neurons, the artificial neural network possesses an excessive number of parameters to be determined, which makes it “remember” each data point separately and lose the generalization property. Many rules of thumb exist for establishing the correct number of neurons in a hidden layer [27]. We decided to test one of them, in which the number of hidden neurons is calculated as 2/3 of the size of the input layer plus the size of the output layer. In total, five different numbers of neurons (3, 5, 10, 30, and 50) are tested. The first value, equal to 3, is based on the rule of thumb, whereas the other four are chosen arbitrarily, based on the authors’ experience.

The activation function can be imagined as a filter that processes the values passing through it and scales the output into the proper range. Without any activation function, the NN could learn only linear transformations. One exemplary activation function used for the hidden layer is the Rectified Linear Unit (ReLU), considered one of the most efficient due to its good resistance to vanishing gradients [29]. In this research, we also use the sigmoid (logistic) and hyperbolic tangent functions. The latter is often preferred, since its gradients are not restricted to a single sign [30]. Moreover, when the output returned by the sigmoid function is close to zero, caused by highly negative inputs, the training of the neural network slows down and the probability of getting stuck in a local minimum is higher [31]. For the output layer, the default option is a linear function (commonly used for regression problems). In this study, we test both the linear and the sigmoid function.

Having established the architecture of the NN model, including the different numbers of neurons as well as the activation functions for the hidden and output layers, we propose a default MLP model with n hidden neurons, shown in Fig. 4.

Fig. 4
figure 4

Multi-layer perceptron (MLP) architecture, where: n—total number of neurons in the hidden layer, \({x}_{0},{x}_{1},{x}_{3}\)—system state variables and input activations for the hidden layer, \(b,{b}_{1},{b}_{2},\dots {b}_{n}\)—biases of the neurons in the hidden layer, \({z}_{1}^{h},{z}_{2}^{h},\dots\) \({z}_{n}^{h}\)—values of the neurons in the hidden layer, \({a}_{1}^{out},{a}_{2}^{out},\dots {a}_{n}^{out}\)—input activations for the output layer, \(u\)—output (control)

The proposed MLP consists of three layers: one input, one hidden, and one output layer, each fully connected to the previous one via weight coefficients. The input layer receives the three features \({x}_{0},{x}_{1}\), and \({x}_{3}\), referring to the position of the pendulum \(\theta\), the velocity of the pendulum \(\dot{\theta }\), and the velocity of the capsule \({z}^{\prime}\), respectively. In the (single) hidden layer, consisting of \(n\) neurons, the activation function \(f\) is applied, being one of the following: ReLU, sigmoid, or hyperbolic tangent. The output layer consists of one neuron representing the control \(u\) of the pendulum capsule drive system, which is the target of the NN model predictions. For the output neuron, either the linear or the sigmoid activation function is applied.

Since the aim of the NN is to learn the relationship between the features (system state variables) and the target data (open-loop optimal control), formula (10) is introduced. It presents the way of calculating the output value of a given layer (here, the hidden layer) with the applied activation function \(f\), thus providing the input for the next layer (here, the output layer). This step is repeated until the last connection between a layer and the target is reached.

$${a}_{i}^{out}=f({z}_{i}^{h})=f\left({b}_{i}^{h}+{\sum }_{j\in \{\mathrm{0,1},3\}}{x}_{j}{w}_{ij} \right)$$
(10)

In formula (10), the following notation has been adopted: \(i\)—index of the current neuron in the hidden layer, \(i\in \{1, \dots , n\}\), \(n\)—total number of neurons in the hidden layer, \(j\)—index of the current neuron in the input layer, \(j\in \{0, 1, 3\}\), \(h\)—hidden layer, \(f\)—the activation function applied in the hidden layer (e.g., ReLU, sigmoid, tanh), \({z}_{i}^{h}\)—value of the current neuron in the hidden layer, \({b}_{i}^{h}\)—bias of the current neuron in the hidden layer, \({x}_{j}\)—input of the hidden layer, represented by the three system state variables, \({a}_{i}^{out}\)—current input for the output layer, \({w}_{ij}\)—weight representing the connection between the \(j\)th input and the \(i\)th neuron in the hidden layer.
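
For clarity, formula (10) together with the output neuron amounts to a few lines of code; the sketch below is an illustrative forward pass of ours (the argument layout and the default tanh/linear activations are assumptions).

```python
import numpy as np

def mlp_forward(x, W_h, b_h, w_out, b_out, f=np.tanh, f_out=lambda s: s):
    """Forward pass of the single-hidden-layer MLP of Fig. 4.
    x     : input vector (x0, x1, x3)
    W_h   : (n, 3) hidden weights w_ij,  b_h: (n,) hidden biases b_i^h
    w_out : (n,) output weights,         b_out: scalar output bias
    f, f_out: hidden and output activation functions."""
    z_h = b_h + W_h @ x                  # z_i^h = b_i^h + sum_j x_j * w_ij, cf. formula (10)
    a_out = f(z_h)                       # a_i^out = f(z_i^h): inputs of the output layer
    return f_out(b_out + w_out @ a_out)  # predicted control u
```

Once the weights of a trained network are extracted, this routine is all that is needed to embed the neural controller in the capsule simulation (cf. Sect. 4).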

3.3 The Neural Network training process

Once the reference dataset (described in Sect. 3.1) is loaded and checked for missing values, the considered features (\({x}_{0},{x}_{1}\) and \({x}_{3}\)) are assigned to the vector \({\varvec{x}}\) and the optimized open-loop control \(u\) becomes the target variable. We split the dataset randomly into separate training and test sets. It is worth noting that the smaller the test set, the more inaccurate the estimation of the generalization error becomes [31]. Here, we use 80% of the samples to fit the model and the remaining 20% of unseen data for the performance evaluation. Dataset shuffling is additionally applied in this step in order to obtain representative training and test sets. It also means that the created model is not determined by the data order. Moreover, shuffling prevents getting stuck in cycles during the cost function optimization [31].

The way the loss gradients are used to update the parameters of the NN is specified by the optimizer [23]. In this research, we use the adaptive moment estimation (Adam) optimizer, which combines the advantages of the AdaGrad [32] and RMSProp [33] methods. The first deals efficiently with sparse gradients, whereas the second works well in online and non-stationary settings and resolves some problems of the first. More precise connections between these methods and the Adam optimizer are described in [34]. In the chosen algorithm, the hyperparameters have an intuitive interpretation and typically do not require any tuning. Individual adaptive learning rates are calculated for different parameters based on estimates of the first and second moments of the gradient [34].

We train each of the MLP models for up to 1000 epochs, until the lowest generalization error is achieved. Then, the calculated weights and biases are used to predict the target variable on the unseen data. To reduce the risk of random results, the training is repeated three times.

Figure 5, showing the changes in the loss function for the training and test sets, indicates that the algorithm reaches convergence at the 114th epoch; beyond that threshold, overtraining appears. To avoid this phenomenon, the loss values are monitored with the EarlyStopping class and the training process is stopped once they remain at the same level for ten epochs.

Fig. 5
figure 5

The loss function value in 1000 epochs for the test set

The performance of the model is evaluated with regard to the loss function value and the data fit on the test dataset. The first is measured with the Mean Squared Error (MSE) metric, which equals the average value of the Sum of Squared Errors cost function minimized to fit the model [31]. The MSE is calculated according to the following equation:

$$MSE=\frac{1}{n}\sum_{i=1}^{n}{\left({y}^{i}-{\widehat{y}}^{i}\right)}^{2}$$
(11)

where \(n\) is the number of samples, \({y}^{i}\) is the true value of the \(i\)th sample, and \({\widehat{y}}^{i}\) is its predicted value.

The data fit, i.e., how far the predicted values are from the original ones, is described using the coefficient of determination (R2 score), which shows the fraction of the response variance captured by the model [31].
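
The training and evaluation procedure described in this subsection can be summarized in the following sketch, written here with Keras and scikit-learn as an illustration; the batch size, random seed and restoring of the best weights are assumptions not specified in the text, the configuration shown (50 ReLU neurons with a linear output) is one of the thirty variants tested in Sect. 4, and \(X\), \(y\) are prepared as in Sect. 3.1. The scripts actually used in the study are available in [35].

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, r2_score
from tensorflow import keras

# 80/20 shuffled split of the reference dataset
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, shuffle=True, random_state=0)

# single-hidden-layer MLP: here 50 ReLU neurons and a linear output neuron
model = keras.Sequential([
    keras.Input(shape=(3,)),
    keras.layers.Dense(50, activation="relu"),
    keras.layers.Dense(1, activation="linear"),
])
model.compile(optimizer="adam", loss="mse")

# up to 1000 epochs, stopped early once the loss stalls for 10 epochs
early_stop = keras.callbacks.EarlyStopping(monitor="val_loss", patience=10,
                                           restore_best_weights=True)
model.fit(X_train, y_train, epochs=1000, validation_data=(X_test, y_test),
          callbacks=[early_stop], verbose=0)

# evaluation on the unseen test set: MSE, eq. (11), and the R^2 score
y_pred = model.predict(X_test, verbose=0).ravel()
print("MSE:", mean_squared_error(y_test, y_pred))
print("R^2:", r2_score(y_test, y_pred))
```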

4 Numerical results

The performance of the NN predictive models has been analyzed for 30 different configurations of the MLP presented in Sect. 3.2. The analysis included various activation functions for the hidden and output layers, along with changing numbers of neurons; the reasons for their selection have been given in Sect. 3.2. The performance was measured with the MSE and R2 metrics, providing information about the generalization error and the data fit, respectively. The results are presented in Table 1.

Table 1 The NN predictive models’ performance results concerning the various activation functions for the hidden and output layers and the different numbers of neurons

In the second part of the research we consider only the NN models with R2 equal to 0.999 and MSE equal to 0.0001 or 0.0002. Nine different configurations (bolded in Table 1) reach this level. It is worth noting that the highest scores are mainly obtained for the NN models with the sigmoid or hyperbolic tangent activation function in the hidden layer and the sigmoid in the output layer. Moreover, the top-scoring results with this combination of activation functions appear from 10 neurons upwards, without significant changes as the number of neurons increases further. On the other hand, the rule of thumb tested in this study does not give satisfactory results, which could be related to the specific character of the considered system with its small number of inputs and outputs. The NN training lasted 100 epochs on average; however, in this study we do not focus on the training time criteria.

The linear relationship between the reference and predicted data, with an R2 coefficient equal to 0.999, is presented in Fig. 6. The results show a satisfactory level of model fit for the top-scoring configuration, consisting of 50 neurons with the ReLU and linear activation functions in the hidden and output layers, respectively.

Fig. 6
figure 6

The linear relationship between the learning and predicted data along with the R2 coefficient value equal to 0.999 for the NN model consisting of 50 neurons and the applied ReLU and linear activation function for the hidden and output layer, respectively

The parameters of the nine top-scoring NN models achieved in this stage are subsequently implemented in the pendulum capsule drive controller. Then, the distance covered by the system in the dimensionless time interval is calculated and presented in Table 2.

Table 2 Distance covered by the pendulum capsule drive for the NN models with the highest performance scores obtained in the preliminary research along with their structure, i.e., the hidden and output layer activation functions, number of neurons, the data fit and loss function scores

The analysis of the distance covered by the pendulum capsule drive in the assumed dimensionless time interval revealed that the highest score is obtained for the NN model consisting of 50 neurons in the hidden layer with the ReLU activation function and a linear output. The achieved result is equal to 6.135, a 1.16% higher performance than that of the open-loop control (6.065). The scores of the other NN models differ insignificantly from the reference open-loop optimal control and are not considered further in this study. Nevertheless, it is worth observing that all but one provide a final displacement that is systematically larger than the reference value of 6.065, although the increment is minor.

The optimized control, along with the distance covered by the pendulum capsule drive in the dimensionless time interval, is presented in Fig. 7 for both controllers (open-loop and neural closed-loop).

Fig. 7
figure 7

The optimized control (a) with the corresponding distance covered by the pendulum capsule drive (b) for the open-loop and NN controller vs. dimensionless time (τ)

4.1 Perturbations in the system

The controllers’ robustness against the variation of parameters of the system has been tested by introducing a variable friction coefficient between the capsule and the underlying surface. It has been assumed that the actual friction coefficient at each point is a uniformly distributed random value:

$$\begin{array}{c}{\mu }_{r}\in \left[\mu -\Delta , \mu +\Delta \right] \end{array}$$
(12)

where \(\mu\) is the nominal, assumed value of the friction coefficient and \(\Delta\) is its maximal absolute deviation. A new value of \({\mu }_{r}\) has been drawn for each interval of the capsule’s path of motion of dimensionless length 0.1.
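
A sketch of this perturbation scheme is given below (our illustrative code; the random number generator, the seed handling and the way the piecewise-constant friction is passed to the simulation are assumptions).

```python
import numpy as np

def make_random_mu(mu=0.3, delta=0.1, seed=None):
    """Piecewise-constant random friction: a new mu_r ~ U[mu - delta, mu + delta], eq. (12),
    is drawn for every interval of the capsule path of dimensionless length 0.1."""
    rng = np.random.default_rng(seed)
    drawn = {}                                    # interval index -> drawn value of mu_r
    def mu_of_z(z):
        idx = int(np.floor(z / 0.1))
        if idx not in drawn:
            drawn[idx] = rng.uniform(mu - delta, mu + delta)
        return drawn[idx]
    return mu_of_z

# sketch of the robustness sweep described below (assuming a simulate() helper that accepts
# a control law and a position-dependent friction coefficient; the step size of Delta is assumed):
# for delta in np.arange(0.0, 0.21, 0.01):
#     dists = [simulate(u_func, make_random_mu(delta=delta)) for _ in range(3)]
#     print(delta, np.mean(dists), np.std(dists))
```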

The subject of the evaluation was the NN predictive model that achieved the largest distance covered by the pendulum capsule drive in the considered interval of dimensionless time τ. The perturbation parameter \(\Delta\) was varied from 0.00 to 0.20, thus changing the range of the effective friction coefficient. For each value of \(\Delta\), the test was repeated 3 times to reduce the risk of random results. The obtained mean scores, along with the corresponding standard deviations (SD), are presented in Table 3.

Table 3 The distance covered by the system in an environment with a varying friction coefficient range (Δ) for the open-loop and neural closed-loop controllers, the relative changes between the distance without and under perturbations for both controllers, and their comparative performance in the uncertain frictional environment, according to formulas (13) and (14)

To better understand the impact of the variation of the \(\Delta\) parameter on the distance covered by the pendulum capsule drive for the open-loop and neural closed-loop controllers, the relative changes have been calculated (see Table 3). In the first stage, each of the controllers has been evaluated with respect to the change between the distance without and under perturbations, according to the following formula:

$$\begin{array}{c}{\delta }_{C}= \frac{{z}_{p}-{z}_{0}}{{z}_{0}}\cdot 100\%\end{array}$$
(13)

where \({\delta }_{C}\)—the relative change for the open-loop or neural closed-loop controller (%), \({z}_{0}\)—the distance covered by the system for \(\Delta = 0\), \({z}_{p}\)—the distance covered by the system for a non-zero value of \(\Delta\).

The known values of the distance covered by the pendulum capsule drive in the uncertain frictional environment allow a performance comparison of both controllers in these conditions. The relative change of the distance under perturbations between the open-loop and neural closed-loop controllers has been calculated as follows:

$$\begin{array}{c}{\delta }_{P}= \frac{{z}_{NN}-{z}_{OL}}{{z}_{OL}}\cdot 100\% \end{array}$$
(14)

where \({\delta }_{P}\)—the relative change between the open-loop and neural closed-loop controller (%), \({z}_{OL}\)—the distance covered by the system under the perturbations, for the open-loop controller, \({z}_{NN}\)—the distance covered by the system under the perturbations, for the neural closed-loop controller.

Please note that the parameters of the open-loop controller were calculated according to formula (8) and are given in (9), whereas those of the NN controller originate from the computation performed in the second stage of this research, according to formula (10).

The distances covered by the pendulum capsule drive in the uncertain frictional environment, presented in Table 3, show that the NN controller is more resistant to friction changes than the open-loop controller. The first significant decrease in the distance appears much earlier for the open-loop controller, starting from \(\Delta = 0.07\), whereas for the NN controller it appears only at \(\Delta = 0.10\)–\(0.13\). The higher the value of \(\Delta\), the more noticeable the difference. A close-up of the covered distance for the open-loop and NN controllers is presented in Fig. 8.

Fig. 8
figure 8

Distance covered by the pendulum capsule drive in the uncertain frictional environment vs. the open-loop and neural network controller

The impact of the varying coefficient of friction on the distance covered by the pendulum capsule drive, calculated with the use of the relative changes, is presented in Figs. 9 and 10. The main observation is that the higher the value of \(\Delta\), the bigger the decrease in distance, especially for the open-loop control. It is worth noting that, as \(\Delta\) varies from 0 to 0.16, the relative change of the distance is of the order of 1% when the NN controller is applied. In contrast, the open-loop controller suffers a loss of efficiency four times larger for the same increase of \(\Delta\). The greatest change for both controllers appears at \(\Delta = 0.20\), resulting in a 10.25% and 9% decrease in the distance for the open-loop and NN controllers, respectively. While for large values of \(\Delta\) the performance of both controllers decreases (although to a different extent, as noted), it is interesting that, quite surprisingly, for very small values of \(\Delta\) the maximum distance increases with respect to the unperturbed case \(\Delta =0\), suggesting a kind of beneficial effect of the uncertainty in \(\mu\).

Fig. 9
figure 9

Relative changes between the distance without and under perturbations (Δ) for the open-loop and NN controller

Fig. 10
figure 10

The performance comparison of the open-loop and neural closed-loop controller in the environment with varying friction coefficient introduced by Δ parameter value

The comparison of the perturbed distances for both controllers shows that the neural network controller is more resistant to the changes occurring in an environment with a varying coefficient of friction; the most significant increase in performance is equal to +7%.

5 Summary and conclusions

In this study, a novel approach to the pendulum capsule drive control with the use of a neural network (NN) as a closed-loop controller is presented. The novelty in this research is the use of an optimized open-loop control function as the basis of the designed controller. The dependence between the output of the open-loop controller and the system state variables is determined by the NN.

One of the major aims of this research was to test and evaluate the robustness of the novel controller compared with the original open-loop control. Thus, the study was divided into three parts. In the preliminary research we created a default architecture of a NN predictive model that was tested for various configurations with different numbers of neurons as well as different activation functions for the hidden and output layers. Within this stage the performance of each model was measured with the R2 and MSE metrics, which allowed the selection of nine NN models with the highest correlation score between the reference and predicted data. The parameters of the NN models obtained in this training have been implemented in the simulated controller of the pendulum capsule drive to calculate the distance covered by the system in the dimensionless time interval. The largest distance was achieved for the NN model consisting of 50 neurons in the hidden layer with the ReLU activation function and a linear output.

Subsequently, the top-performing NN model was tested in the last part of the study, considering the robustness of the controller and its reliability in the uncertain frictional environment introduced by a varying friction coefficient between the capsule shell and the underlying surface. Without any perturbations, the neural network controller achieved a 1.16% higher performance than the open-loop one. The variation of the friction coefficient range proved the NN more resistant to the perturbations occurring in the system, with a maximum advantage of +7% over the open-loop controller. In fact, the change between the distance without and under perturbations grew much more slowly and remained at the 1% level much longer, whereas for the open-loop controller a steadily increasing loss was observed, exceeding some of the NN controller trial scores by at least a factor of 4.

The results presented in this study confirm that the NN controller works more efficiently than the original open-loop controller and exhibits a higher level of robustness in an environment where perturbations occur. It seems that the neural closed-loop controller could be an alternative to classic controllers in many mechanical applications, especially for non-smooth and discontinuous systems (such as the one considered in this work). Moreover, it could significantly simplify the design of closed-loop controllers where only the open-loop control is available.