Implementation of a Perceptual Controller for an Inverted Pendulum Robot

Perceptual Control Theory (PCT) theorizes that a creature’s behaviour is varied so that its perceptions can reach and maintain certain fixed limits, despite external disturbances. The distinguishing characteristic of PCT is that the controlled variables are the inputs (perceptions), as opposed to the system outputs. This paper presents the first direct comparison of a PCT controller for a mobile robot (a two-wheeled ‘inverted pendulum’ balancing robot) with a classical control method, the Linear Quadratic Regulator (LQR). Simulations and experimental validation results show that the performance of the PCT controller is comparable to that of the LQR controller and better at disturbance rejection.


Introduction
Humans, other animals and a variety of machines can be said to 'behave' [1]. Yet scientific theories that traverse the life, social and physical sciences are rare. One such theory is perceptual control theory (PCT) [2,3].
Developed by a medical physicist, William Powers, in the 1950s, it provides a functional architecture of behavior based on the principle that 'behavior is the control of perception'. Powers explained that for living mechanisms, the presence of individual purpose leads to control. They control neither their behaviour nor external environmental variables. Instead, living mechanisms control their perceptions of those environmental variables when an internal motivation appears [4].
Its basic components map onto the negative feedback system utilized ubiquitously within artificial control systems and within existing models of behavior [5]. These systems assume that the reference value (or goal state) for the controlled variable is input to the system by an outside user, in the same way as a user sets a thermostat.
In contrast, PCT explicitly places the reference value within the control system. By doing so, the theory models the system from the inside and permits the system to attempt to control its own sensory input by comparing the current sensory signal to its internally specified reference value, and acting against disturbances in the environment to reduce this difference.
Powers utilized this basic scheme to account for complex behavior by proposing that the reference values are set by a cascade of downward signals from higher level units. A simplified 'real-life' example can illustrate this. In order to successfully sip from a cup of cold water (perception = temperature of lips; reference = cold), this level of the system needs to vary the desired transition in joint angle of the wrist, which is in turn managed by systems at successively lower levels. Only the lowest level in the hierarchy interfaces with the body and environment. Here it varies actions to control the current intensity of sensory input [2]. An extended 'real-life' animated example of PCT hierarchical control is provided at https://youtu.be/kE3EHvrpU7g. Whilst the hierarchical structure of both the nervous system and machine architectures is well founded [6,7], the specification of desired (rather than predicted) sensory inputs by downward signals appears unique to PCT.
PCT proposes that a creature's behaviour is varied so that its perceptions can reach and maintain certain fixed limits, despite external disturbances. Control in living systems can be thought of as a process of continually adapting behaviour so that perceptions remain consistent with the internal system purpose [8]. Figure 1 presents a schematic diagram which shows a single hierarchical level in Perceptual Control Theory.
The distinguishing characteristic of perceptual control theory is that the controlled variables are the inputs (perceptions), as opposed to the system outputs (behavioural actions) [9]. When regarding PCT in the control engineering domain, the controlled perceptions are generally sensed or transformed functions of the system states, which are affected by the system behaviours.
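The single-level loop described above can be illustrated with a minimal sketch. It is not taken from the paper: the environment (perception equals output plus disturbance), the integrating output function, and the names `run_control_unit`, `gain` and `disturbance` are all assumptions chosen for illustration.

```python
def run_control_unit(reference, disturbance, gain=50.0, dt=0.01, steps=2000):
    """Simulate one perceptual control unit acting on a toy environment.

    Assumed environment: the perception is simply the unit's own output
    plus an external disturbance that the unit never measures directly.
    """
    output = 0.0
    perception = 0.0
    for k in range(steps):
        # Input function: sense the controlled quantity.
        perception = output + disturbance(k * dt)
        # Comparator: difference between internal reference and perception.
        error = reference - perception
        # Output function (assumed here to be a pure integrator).
        output += gain * error * dt
    return perception, output


# A step disturbance of 3 units arrives at t = 5 s.
perception, output = run_control_unit(10.0, lambda t: 3.0 if t >= 5.0 else 0.0)
```

In this toy case the perception returns to the reference after the disturbance, while the output settles at whatever value cancels the disturbance; the unit controls what it senses, not what it does.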
The principal advantage of controlling the input is that the PCT control system uses the current environment (commonly known as the plant in control engineering) as a fundamental component of the control loop. Therefore, internal modelling of the environment to predict the 'correct response' is not required. Consequently, the PCT system has the capability to achieve and maintain control despite unknown disturbances, including managing the inevitable disturbances that result from the physical components of the living organism (or robot) itself. The extension of the core PCT concept to a hierarchy of autonomous, inter-dependent units, each controlling its own degree of freedom, indicates a significant advantage over traditional approaches: advanced behavioural systems could be composed merely of simple, nescient control units, negating the need to define complex models of world dynamics. The PCT model also involves a form of random-walk learning algorithm known as reorganization to further optimize performance, although this is not utilized in the current study. Each of these properties implies that a PCT controller is, in theory, more robust and adaptive than traditional control methods.
To date, empirical support for PCT has come largely from constructing and testing computational models of human performance against behavioural data. These models have shown close fits across a range of experimental contexts [10,11]. There have also been a number of examples of successful applications of PCT to robotics, including a real-world autonomous robotic rover [12], control of a robotic arm [13] and a hexapod simulation [14]. The autonomous rover showed robust rejection of disturbances of various kinds. For example, it continuously monitored the degree of wheel movement to match it with the current reference value, and would continue to modify the current output to the wheel motor to keep to this goal, regardless of any imperfections in the transmission system. Despite the above advances, three gaps remain. First, no studies have directly compared the validity of a PCT model with a competing, widely used classical control approach for the same activity. Second, no studies have made this comparison using a robotic device. Third, no studies have systematically compared robustness against experimentally applied disturbances.
The benchmark of testing a theoretical model using a robot is particularly high, because it immediately establishes the real-world validity in that it permits the physics and kinematics of behavior to be assessed. The comparison vehicle chosen for this study is the inverted pendulum robot, which is a popular benchmark for control theory comparisons [15].

Inverted Pendulum Control
Figure 2 shows a diagram of an inverted pendulum on a cart. The system contains a rolling cart and a bob, which is connected to a frictionless hinge on the cart through a weightless shaft [16]. For simplicity, each component is treated as a point mass, the motion of the cart is constrained to the horizontal direction and the pendulum is rigid.
This system is inherently unstable when the pendulum is suspended above the horizontal, as the pendulum will fall over due to gravitational effects without additional control effort [17]. The role of a control system is to apply a control force to the cart in the 'x' direction to keep the pendulum upright. The ultimate goal of a control system is to find a fixed point at which the pendulum can be held indefinitely [18].
The inverted pendulum is often accepted as an appropriate model for a self-balancing vehicle [19]; its non-linearity and availability in the laboratory mean that inverted pendulum control can be viewed as a benchmark test for evaluating and comparing different control methods [20,21]. In this paper, the inverted pendulum test will be adopted to compare PCT to other contemporary control methods.

Simulation
Initially, an inverted pendulum model was simulated and two different control methods were applied: PCT and Linear Quadratic Regulator (LQR) control. This section presents the models, controllers and simulation results.

Inverted Pendulum Model
The inverted pendulum system model was taken from previous work [21]. The non-linear state-space model for the inverted pendulum system is described in Eq. 1.
The non-linear system dynamics were specified in Matlab (Matlab R2017a) and both the PCT controller and the LQR controller were reproduced in Simulink 8.9.
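For readers without access to Matlab, the following Python sketch shows what a non-linear cart-and-pendulum simulation of this kind might look like. It uses the common textbook equations for a point-mass bob on a massless rod, with the angle measured from upright and viscous cart friction; this is an assumption and may differ from the paper's Eq. 1. The parameter values are placeholders, not the values of Table 1.

```python
import math


def cart_pole_derivatives(state, force, M=1.0, m=0.1, L=0.5, g=9.81, mu=0.1):
    """Time derivative of [x, x_dot, theta, theta_dot].

    Assumed model: cart mass M, point-mass bob m on a massless rod of
    length L, theta measured from upright, viscous cart friction mu.
    """
    x, x_dot, theta, theta_dot = state
    s, c = math.sin(theta), math.cos(theta)
    x_ddot = (force - mu * x_dot + m * L * theta_dot ** 2 * s
              - m * g * s * c) / (M + m * s * s)
    theta_ddot = (g * s - x_ddot * c) / L
    return [x_dot, x_ddot, theta_dot, theta_ddot]


def rk4_step(state, force, dt):
    """Advance the state by one fourth-order Runge-Kutta step."""
    def shifted(k, scale):
        return [s + scale * ki for s, ki in zip(state, k)]

    k1 = cart_pole_derivatives(state, force)
    k2 = cart_pole_derivatives(shifted(k1, dt / 2), force)
    k3 = cart_pole_derivatives(shifted(k2, dt / 2), force)
    k4 = cart_pole_derivatives(shifted(k3, dt), force)
    return [s + dt / 6 * (a + 2 * b + 2 * c + d)
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]


def simulate(state, steps, dt=0.001, force=0.0):
    """Integrate the model with a constant applied force."""
    for _ in range(steps):
        state = rk4_step(state, force, dt)
    return state


# Uncontrolled pendulum released 0.05 rad from upright, simulated for 0.8 s.
final_state = simulate([0.0, 0.0, 0.05, 0.0], steps=800)
```

With no control force the small initial tilt grows rapidly, which is the instability the controllers in the following sections have to counteract.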
For the purposes of the simulation, the values in Table 1 were chosen. It was decided to choose μ so that the system

LQR Control
LQR control is considered to be an optimal method for control of inverted pendulums [22]. It achieves a compromise between error penalization and control effort by minimising the quadratic cost function, Eq. 2.
Where y is the state column-vector, r is the desired state column-vector, Q and R are weight-defining matrices and F is determined by the state feedback law [23]. LQR control was achieved by using a linearization of the system presented in Eq. 1. Matlab's in-built LQR function was used to provide control parameters for the specified linear system. The linearized system can be found in Eq. 3.
Where μ is the coefficient of ground friction and L is the length of the rod. The tuned values used for the LQR control simulation are shown in Eq. 4.

Developing a Perceptual Controller
Previous work has applied PCT to the inverted pendulum system [16]. Powers initially developed a 5-level perceptual controller (Fig. 3), but this controller had the issue that it could find equilibrium positions at the origin, even when the origin was not the desired position reference for the cart.
A revised controller was proposed [16] by removing the fourth hierarchical level (Fig. 4). The 4-level proportional controller is the basis of the perceptual controller used in this paper.
The mathematical model of the controller is given by the system of equations shown in Eq. 5.
In Eq. 5, which also defines the output control variable, u_n is the output of each level of the PCT controller, e_n is the error term for each level, K_n is the gain of each level and K_I is the gain associated with the leaky integrator. In order to produce results comparable to Kennaway's [16], the values of the proportional gains, K_n, were tuned manually. The relevant tuning parameters for the perceptual controller are given in Table 2.

Fig. 3 A Reproduction of Powers' 5-Level Perceptual Controller in [16]

Fig. 4 An Adaptation of Kennaway's 4-Level Perceptual Controller in [16]
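The general idea of a proportional cascade, in which each level's output u_n = K_n e_n becomes the reference of the level below, can be sketched as follows. This is a generic two-level illustration on a unit point mass, not the 4-level pendulum controller of Eq. 5; the plant, the gains `k_pos`, `k_vel`, `k_leak` and the placement of the leaky integrator (used here as a first-order lag on the top-level reference) are all assumptions.

```python
def simulate_cascade(position_ref, disturbance, k_pos=2.0, k_vel=20.0,
                     k_leak=2.0, dt=0.01, steps=1500):
    """Two-level proportional cascade driving a unit mass (assumed plant)."""
    x = 0.0             # perceived position
    v = 0.0             # perceived velocity
    smoothed_ref = 0.0  # leaky-integrated position reference
    for _ in range(steps):
        # Leaky integrator: the reference seen by the top level changes
        # gradually, so a sudden set-point jump cannot demand a huge force.
        smoothed_ref += k_leak * (position_ref - smoothed_ref) * dt
        # Level 2 controls position; its output is a velocity reference.
        velocity_ref = k_pos * (smoothed_ref - x)
        # Level 1 controls velocity; its output is the applied force.
        force = k_vel * (velocity_ref - v)
        # Plant: unit mass pushed by the force and a constant disturbance.
        v += (force + disturbance) * dt
        x += v * dt
    return x, v


x, v = simulate_cascade(position_ref=1.0, disturbance=1.0)
```

Only the top level receives an externally chosen reference. With purely proportional levels a constant disturbance leaves a small residual offset, here disturbance / (k_pos * k_vel).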

Results and Analysis
Figure 5 shows the performance of both the PCT and LQR controllers. It can be seen that the performance of the PCT controller is comparable to that of the LQR. The rise time of the PCT position response is slower than that of the LQR; however, the overshoot and settling time of the bob angle are lower.
It is also apparent from Fig. 6 that the perceptual controller was capable of dealing with disturbances. In fact, PCT significantly outperformed LQR in the rejection of disturbances. In Fig. 6 the system under LQR control required nearly 6 s to find equilibrium after a force of 5 N was applied for 1.5 s. In contrast, the cart's position was almost unaffected by the same disturbance when controlled by PCT.
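Recovery times such as the 6 s quoted above can be read off a logged response with a simple settling-time measure. The paper does not state how its times were measured, so the function below, including its name and the tolerance-band definition, is only one plausible convention.

```python
def settling_time(times, values, target, tolerance):
    """Return the first sample time after which the signal stays within
    `tolerance` of `target`, or None if it never settles."""
    last_outside = None
    for index, value in enumerate(values):
        if abs(value - target) > tolerance:
            last_outside = index
    if last_outside is None:
        return times[0]          # inside the band from the start
    if last_outside == len(values) - 1:
        return None              # still outside at the final sample
    return times[last_outside + 1]


# Made-up trace for illustration only (not data from the paper).
times = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
values = [0.0, 0.5, 1.2, 0.97, 1.02, 1.0]
print(settling_time(times, values, target=1.0, tolerance=0.05))
```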
The results from these simulations indicate that PCT is a viable method for controlling this system, and due to this system's parallels to self-balancing vehicles [19] it seems probable that PCT may be found to be useful in controlling these vehicles.
The results in Figs. 5 and 6 show that the perceptual controller has very good disturbance rejection. A fundamental difference between the perceptual controller and LQR is that PCT requires only one reference perception to be set. In the case of LQR, each of the four state variables is controlled. It may be that perceptual controllers can outperform other contemporary methods at controlling their highest level perception as they do not work to control other environmental variables.
It is difficult, however, to draw absolute conclusions regarding the performance of the perceptual controller due to the necessity of manual tuning for both the PCT controller and the LQR controller. Although both methods were tuned until further iterations had no demonstrable effect on the performance, it could still be argued that the tuning for either of the methods could be improved. Another study attempted to achieve optimal control of this system in simulation [17] and found that their LQR controller was able to stabilise at a new reference point in approximately 6 s. This result is consistent with the results reported here, and so the LQR controller implemented in this study is considered to be well-performing.
Before substantive claims can be made regarding the performance of perceptual controllers, it is necessary to validate the controller comparison using real robots.

Experimental Validation
The platform chosen to test the perceptual controller was the Lego Mindstorms Two-wheeled Balancing Robot, shown in Fig. 7. This platform has been used to test control methods previously [24,25] and has many similarities to the inverted pendulum on a cart. Most importantly, this platform has unmodelled dynamics and is subject to noise and uncertainties.
The two-wheeled balancing robot (TWBR) uses tachometers in the two servo-motors and a gyro sensor made by HiTechnic (The HTWay NXT Gyro Sensor) to measure the 4 state variables. Using an estimated moving average, it was possible to get a value for the robot's pitch and angular velocity. The TWBR is controlled by the two servo-motors, which provide a restoring force. Figure 8 defines the angular displacement and linear displacement used in the results section.
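One common way to obtain pitch and angular velocity from a rate gyro is to track the sensor's slowly drifting offset with a moving average, subtract it from each reading, and integrate the result. The paper does not give the details of its estimator, so the sketch below, including the class name, the exponential form of the moving average and the value of `alpha`, is a hypothetical illustration.

```python
class GyroPitchEstimator:
    """Estimate pitch rate and pitch from raw rate-gyro readings."""

    def __init__(self, initial_bias, alpha=0.001):
        self.bias = initial_bias  # running estimate of the gyro offset
        self.alpha = alpha        # weight of each new sample in the average
        self.pitch = 0.0          # integrated pitch, in the gyro's angle units

    def update(self, raw_rate, dt):
        # Exponential moving average follows the slowly drifting offset.
        self.bias += self.alpha * (raw_rate - self.bias)
        # Angular velocity is the reading with the offset removed.
        rate = raw_rate - self.bias
        # Pitch is the time integral of the angular velocity.
        self.pitch += rate * dt
        return self.pitch, rate


estimator = GyroPitchEstimator(initial_bias=600.0)
for _ in range(10):
    pitch, rate = estimator.update(610.0, dt=0.01)  # 10 units/s above offset
```

A small `alpha` keeps the offset estimate from absorbing genuine rotation, at the cost of reacting slowly to real sensor drift.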
A cage of 12 VICON cameras (Vero v1.3 X) was used to track the position and pitch angle of the robot to submillimetre accuracy. A diagram of the tracking cage is shown in Fig. 9. The VICON system has been found in previous studies to have a mean accuracy of between 63 and 290 μm [26].
Three control strategies were implemented on the robot: PCT, LQR and proportional control. Three experiments were conducted: station keeping at a set-point, disturbance rejection and tracking a variable set-point.
A proportional controller was implemented in this section to provide an extra comparison for the performance of the perceptual controller. The proportional controller was developed to attempt to stabilize the robot with every state-variable at zero. It took measurements of the four state-variables from the motor encoders and the gyroscope. These inputs were used to generate a motor output based on the control law shown in Equation 6. The gains were altered through manual tuning until adequate performance was achieved.

Station Holding Results and Analysis
In the angular displacement results (Fig. 10), the PCT-controlled robot had a steady-state error of ≈ 2°, whereas the LQR and P-controlled robots were subject to oscillations with amplitudes of ≈ 5°. While the LQR controller performs less well than has been reported in another study with the TWBR [24], the LQR results from this study can still provide an informative comparison for the performance of the perceptual controller. Direct comparison between the results from this study and other studies with the TWBR [24] is problematic due to the differences in how the tracking data has been taken and also the differences in robot design (particularly the wheel choice). Despite these differences, it is still notable that the perceptual controller developed in this study has better reported performance in station holding (fluctuations of ≈ 2°) than the well-tuned LQR controller (fluctuations of ≈ 3°) developed in [24].
It can be seen in this study that PCT control performed better in tracking both the head angle and head position than LQR and proportional control. After an initial stabilizing period (generally caused by a force input from the user standing the robot up), the PCT controller was capable of maintaining a position that was within 5 mm of the set point; the performance of the PCT controller in this test was far better than that of either of the competing methods.
Results from the experimental validation demonstrate that the PCT controller was less sensitive to internal noise, friction and uncertainties. It is clear from the results that the low-amplitude perturbations of the TWBR under PCT control were much smaller than those of the TWBR under P-control or LQR control. Although this work is not an objective examination of the behaviour of these control methods, the strong performance of PCT in simulation and on the TWBR would suggest that this behaviour may be intrinsic to the PCT controller and is unlikely to be a result of tuning differences.

Disturbance Rejection and Variable Set-Point Results
The results in this section detail the perceptual controller's response to two additional scenarios: disturbance rejection and variable set-point. Only the PCT controller was evaluated for these scenarios, for two reasons. Firstly, ensuring that the robot received exactly the same impulse on different occasions was challenging and so an objective comparison of disturbance rejection could not be performed. Secondly, PCT proved to be the only control method for this application that could reliably track a new set-point. LQR and P-control suffered from drift and so could not be used to track new set-points (Fig. 11).
Figure 12 shows the response of the robot under PCT control to three disturbances. It can be seen that the PCT controller can sustain forces that cause an initial perturbation of over 15 cm. The instability of the TWBR platform and the limits of the motor control combined so that the robot was unable to right itself after perturbations much larger than this. In Fig. 12 we can see that the robot took over 25 s to stabilize after the final perturbation. Although this time for stabilization was large, the other competing controllers were not able to stabilize after a perturbation of a similar magnitude and instead began large-amplitude oscillatory behaviour.
A key prediction from the simulation data was that the PCT controller could withstand external perturbations, and this prediction was supported by the validation. After being subjected to perturbations in excess of 15 cm the PCT controller was able to stabilize, whereas the LQR and P-controller used in this experiment were not able to withstand perturbations noticeably larger than the oscillatory behaviour they exhibited while attempting to track a set point. While there is a possibility that this was merely a consequence of tuning in both the simulation and validation, there are certainly grounds for further investigation of this.
Figure 13 details the PCT controller's performance in the variable set-point test. After an initial stabilizing period, the perceptual controller demonstrated its ability to track a set-point to within 5 mm. After this, the TWBR tracked two set-points, one 20 cm and one 40 cm away from the origin. The robot was able to stabilize at these two set-points. After 40 s and 50 s the robot again responded to a changing set-point. Figure 13 shows that the robot was subject to a small level of drift, as it was not able to stabilize at the origin after it had tracked the other set-points. This drift was probably caused by wheel slip.
In simulation, the perceptual controller was able to adjust to a variable set-point over a range of 2 m. The results from the validation show a range that is understandably lower than this. The TWBR was unable to react to larger changes in set-point, as such a change produced a large difference between the reference and actual position. This produced large motor forces, which caused the robot to fall. The leaky integrator in the perceptual controller was implemented to reduce the effects of these large changes in reference.
It is clear that this perceptual controller is robust enough to deal with many competing environmental factors and relatively large disturbances while still tracking its reference point well.

Conclusions and Further Work
This work has demonstrated clearly that robotic machines can perform balanced motion using one single reference and their sensors. Close parallels can be drawn between the robot's reference and a living organism's internal purpose or motivation [12]. Parallels can also be drawn between the data from the robot's sensors and a living organism's perception of its environment. It could be contended that using perceptual control theory to effectively model robotic motion supports the idea that perceptual control theory provides an accurate model of the behaviour of living organisms. The results outlined in this paper are a clear indication that perceptual control theory is not only a viable method for control systems, but potentially superior to contemporary control algorithms in a number of key performance parameters. The strong performance of PCT in this paper certainly strengthens the argument that the development of more perceptual controllers for future robots would be worthwhile.
The subjectivity of the analysis of different control methods in this work means that no definitive assertions can be made about the comparative performance of PCT against its rivals.It is hoped that further studies can produce a more objective study of perceptual controllers that is not limited by the potential pit-falls of manual tuning.
In this research we have adhered to the previous hierarchy structure, which includes an integrator function [16]. Although it serves a necessary role of limiting the reference to the cart position control system and, so, the force applied to the robot, an integrator in this case can result in oscillation around the set point. This is due to smoothing over past values of the error, which results in the error not being zero when the target is reached (integral windup). In future versions we would investigate replacing it with a proportional controller with a limit, such as a sigmoid function, which would avoid excessive force. This should also result in control according to only the latest value of the error, so that when the target is reached the position error and the force error will be zero, thus avoiding oscillation.
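A limited proportional output of the kind proposed here could take the following form. The hyperbolic tangent is only one possible sigmoid, and the function name and parameters are hypothetical; the paper does not specify a particular function.

```python
import math


def limited_proportional_output(error, gain, limit):
    """Proportional output that saturates smoothly at +/- limit.

    For small errors this behaves like gain * error; for large errors the
    output approaches the limit, and it is exactly zero at zero error.
    """
    return limit * math.tanh(gain * error / limit)


small = limited_proportional_output(0.01, gain=2.0, limit=5.0)   # close to 0.02
large = limited_proportional_output(100.0, gain=2.0, limit=5.0)  # close to 5.0
```

Because the output depends only on the current error, it carries no memory of past errors and so cannot wind up in the way an integrator can.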
Finally, the current study is part of a larger program to develop perceptual controllers for applications other than the inverted pendulum, such as humanoid motion. The success of these systems would greatly strengthen the argument that perceptual control theory provides an accurate explanation of behaviour in living organisms.

Fig. 1 A Diagram to show the Organisation of a Single-Level Perceptual Control System

Fig. 2 A Diagram of an Inverted Pendulum on a Cart

Fig. 5 A Comparison of LQR and PCT Control

Fig. 6 Disturbance Rejection of LQR and PCT Control

Fig. 7 Images of the Two-Wheeled Balancing Mindstorms Robot

Figure 10 shows the results of the angular displacement of the robot for the three different controllers. The PCT-controlled robot had a steady-state error of ≈ 2°.

Fig. 10 A Comparison of Three Control Methods' Ability to Control the Angular Displacement of the TWBR

Fig. 11 A Comparison of Three Control Methods' Ability to Control the Displacement of the TWBR

Table 1
A table of values for simulation. The gravitational field strength, g, is consistent with its value on the surface of the Earth, and the values of the other quantities in Table 1 were chosen arbitrarily.

Table 2
The tuning parameters used in the PCT controller