Learning a robot controller using an adaptive hierarchical fuzzy rule-based system

Waldock, Antony; Carse, Brian

doi:10.1007/s00500-015-1688-3

Methodologies and Application
Published: 22 April 2015

Volume 20, pages 2855–2881, (2016)

Soft Computing

Abstract

Most machine learning techniques applied to learning a robot controller generalise over either a uniform or a pre-defined representation selected by a human designer. This paper reduces the reliance on the human designer by adapting the representation during learning to improve generalisation. An extension of a Hierarchical Fuzzy Rule-Based System (HFRBS) is proposed that identifies and refines inaccurate regions of a fuzzy controller while interacting with the environment, for both supervised and reinforcement learning problems. The paper shows that a controller using an adaptive HFRBS can learn a suitable control policy with fewer fuzzy rules in both supervised and reinforcement learning problems and, unlike a uniform representation, is not sensitive to the layout. In supervised learning problems, a small number of extra trials is required to find an effective representation; in reinforcement learning problems, adapting the representation is shown to significantly reduce the time taken to learn a suitable control policy, and hence opens the door to high-dimensional problems.
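
The idea of refining only the inaccurate regions of a fuzzy representation can be illustrated with a small sketch. This is not the authors' HFRBS algorithm, which is not given in this abstract; it is a flat, one-dimensional stand-in written under stated assumptions: triangular fuzzy sets centred on a list of knots, singleton rule consequents, a supervised target that can be queried directly, and a refinement step that splits the interval with the largest mean absolute error. All names (`fuzzy_output`, `refine_once`, `target`) are hypothetical.

```python
import bisect
import math


def fuzzy_output(knots, values, x):
    """Zero-order fuzzy inference with triangular sets centred on `knots`.

    Because neighbouring triangular sets sum to one, the inferred output is
    the linear interpolation of the two neighbouring rule consequents.
    """
    x = min(max(x, knots[0]), knots[-1])  # clamp to the covered input range
    i = min(bisect.bisect_right(knots, x), len(knots) - 1)
    left, right = knots[i - 1], knots[i]
    mu_right = (x - left) / (right - left)  # membership of the right-hand set
    return (1.0 - mu_right) * values[i - 1] + mu_right * values[i]


def mean_abs_error(knots, values, samples, target):
    """Average absolute error of the rule base over the sample points."""
    return sum(abs(target(x) - fuzzy_output(knots, values, x))
               for x in samples) / len(samples)


def refine_once(knots, values, samples, target):
    """Split the interval with the largest mean absolute error at its midpoint."""
    errors = [[] for _ in range(len(knots) - 1)]
    for x in samples:
        i = min(bisect.bisect_right(knots, x), len(knots) - 1) - 1
        i = max(i, 0)  # samples left of the first knot fall in interval 0
        errors[i].append(abs(target(x) - fuzzy_output(knots, values, x)))
    mean_err = [sum(e) / len(e) if e else 0.0 for e in errors]
    worst = max(range(len(mean_err)), key=mean_err.__getitem__)
    mid = 0.5 * (knots[worst] + knots[worst + 1])
    knots.insert(worst + 1, mid)
    # Assumption: in this supervised toy setting the label at `mid` is available.
    values.insert(worst + 1, target(mid))
    return worst


def target(x):
    """Hypothetical control surface: flat on the left, steep near x = 0.7."""
    return math.tanh(10.0 * (x - 0.7))


# Start from the coarsest rule base (two rules) and refine six times.
knots = [0.0, 1.0]
values = [target(k) for k in knots]
samples = [i / 200 for i in range(201)]

err_before = mean_abs_error(knots, values, samples, target)
for _ in range(6):
    refine_once(knots, values, samples, target)
err_after = mean_abs_error(knots, values, samples, target)

print("rules:", len(knots))
print("knots:", knots)
print("mean abs error before/after:", err_before, err_after)
```

In this toy setting the added rules cluster around the steep part of the target, while the flat part keeps its coarse representation. A uniform grid with the same number of rules would spend most of them where they are not needed. A reinforcement learning variant would need an error signal other than a known target, for example a temporal-difference error, which this sketch does not attempt.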

Acknowledgments

This research was carried out in collaboration with BAE SYSTEMS, UK.

Author information

Authors and Affiliations

BAE Systems Advanced Technology Centre, Chelmsford, UK
Antony Waldock
Bristol Robotics Lab at the University of the West of England, Bristol, UK
Brian Carse

Corresponding author

Correspondence to Antony Waldock.

Additional information

Communicated by V. Loia.

About this article

Cite this article

Waldock, A., Carse, B. Learning a robot controller using an adaptive hierarchical fuzzy rule-based system. Soft Comput 20, 2855–2881 (2016). https://doi.org/10.1007/s00500-015-1688-3

Published: 22 April 2015
Issue Date: July 2016
DOI: https://doi.org/10.1007/s00500-015-1688-3
