Learning a robot controller using an adaptive hierarchical fuzzy rule-based system

Soft Computing

Abstract

The majority of machine learning techniques applied to learning a robot controller generalise over either a uniform or a pre-defined representation selected by a human designer. The approach taken in this paper is to reduce the reliance on the human designer by adapting the representation to improve generalisation during the learning process. An extension of a Hierarchical Fuzzy Rule-Based System (HFRBS) is proposed that identifies and refines inaccurate regions of a fuzzy controller while interacting with the environment, for both supervised and reinforcement learning problems. The paper shows that a controller using an adaptive HFRBS can learn a suitable control policy with fewer fuzzy rules for both supervised and reinforcement learning problems, and is not sensitive to the layout of the rules in the way a uniform representation is. In supervised learning problems, a small number of extra trials are required to find an effective representation, but in reinforcement learning problems the process of adapting the representation is shown to significantly reduce the time taken to learn a suitable control policy and hence to open the door to high-dimensional problems.
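
The abstract describes the mechanism only at a high level: inaccurate regions of the rule base are identified and refined while the controller interacts with its environment. As a rough, hypothetical sketch of that general idea (not the authors' algorithm), the following Python fragment fits a one-dimensional zero-order Takagi-Sugeno style rule base to supervised data and splits any region whose local approximation error exceeds a threshold. The class names, the error criterion, and the splitting rule are all assumptions made for illustration.

```python
import numpy as np


def triangular(x, left, centre, right):
    """Triangular membership function with peak at `centre`."""
    rising = (x - left) / (centre - left + 1e-12)
    falling = (right - x) / (right - centre + 1e-12)
    return max(0.0, min(rising, falling))


class FuzzyRegion:
    """One fuzzy rule covering the interval [lo, hi] (hypothetical structure)."""

    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi
        self.centre = 0.5 * (lo + hi)
        self.consequent = 0.0  # zero-order Takagi-Sugeno style output

    def membership(self, x):
        return triangular(x, self.lo, self.centre, self.hi)


class AdaptiveFuzzyApproximator:
    """Fuzzy rule base that is refined only where its local error is high."""

    def __init__(self, lo, hi, n_initial=4):
        edges = np.linspace(lo, hi, n_initial + 1)
        self.regions = [FuzzyRegion(a, b) for a, b in zip(edges[:-1], edges[1:])]

    def predict(self, x):
        mu = np.array([r.membership(x) for r in self.regions])
        w = mu / (mu.sum() + 1e-12)  # normalised firing strengths
        return float(np.dot(w, [r.consequent for r in self.regions]))

    def fit_consequents(self, xs, ys, lr=0.5, epochs=30):
        """Gradient fit of the rule consequents to supervised data."""
        for _ in range(epochs):
            for x, y in zip(xs, ys):
                mu = np.array([r.membership(x) for r in self.regions])
                w = mu / (mu.sum() + 1e-12)
                err = y - float(np.dot(w, [r.consequent for r in self.regions]))
                for r, wi in zip(self.regions, w):
                    r.consequent += lr * wi * err

    def refine(self, xs, ys, err_threshold=0.05):
        """Split any region whose mean local error exceeds the threshold."""
        refined = []
        for r in self.regions:
            local = [(x, y) for x, y in zip(xs, ys) if r.lo <= x < r.hi]
            err = np.mean([abs(y - self.predict(x)) for x, y in local]) if local else 0.0
            if err > err_threshold:
                left, right = FuzzyRegion(r.lo, r.centre), FuzzyRegion(r.centre, r.hi)
                left.consequent = right.consequent = r.consequent
                refined.extend([left, right])  # finer rules only where needed
            else:
                refined.append(r)
        self.regions = refined


if __name__ == "__main__":
    xs = np.linspace(0.0, 1.0, 100)
    ys = np.sin(6 * np.pi * xs ** 2)  # detail concentrated towards x = 1
    approx = AdaptiveFuzzyApproximator(0.0, 1.0)
    for _ in range(4):  # alternate fitting and refinement
        approx.fit_consequents(xs, ys)
        approx.refine(xs, ys)
    print(len(approx.regions), "rules after refinement")
```

Alternating fitting and refinement concentrates rules where the target varies quickly, which is the intuition behind adapting the representation instead of committing to a uniform grid chosen in advance.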

Acknowledgments

This research was carried out in collaboration with BAE SYSTEMS, UK.

Author information

Correspondence to Antony Waldock.

Additional information

Communicated by V. Loia.


About this article


Cite this article

Waldock, A., Carse, B. Learning a robot controller using an adaptive hierarchical fuzzy rule-based system. Soft Comput 20, 2855–2881 (2016). https://doi.org/10.1007/s00500-015-1688-3
