Learning a robot controller using an adaptive hierarchical fuzzy rule-based system

Abstract

Most machine learning techniques applied to learning a robot controller generalise over either a uniform or a pre-defined representation that is selected by a human designer. The approach taken in this paper is to reduce the reliance on the human designer by adapting the representation during the learning process to improve generalisation. An extension of a Hierarchical Fuzzy Rule-Based System (HFRBS) is proposed that identifies and refines inaccurate regions of a fuzzy controller, while interacting with the environment, for both supervised and reinforcement learning problems. The paper shows that a controller using an adaptive HFRBS can learn a suitable control policy with fewer fuzzy rules in both supervised and reinforcement learning problems and, unlike a controller with a uniform representation, is not sensitive to the layout. In supervised learning problems, a small number of extra trials are required to find an effective representation, but in reinforcement learning problems, adapting the representation is shown to significantly reduce the time taken to learn a suitable control policy and hence opens the door to high-dimensional problems.
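The idea of refining only the inaccurate regions of a fuzzy controller can be illustrated with a small one-dimensional sketch. This is not the authors' HFRBS algorithm: it is a simplified supervised example that assumes triangular membership functions, a single input, and a known target function, and the names `infer`, `refine`, `max_error` and `target` are hypothetical. It repeatedly adds one rule in the interval where the current approximation is worst, then compares the result with a uniform layout that uses the same number of rules.

```python
import bisect
import math


def target(x):
    """Assumed target mapping: flat almost everywhere, steep near x = 0.3."""
    return math.tanh(20.0 * (x - 0.3))


def infer(centres, outputs, x):
    """Fuzzy inference with triangular memberships centred on `centres`.

    Neighbouring memberships sum to 1, so the defuzzified output is a
    weighted average of the two nearest rule consequents.
    """
    if x <= centres[0]:
        return outputs[0]
    if x >= centres[-1]:
        return outputs[-1]
    i = bisect.bisect_right(centres, x) - 1
    left, right = centres[i], centres[i + 1]
    w_right = (x - left) / (right - left)  # membership of the right-hand rule
    return (1.0 - w_right) * outputs[i] + w_right * outputs[i + 1]


def refine(func, lo, hi, max_rules, tol=1e-3):
    """Grow the rule base by splitting the least accurate interval."""
    centres = [lo, hi]
    while len(centres) < max_rules:
        outputs = [func(c) for c in centres]
        # Measure the approximation error at the midpoint of every interval.
        errors = []
        for a, b in zip(centres, centres[1:]):
            mid = 0.5 * (a + b)
            errors.append((abs(func(mid) - infer(centres, outputs, mid)), mid))
        worst_error, worst_mid = max(errors)
        if worst_error < tol:
            break  # every interval is already accurate enough
        bisect.insort(centres, worst_mid)  # add one rule where it is needed
    return centres, [func(c) for c in centres]


def max_error(func, centres, outputs, lo, hi, samples=1001):
    """Largest absolute error on an evenly spaced test grid."""
    xs = [lo + (hi - lo) * k / (samples - 1) for k in range(samples)]
    return max(abs(func(x) - infer(centres, outputs, x)) for x in xs)


if __name__ == "__main__":
    adaptive_c, adaptive_o = refine(target, 0.0, 1.0, max_rules=9)
    uniform_c = [k / 8 for k in range(9)]
    uniform_o = [target(c) for c in uniform_c]
    print("adaptive centres:", adaptive_c)
    print("adaptive max error:", max_error(target, adaptive_c, adaptive_o, 0.0, 1.0))
    print("uniform max error:", max_error(target, uniform_c, uniform_o, 0.0, 1.0))
```

In this toy setting the adaptive layout concentrates its rules around the steep part of the target, which is the intuition behind needing fewer rules than a uniform grid. The reinforcement learning case described in the abstract has no known target function, so it would need a different error signal than the one assumed here.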

Acknowledgments

This research was carried out in collaboration with BAE SYSTEMS, UK.

Author information

Correspondence to Antony Waldock.

Additional information

Communicated by V. Loia.

About this article

Cite this article

Waldock, A., Carse, B. Learning a robot controller using an adaptive hierarchical fuzzy rule-based system. Soft Comput 20, 2855–2881 (2016). https://doi.org/10.1007/s00500-015-1688-3

Keywords

  • Fuzzy systems
  • Reinforcement learning
  • Robotics