Abstract
Scaling to larger and more complex problems remains the central challenge in reinforcement learning. To address this scaling problem, a scalable reinforcement learning method, DCS-SRL, is proposed on the basis of a divide-and-conquer strategy, and its convergence is proved. The method decomposes a learning problem over a large or continuous state space into multiple smaller subproblems. Given a specific learning algorithm, each subproblem can be solved independently with limited available resources, and the component solutions are then recombined to obtain the desired result. To decide the order in which the scheduler handles subproblems, a weighted priority scheduling algorithm is proposed; it concentrates computation on the regions of the problem space that are expected to be most productive. To further expedite learning, a parallel method, DCS-SPRL, is derived by combining DCS-SRL with a parallel scheduling architecture: the subproblems are distributed among processors that work in parallel. Experimental results show that learning based on DCS-SPRL converges quickly and scales well.
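The decompose/schedule/recombine loop described above can be sketched in a few lines. The toy chain MDP, the fixed region partition, and the use of the largest recent Bellman error as a subproblem's scheduling weight are all illustrative assumptions made here for the sketch; they are not the paper's actual DCS-SRL implementation or its weighted priority scheduling algorithm.

```python
import random

random.seed(0)          # fixed seed so the toy run is reproducible

N_STATES = 12           # toy chain MDP: states 0..11, reward at the right end
ACTIONS = (-1, +1)      # move left or right
GAMMA, ALPHA = 0.9, 0.5

def step(s, a):
    """Deterministic chain dynamics: reward 1 whenever we land on the goal."""
    s2 = min(max(s + a, 0), N_STATES - 1)
    return s2, (1.0 if s2 == N_STATES - 1 else 0.0)

# Divide: partition the state space into contiguous regions (subproblems).
regions = [range(0, 4), range(4, 8), range(8, 12)]
Q = {s: {a: 0.0 for a in ACTIONS} for s in range(N_STATES)}

def sweep(region):
    """Run a few Q-learning updates confined to one region; the largest
    Bellman error seen becomes the subproblem's new scheduling weight."""
    max_err = 0.0
    for _ in range(50):
        s = random.choice(list(region))
        a = random.choice(ACTIONS)
        s2, r = step(s, a)
        target = r + GAMMA * max(Q[s2].values())
        max_err = max(max_err, abs(target - Q[s][a]))
        Q[s][a] += ALPHA * (target - Q[s][a])
    return max_err

# Conquer, under a weighted priority schedule: always service the subproblem
# with the largest weight, and after servicing it raise the weight of its
# predecessor region, since value flows backward along the chain toward 0.
pri = [1.0] * len(regions)
for _ in range(60):
    i = max(range(len(regions)), key=lambda j: pri[j])
    err = sweep(regions[i])
    pri[i] = 0.5 * err                     # serviced subproblem decays in weight
    if i > 0:
        pri[i - 1] = max(pri[i - 1], err)  # boundary changes make it urgent

# Recombine: the union of the per-region greedy policies is the global policy.
policy = {s: max(Q[s], key=Q[s].get) for s in range(N_STATES)}
```

After enough scheduled sweeps, value propagates backward through the regions and the greedy policy at every state points toward the goal; in the parallel variant the same subproblems would instead be handed to separate workers.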
Author information
Quan Liu is a professor and PhD supervisor at the Institute of Computer Science and Technology, Soochow University. He received his PhD and MSc in computing from Jilin University in 2004 and 1999, respectively, and his BSc in computer software from Daqing Petroleum Institute in 1991. He is a senior member of the China Computer Federation. His main research interests include intelligent information processing, automated reasoning, and machine learning.
Xudong Yang received his MSc and BSc in computing from Soochow University in 2012 and 2009, respectively. His research interests include machine learning and data mining.
Ling Jing received her MSc in computing from Nanjing University in 2012 and her BSc in computing from Soochow University in 2009. Her research interests include machine learning and image processing.
Jin Li is a PhD student at Soochow University. She received her MSc and BSc in computing from Soochow University in 2012 and 2009, respectively. Her research interests include reinforcement learning and RoboCup.
Jiao Li received her MSc and BSc in computing from Soochow University in 2012 and 2009, respectively. Her research interests include automated reasoning and data quality management.
Cite this article
Liu, Q., Yang, X., Jing, L. et al. A parallel scheduling algorithm for reinforcement learning in large state space. Front. Comput. Sci. 6, 631–646 (2012). https://doi.org/10.1007/s11704-012-1098-y