Adaptive goal selection for agents in dynamic environments

Zhang, Huiliang; Luo, Xudong; Miao, Chunyan; Shen, Zhiqi; You, Jin

doi:10.1007/s10115-013-0645-7

Regular Paper
Published: 18 April 2013

Knowledge and Information Systems
Volume 37, pages 665–692 (2013)

Huiliang Zhang¹,
Xudong Luo²,
Chunyan Miao¹,
Zhiqi Shen³ &
…
Jin You⁴

Abstract

In psychology, goal-setting theory, which has been studied for over 35 years, shows that goals play a significant role in human incentives, action and performance. Based on this theory, the goal net model has been proposed for designing intelligent agents that can be viewed, to some extent, as software counterparts of human beings. The goal net model has been successfully applied in many agents, especially non-player-character agents in computer games. Such an agent selects the optimal solution from all possible solutions found by a recursive algorithm. However, if a goal net is very complex, the selection may take too long for the agent to respond quickly when it needs to re-select a solution after the world changes. Moreover, in some dynamic environments, it is impossible to know in advance the exact outcome of choosing a solution, so the possible solutions cannot be evaluated precisely. To address these problems, this paper applies a learning algorithm to goal selection in dynamic environments. More specifically, we first develop a reorganization algorithm that converts a goal net into an equivalent counterpart that a Q-learning algorithm can operate on; then, we define the key component of Q-learning, the reward function, according to the features of goal nets; and finally, extensive experiments show that, in dynamic environments, the agent with the learning algorithm significantly outperforms the one with the recursive search algorithm. Our work therefore suggests an agent model that can be applied effectively in dynamic, time-sensitive domains such as computer games and P2P systems for online movie watching.
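
To make the idea of learning-based goal selection concrete, the following is a minimal sketch of standard tabular Q-learning on a tiny goal graph. It is not the paper's reorganization algorithm or its reward function: the graph `GOAL_GRAPH`, the transition names, the reward `gain - cost`, and all hyperparameters are hypothetical choices made only for illustration.

```python
import random

# Hypothetical goal graph (an assumption, not taken from the paper):
# each goal maps transition names to (next goal, gain, cost).
GOAL_GRAPH = {
    "start": {"t1": ("g1", 2.0, 1.0), "t2": ("g2", 5.0, 1.0)},
    "g1": {"t3": ("end", 10.0, 2.0)},
    "g2": {"t4": ("end", 1.0, 2.0)},
    "end": {},  # terminal goal: no outgoing transitions
}


def reward(gain, cost):
    """Illustrative reward: net benefit of a transition (an assumption)."""
    return gain - cost


def q_learning(graph, start, episodes=500, alpha=0.1, gamma=0.9,
               epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    q = {s: {a: 0.0 for a in acts} for s, acts in graph.items()}
    for _ in range(episodes):
        state = start
        while graph[state]:  # walk until a terminal goal is reached
            actions = list(graph[state])
            if rng.random() < epsilon:
                action = rng.choice(actions)  # explore
            else:
                action = max(actions, key=lambda a: q[state][a])  # exploit
            nxt, gain, cost = graph[state][action]
            best_next = max(q[nxt].values(), default=0.0)
            target = reward(gain, cost) + gamma * best_next
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
    return q


def greedy_path(graph, q, start):
    """Follow the highest-valued transition from each goal."""
    path, state = [start], start
    while graph[state]:
        action = max(graph[state], key=lambda a: q[state][a])
        state = graph[state][action][0]
        path.append(state)
    return path


if __name__ == "__main__":
    q_table = q_learning(GOAL_GRAPH, "start")
    print(greedy_path(GOAL_GRAPH, q_table, "start"))
```

In this toy graph the transition `t2` has the larger immediate reward, but the learned values favor `t1` because the path through `g1` yields the larger discounted return. The agent estimates these values from experience, without enumerating every path in advance.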

Notes

In this paper, "goal net" means goal network; it is unrelated to soccer goal nets.
We give the details of the algorithm here for the reader's convenience.
Of course, more complicated methods for handling multiple attributes and constraints [25, 27, 28, 51, 53] could also be used. However, this is beyond the scope of this paper, and we leave it to future work.
The status variables could include more than gain and cost. However, this is beyond the scope of this paper and is worth discussing in future work.
http://ww.sics.se/tac/.

References

Ahmed R, Karypis G (2012) Algorithms for mining the evolution of conserved relational states in dynamic networks. Knowl Inf Syst 33(3):603–630
Amyot D, Ghanavati S, Horkoff J, Mussbacher G, Peyton L, Yu E (2010) Evaluating goal models within the goal-oriented requirement language. Int J Intell Syst 25(8):841–877
Anchuri P, Zaki MJ, Barkol O, Bergman R, Felder Y, Golan S, Sityon A (2012) Graph mining for discovering infrastructure patterns in configuration management databases. Knowl Inf Syst 33(3):491–522
Baylor AL, Kim Y (2005) Simulating instructional roles through pedagogical agents. Int J Artif Intell Educ 15(2):95–115
Blashford-Snell V (2008) The cooking book. Dorling Kindersley, London, United Kingdom
Chang M, He M, Luo X (2010) Designing a successful adaptive agent for TAC Ad auction. In: Proceedings of the 19th European conference on artificial intelligence, Lisbon, Portugal, pp 587–592
Chang M, He M, Luo X (2011) AstonCAT-plus: an efficient specialist for the TAC market design tournament. In: Proceedings of the 22nd international joint conference on artificial intelligence, pp 146–151
Dubois D, Grabisch M, Modave F, Prade H (2000) Relating decision under uncertainty and multicriteria decision making models. Int J Intell Syst 15:967–979
Erol K, Hendler J, Nau DS (1994) HTN planning: complexity and expressivity. In: Proceedings of the 12th national conference on artificial intelligence, Seattle, WA, USA, pp 1123–1128
Fikes RE, Nilsson NJ (1971) STRIPS: a new approach in the application of theorem proving to problem solving. Artif Intell 2:189–208
Fu Y, Zhu X, Li B (2012) A survey on instance selection for active learning. Knowl Inf Syst 1–35. doi:10.1007/s10115-012-0507-8
Haake M, Gulz A (2009) A look at the roles of look and roles in embodied pedagogical agents—a user preference perspective. Int J Artif Intell Educ 19(1):39–71
He M, Leung H, Jennings NR (2003) A fuzzy logic based bidding strategy for autonomous agents in continuous double auctions. IEEE Trans Knowl Data Eng 15(6):1345–1363
He M, Rogers A, Luo X, Jennings NR (2006) Designing a successful trading agent for supply chain management. In: Proceedings of the 5th international conference on autonomous agents and multi-agent systems, Hakodate, Japan, pp 1159–1166
Hogg C, Munoz-Avila H, Kuter U (2008) HTN-MAKER: learning HTNs with minimal additional knowledge engineering required. In: Proceedings of the 23rd AAAI conference on artificial intelligence, Chicago, Illinois, pp 950–956
Hogg C, Kuter U, Munoz-Avila H (2009) Learning hierarchical task networks for nondeterministic planning domains. In: Proceedings of the 21st international joint conference on artificial intelligence, Pasadena, California, USA, pp 1708–1714
Huang Z, Huang Q (2012) To reach consensus using uninorm aggregation operator: a gossip-based protocol. Int J Intell Syst 27:375–395
Ilghami O, Munoz-Avila H, Nau DS (2002) CaMeL: learning method preconditions for HTN planning. In: Proceedings of the 6th international conference on AI planning and scheduling, Toulouse, France, pp 131–141
Kaelbling LP, Littman ML, Moore AW (1996) Reinforcement learning: a survey. J Artif Intell Res 4:237–285
Kapp MN, Sabourin R, Maupin P (2011) A dynamic optimization approach for adaptive incremental learning. Int J Intell Syst 26:1101–1124
Lamsweerde AV (2001) Goal-oriented requirements engineering: a guided tour. In: Proceedings of the 5th IEEE international symposium on requirements engineering, Toronto, Canada, pp 249–263
Lovejoy W (2010a) Bargaining chains. Manage Sci 56(12):2282–2301
Lovejoy W (2010b) Conversations with supply chain managers. Technical report, Ross School of Business, University of Michigan, Working paper 1145
Luo X (2012) The evaluation of a knowledge based acquisition system of fuzzy tradeoff strategies for negotiating agents. In: Proceedings of the 14th annual international conference on electronic commerce, Singapore, pp 157–158
Luo X, Jennings NR (2007) A spectrum of compromise aggregation operators for multi-attribute decision making. Artif Intell 171(2–3):161–184
Luo X, Jennings NR, Shadbolt N (2006) Acquiring user tradeoff strategies and preferences for negotiating agents: a default-then-adjust method. Int J Man-Mach Stud 64(4):304–321
Luo X, Jennings NR, Shadbolt NR, Leung H-f, Lee JH-m (2003) A fuzzy constraint based model for bilateral, multi-issue negotiations in semi-competitive environments. Artif Intell 148(1–2):53–102
Luo X, Lee JH-m, Leung H-f, Jennings NR (2003) Prioritised fuzzy constraint satisfaction problems: axioms, instantiation and validation. Fuzzy Sets Syst 136(2):151–188
Luo X, Miao C, Jennings N, He M, Shen Z, Zhang M (2012) KEMNAD: a knowledge engineering methodology for negotiating agent development. Comput Intell 28(1):51–105
Ma W, Xiong W, Luo X (2013) A model for decision making with missing, imprecise, and uncertain evaluations of multiple criteria. Int J Intell Syst 28(2):152–184
McDermott D, Ghallab M, Howe A, Knoblock C, Ram A, Veloso M, Weld D, Wilkins D (1998) PDDL—the planning domain definition language. Technical report, Yale Center for Computational Vision and Control
Merigó JM, Casanovas M (2011) The uncertain induced quasi-arithmetic OWA operator. Int J Intell Syst 26(1):1–24
Merrick K, Maher ML (2009) Motivated learning from interesting events: adaptive, multitask learning agents for complex environments. Adapt Behav 17(1):7–27
Mitchell TM (1997) Machine learning. McGraw Hill, Burr Ridge
Nagurney A (2006) Supply chain network economics: dynamics of prices, flows, and profits. Edward Elgar Publishing, Cheltenham
Natarajan S, Tadepalli P, Fern A (2012) A relational hierarchical model for decision-theoretic assistance. Knowl Inf Syst 32(2):329–349
Nowaczyk S (2006) Learning of agents with limited resources. In: Proceedings of the 21st national conference on artificial intelligence, Boston, Massachusetts, pp 1893–1894
Pednault E (1989) ADL: exploring the middle ground between STRIPS and the situation calculus. In: Proceedings of the 1st international conference on principles of knowledge representation and reasoning, Toronto, ON. Morgan Kaufmann, San Mateo, CA, pp 324–332
Rao AS, Georgeff M (1995) BDI agents: from theory to practice. In: Lesser VR, Gasser L (eds) Proceedings of the 1st international conference on multiagent systems, San Francisco, USA, pp 312–319
Sacerdoti E (1975) The non-linear nature of plans. In: Proceedings of the 4th international joint conference on artificial intelligence, Tbilisi, Georgia, pp 206–214
Shen Z (2005) Goal-oriented modeling for intelligent agents and their applications. PhD thesis, Nanyang Technological University
Shen Z, Li D, Miao C, Gay R (2005) Goal-oriented methodology for agent system development. In: Proceedings of the 2005 IEEE/WIC/ACM international conference on intelligent agent technology, Compiegne, France, pp 95–101
Shen Z, Miao C, Gay R (2004) Goal oriented modeling for intelligent software agents. In: Proceedings of the 2004 IEEE/WIC/ACM international conference on intelligent agent technology, Beijing, China, pp 540–543
Shen Z, Miao C, Miao Y, Tao X, Gay R (2006) A goal-oriented approach to goal selection and action selection. In: Proceedings of the IEEE international conference on fuzzy systems, 2006. IEEE world congress on computational intelligence, Vancouver, BC, Canada, pp 114–121
Singh D, Sardina S, Padgham L, Airiau S (2010) Learning context conditions for BDI plan selection. In: Proceedings of the 9th international conference on autonomous agents and multiagent systems, Toronto, Canada, pp 325–332
Strobbe M, Van Laere O, Dhoedt B, De Turck F, Demeester P (2012) Hybrid reasoning technique for improving context-aware applications. Knowl Inf Syst 31(3):581–616
Sukthankar G, Sycara K (2008) Hypothesis pruning and ranking for large plan recognition problems. In: Proceedings of the 23rd national conference on artificial intelligence, vol 2. Chicago, Illinois, pp 998–1003
Warren JF (2003) Rickshaw coolie: a people’s history of Singapore, 1880–1940. NUS Press, Singapore
Watkins C, Dayan P (1992) Q-learning. Mach Learn 8:279–292
Xiao Z, Chen W, Li L (2012) A method based on interval-valued fuzzy soft set for multi-attribute group decision-making problems under uncertain environment. Knowl Inf Syst 1–17. doi:10.1007/s10115-012-0496-7
Yager R (1988) On ordered weighted averaging aggregation operators in multicriteria decisionmaking. IEEE Trans Syst Man Cybernet 18(1):183–190
Yager RR (2008) Decision making under Dempster-Shafer uncertainties. In: Yager RR, Liping L (eds) Classic works of the Dempster-Shafer theory of belief functions. Springer, Berlin, pp 619–632
Yager R, Rybalov A (1996) Uninorm aggregation operators. Fuzzy Sets Syst 80:111–120
Yang Q, Wu K, Jiang Y (2007) Learning action models from plan examples using weighted MAX-SAT. Artif Intell 171(2–3):107–143
Zhang H, Shen Z, Miao C (2009) Enabling goal oriented action planning with goal net. In: Proceedings of the 2009 IEEE/WIC/ACM international conference on intelligent agent technology, Milano, Italy, pp 271–274

Acknowledgments

The authors would like to thank the anonymous reviewers for their very helpful comments. This paper is partially supported by the National Natural Science Foundation of China (No. 61173019), the Bairen plan of Sun Yat-sen University, and major projects of the Ministry of Education of China (No. 10JZD0006).

Author information

Authors and Affiliations

School of Computer Engineering, Nanyang Technological University, Singapore, Singapore
Huiliang Zhang & Chunyan Miao
Institute of Logic and Cognition, Sun Yat-sen University, Guangzhou, 510275, China
Xudong Luo
School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore, Singapore
Zhiqi Shen
Department of Psychology, University of Houston, Houston, TX, USA
Jin You

Corresponding author

Correspondence to Xudong Luo.

About this article

Cite this article

Zhang, H., Luo, X., Miao, C. et al. Adaptive goal selection for agents in dynamic environments. Knowl Inf Syst 37, 665–692 (2013). https://doi.org/10.1007/s10115-013-0645-7

Received: 06 March 2012
Revised: 08 February 2013
Accepted: 09 March 2013
Published: 18 April 2013
Issue Date: December 2013
DOI: https://doi.org/10.1007/s10115-013-0645-7
