Adaptive goal selection for agents in dynamic environments

  • Regular Paper
Knowledge and Information Systems

Abstract

In psychology, goal-setting theory, which has been studied for over 35 years, shows that goals play a significant role in the motivation, actions and performance of human beings. Based on this theory, the goal net model has been proposed for designing intelligent agents that can be viewed, to some extent, as software counterparts of human beings. The goal net model has been successfully applied in many agents, especially non-player-character agents in computer games. Such an agent uses a recursive algorithm to find all possible solutions and then selects the optimal one among them. However, if a goal net is very complex, the selection can take too long for the agent to respond quickly when it needs to re-select a solution after the world changes. Moreover, in some dynamic environments, it is impossible to know in advance the exact outcome of choosing a solution, so the possible solutions cannot be evaluated precisely. To address these problems, this paper applies a learning algorithm to goal selection in dynamic environments. More specifically, we first develop a reorganization algorithm that converts a goal net into an equivalent counterpart on which a Q-learning algorithm can operate; then, we define the key component of Q-learning, the reward function, according to the features of goal nets; and finally, we conduct extensive experiments showing that, in dynamic environments, the agent with the learning algorithm significantly outperforms the one with the recursive search algorithm. Our work therefore suggests an agent model that can be applied effectively in dynamic, time-sensitive domains, such as computer games and P2P systems for online movie watching.
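To make the idea of learning-based goal selection concrete, the following is a minimal sketch of tabular Q-learning on a small goal graph. It is not the paper's reorganization algorithm or its reward function, neither of which is given on this page. The graph `GOAL_GRAPH`, the function names, and the reward `gain - cost` are all illustrative assumptions; only the use of Q-learning and of gain and cost as status variables comes from the text.

```python
import random

# Hypothetical toy goal graph (an assumption, not taken from the paper):
# each state maps to {next_state: (gain, cost)} for the transition.
GOAL_GRAPH = {
    "start": {"a": (2.0, 1.0), "b": (5.0, 1.0)},
    "a": {"end": (10.0, 1.0)},
    "b": {"end": (1.0, 1.0)},
    "end": {},  # terminal goal: no outgoing transitions
}


def q_learning(graph, start, episodes=500, alpha=0.5, gamma=0.9,
               epsilon=0.2, seed=0):
    """Tabular Q-learning; the action in a state is the next state to move to."""
    rng = random.Random(seed)
    q = {s: {n: 0.0 for n in nxt} for s, nxt in graph.items()}
    for _ in range(episodes):
        state = start
        while graph[state]:  # run until a terminal state is reached
            actions = list(graph[state])
            if rng.random() < epsilon:
                action = rng.choice(actions)  # explore
            else:
                action = max(actions, key=lambda a: q[state][a])  # exploit
            gain, cost = graph[state][action]
            reward = gain - cost  # assumed reward, for illustration only
            future = max(q[action].values(), default=0.0)
            # Standard Q-learning update rule.
            q[state][action] += alpha * (reward + gamma * future
                                         - q[state][action])
            state = action
    return q


def greedy_path(q, start):
    """Follow the highest-valued transition from start to a terminal state."""
    path = [start]
    while q[path[-1]]:
        path.append(max(q[path[-1]], key=q[path[-1]].get))
    return path


if __name__ == "__main__":
    table = q_learning(GOAL_GRAPH, "start")
    print(greedy_path(table, "start"))
```

In this toy graph the transition to `b` has the larger immediate reward, but the learned values favor `a` because of the larger reward that follows it. Because the values are updated incrementally, an agent of this kind can keep adapting as rewards change instead of re-enumerating every solution.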

Notes

  1. In this paper, "goal net" means goal network. It is unrelated to the goal nets used in soccer.

  2. We give the details of the algorithm here for the readers' convenience.

  3. Of course, we could also use more sophisticated methods for handling multiple attributes and constraints [25, 27, 28, 51, 53]. However, this is beyond the scope of this paper, and we leave it to future work.

  4. There could be more status variables than gain and cost. However, they are beyond the scope of this paper and are worth discussing in future work.

  5. http://ww.sics.se/tac/.

References

  1. Ahmed R, Karypis G (2012) Algorithms for mining the evolution of conserved relational states in dynamic networks. Knowl Inf Syst 33(3):603–630

  2. Amyot D, Ghanavati S, Horkoff J, Mussbacher G, Peyton L, Yu E (2010) Evaluating goal models within the goal-oriented requirement language. Int J Intell Syst 25(8):841–877

  3. Anchuri P, Zaki MJ, Barkol O, Bergman R, Felder Y, Golan S, Sityon A (2012) Graph mining for discovering infrastructure patterns in configuration management databases. Knowl Inf Syst 33(3):491–522

  4. Baylor AL, Kim Y (2005) Simulating instructional roles through pedagogical agents. Int J Artif Intell Educ 15(2):95–115

  5. Blashford-Snell V (2008) The cooking book. Dorling Kindersley, London, United Kingdom

  6. Chang M, He M, Luo X (2010) Designing a successful adaptive agent for TAC Ad auction. In: Proceedings of the 19th European conference on artificial intelligence, Lisbon, Portugal, pp 587–592

  7. Chang M, He M, Luo X (2011) AstonCAT-plus: an efficient specialist for the TAC market design tournament. In: Proceedings of the 22nd international joint conference on artificial intelligence, pp 146–151

  8. Dubois D, Grabisch M, Modave F, Prade H (2000) Relating decision under uncertainty and multicriteria decision making models. Int J Intell Syst 15:967–979

  9. Erol K, Hendler J, Nau DS (1994) HTN planning: complexity and expressivity. In: Proceedings of the 12th national conference on artificial intelligence, Seattle, WA, USA, pp 1123–1128

  10. Fikes RE, Nilsson NJ (1971) STRIPS: a new approach in the application of theorem proving to problem solving. Artif Intell 2:189–208

  11. Fu Y, Zhu X, Li B (2012) A survey on instance selection for active learning. Knowl Inf Syst 1–35. doi:10.1007/s10115-012-0507-8

  12. Haake M, Gulz A (2009) A look at the roles of look and roles in embodied pedagogical agents—a user preference perspective. Int J Artif Intell Educ 19(1):39–71

  13. He M, Leung H, Jennings NR (2003) A fuzzy logic based bidding strategy for autonomous agents in continuous double auctions. IEEE Trans Knowl Data Eng 15(6):1345–1363

  14. He M, Rogers A, Luo X, Jennings NR (2006) Designing a successful trading agent for supply chain management. In: Proceedings of the 5th international conference on autonomous agents and multi-agent systems, Hakodate, Japan, pp 1159–1166

  15. Hogg C, Munoz-Avila H, Kuter U (2008) HTN-MAKER: learning HTNs with minimal additional knowledge engineering required. In: Proceedings of the 23rd AAAI conference on artificial intelligence, Chicago, Illinois, pp 950–956

  16. Hogg C, Kuter U, Munoz-Avila H (2009) Learning hierarchical task networks for nondeterministic planning domains. In: Proceedings of the 21st international joint conference on artificial intelligence, Pasadena, California, USA, pp 1708–1714

  17. Huang Z, Huang Q (2012) To reach consensus using uninorm aggregation operator: a gossip-based protocol. Int J Intell Syst 27:375–395

  18. Ilghami O, Munoz-Avila H, Nau DS (2002) CaMeL: learning method preconditions for HTN planning. In: Proceedings of the 6th international conference on AI planning and scheduling, Toulouse, France, pp 131–141

  19. Kaelbling LP, Littman ML, Moore AW (1996) Reinforcement learning: a survey. J Artif Intell Res 4:237–285

  20. Kapp MN, Sabourin R, Maupin P (2011) A dynamic optimization approach for adaptive incremental learning. Int J Intell Syst 26:1101–1124

  21. Lamsweerde AV (2001) Goal-oriented requirements engineering: a guided tour. In: Proceedings of the 5th IEEE international symposium on requirements engineering, Toronto, Canada, pp 249–263

  22. Lovejoy W (2010a) Bargaining chains. Manage Sci 56(12):2282–2301

  23. Lovejoy W (2010b) Conversations with supply chain managers. Technical report, Ross School of Business, University of Michigan, Working paper 1145

  24. Luo X (2012) The evaluation of a knowledge based acquisition system of fuzzy tradeoff strategies for negotiating agents. In: Proceedings of the 14th annual international conference on electronic commerce, Singapore, pp 157–158

  25. Luo X, Jennings NR (2007) A spectrum of compromise aggregation operators for multi-attribute decision making. Artif Intell 171(2–3):161–184

  26. Luo X, Jennings NR, Shadbolt N (2006) Acquiring user tradeoff strategies and preferences for negotiating agents: a default-then-adjust method. Int J Man-Mach Stud 64(4):304–321

  27. Luo X, Jennings NR, Shadbolt NR, Leung H-f, Lee JH-m (2003) A fuzzy constraint based model for bilateral, multi-issue negotiations in semi-competitive environments. Artif Intell 148(1–2):53–102

  28. Luo X, Lee JH-m, Leung H-f, Jennings NR (2003) Prioritised fuzzy constraint satisfaction problems: axioms, instantiation and validation. Fuzzy Sets Syst 136(2):151–188

  29. Luo X, Miao C, Jennings N, He M, Shen Z, Zhang M (2012) KEMNAD: a knowledge engineering methodology for negotiating agent development. Comput Intell 28(1):51–105

  30. Ma W, Xiong W, Luo X (2013) A model for decision making with missing, imprecise, and uncertain evaluations of multiple criteria. Int J Intell Syst 28(2):152–184

  31. McDermott D, Ghallab M, Howe A, Knoblock C, Ram A, Veloso M, Weld D, Wilkins D (1998) PDDL - the planning domain definition language. Technical report, Yale Center for Computational Vision and Control

  32. Merigó JM, Casanovas M (2011) The uncertain induced quasi-arithmetic OWA operator. Int J Intell Syst 26(1):1–24

  33. Merrick K, Maher ML (2009) Motivated learning from interesting events: adaptive, multitask learning agents for complex environments. Adapt Behav 17(1):7–27

  34. Mitchell TM (1997) Machine learning. McGraw Hill, Burr Ridge

  35. Nagurney A (2006) Supply chain network economics: dynamics of prices, flows, and profits. Edward Elgar Publishing, Cheltenham

  36. Natarajan S, Tadepalli P, Fern A (2012) A relational hierarchical model for decision-theoretic assistance. Knowl Inf Syst 32(2):329–349

  37. Nowaczyk S (2006) Learning of agents with limited resources. In: Proceedings of the 21st national conference on artificial intelligence, Boston, Massachusetts, pp 1893–1894

  38. Pednault E (1989) ADL: exploring the middle ground between STRIPS and the situation calculus. In: Proceedings of the 1st international conference on principles of knowledge representation and reasoning, Toronto, ON. Morgan Kaufmann, San Mateo, CA, pp 324–332

  39. Rao AS, Georgeff M (1995) BDI agents: from theory to practice. In: Lesser VR, Gasser L (eds) Proceedings of the 1st international conference on multiagent systems, San Francisco, USA, pp 312–319

  40. Sacerdoti E (1975) The non-linear nature of plans. In: Proceedings of the 4th international joint conference on artificial intelligence, Tbilisi, Georgia, pp 206–214

  41. Shen Z (2005) Goal-oriented modeling for intelligent agents and their applications. PhD thesis, Nanyang Technological University

  42. Shen Z, Li D, Miao C, Gay R (2005) Goal-oriented methodology for agent system development. In: Proceedings of the 2005 IEEE/WIC/ACM international conference on intelligent agent technology, Compiegne, France, pp 95–101

  43. Shen Z, Miao C, Gay R (2004) Goal oriented modeling for intelligent software agents. In: Proceedings of the 2004 IEEE/WIC/ACM international conference on intelligent agent technology, Beijing, China, pp 540–543

  44. Shen Z, Miao C, Miao Y, Tao X, Gay R (2006) A goal-oriented approach to goal selection and action selection. In: Proceedings of the IEEE international conference on fuzzy systems, 2006. IEEE world congress on computational intelligence, Vancouver, BC, Canada, pp 114–121

  45. Singh D, Sardina S, Padgham L, Airiau S (2010) Learning context conditions for BDI plan selection. In: Proceedings of the 9th international conference on autonomous agents and multiagent systems, Toronto, Canada, pp 325–332

  46. Strobbe M, Van Laere O, Dhoedt B, De Turck F, Demeester P (2012) Hybrid reasoning technique for improving context-aware applications. Knowl Inf Syst 31(3):581–616

  47. Sukthankar G, Sycara K (2008) Hypothesis pruning and ranking for large plan recognition problems. In: Proceedings of the 23rd national conference on artificial intelligence, vol 2. Chicago, Illinois, pp 998–1003

  48. Warren JF (2003) Rickshaw coolie: a people’s history of Singapore, 1880–1940. NUS Press, Singapore

  49. Watkins C, Dayan P (1992) Q-learning. Mach Learn 8:279–292

  50. Xiao Z, Chen W, Li L (2012) A method based on interval-valued fuzzy soft set for multi-attribute group decision-making problems under uncertain environment. Knowl Inf Syst 1–17. doi:10.1007/s10115-012-0496-7

  51. Yager R (1988) On ordered weighted averaging aggregation operators in multicriteria decisionmaking. IEEE Trans Syst Man Cybernet 18(1):183–190

  52. Yager RR (2008) Decision making under Dempster-Shafer uncertainties. In: Yager RR, Liu L (eds) Classic works of the Dempster-Shafer theory of belief functions. Springer, Berlin, pp 619–632

  53. Yager R, Rybalov A (1996) Uninorm aggregation operators. Fuzzy Sets Syst 80:111–120

  54. Yang Q, Wu K, Jiang Y (2007) Learning action models from plan examples using weighted MAX-SAT. Artif Intell 171(2–3):107–143

  55. Zhang H, Shen Z, Miao C (2009) Enabling goal oriented action planning with goal net. In: Proceedings of the 2009 IEEE/WIC/ACM international conference on intelligent agent technology, Milano, Italy, pp 271–274

Acknowledgments

The authors would like to thank the anonymous reviewers for their very helpful comments. This paper is partially supported by the National Natural Science Foundation of China (No. 61173019), the Bairen plan of Sun Yat-sen University, and major projects of the Ministry of Education of China (No. 10JZD0006).

Author information

Corresponding author

Correspondence to Xudong Luo.

About this article

Cite this article

Zhang, H., Luo, X., Miao, C. et al. Adaptive goal selection for agents in dynamic environments. Knowl Inf Syst 37, 665–692 (2013). https://doi.org/10.1007/s10115-013-0645-7

  • DOI: https://doi.org/10.1007/s10115-013-0645-7
