Policy Search through Adaptive Function Approximation for Bidding in TAC SCM

  • Kyriakos C. Chatzidimitriou
  • Andreas L. Symeonidis
  • Pericles A. Mitkas
Part of the Lecture Notes in Business Information Processing book series (LNBIP, volume 136)


Agent autonomy is strongly related to learning and adaptation. Machine learning models, generated from historical data or current environmental signals, provide agents with the necessary decision-making and generalization capabilities in competitive, dynamic, partially observable and stochastic environments. In this work, we discuss learning and adaptation in the context of the TAC SCM game. We apply a variety of machine learning and computational intelligence methods to generate the most efficient sales component of the agent, which deals with customer orders and production throughput. Along with utility maximization and bid acceptance probability estimation methods, we evaluate regression trees, particle swarm optimization, heuristic control and policy search via adaptive function approximation in order to build an efficient, near-real-time bidding mechanism. Results indicate that a suitable reinforcement learning setup, coupled with the power of adaptive function approximation techniques, is a good candidate for enabling high-performance strategies.


adaptive function approximation, trading agent competition, supply chain management, echo state networks, neuroevolution of augmented reservoirs





Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Kyriakos C. Chatzidimitriou (1, 2)
  • Andreas L. Symeonidis (1, 2)
  • Pericles A. Mitkas (1, 2)

  1. Department of Electrical and Computer Engineering, Aristotle University of Thessaloniki, Greece
  2. Informatics and Telematics Institute, Centre for Research and Technology Hellas, Greece
