Discussion

Part of the Springer Theses book series.

Abstract

This chapter summarizes the findings of this work from both a reinforcement learning perspective and a design of experiments perspective. We elaborate on our findings, discuss related work and extensions, note the innovations of this work, and outline directions for future research.

Keywords

  • Partial Least Squares
  • Reinforcement Learning
  • Design Point
  • Partial Least Squares Model
  • Domain Characteristic

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Author information

Corresponding author

Correspondence to Christopher Gatti.

Copyright information

© 2015 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Gatti, C. (2015). Discussion. In: Design of Experiments for Reinforcement Learning. Springer Theses. Springer, Cham. https://doi.org/10.1007/978-3-319-12197-0_8

  • DOI: https://doi.org/10.1007/978-3-319-12197-0_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-12196-3

  • Online ISBN: 978-3-319-12197-0

  • eBook Packages: Engineering (R0)