
Locally Weighted Learning for Control

Chapter in: Lazy Learning (Aha, D.W., ed.)

Abstract

Lazy learning methods provide useful representations and training algorithms for learning about complex phenomena during autonomous adaptive control of complex systems. This paper surveys ways in which locally weighted learning, a type of lazy learning, has been applied by us to control tasks. We explain various forms that control tasks can take, and how this affects the choice of learning paradigm. The discussion section explores the interesting impact that explicitly remembering all previous experiences has on the problem of learning to control.
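As a rough illustration of the technique the abstract refers to (not code from the chapter itself), the sketch below shows a minimal locally weighted regression query over a memory of stored experiences: each prediction fits a small linear model whose training points are weighted by a Gaussian kernel centred on the query point. The function name, parameter names, bandwidth value, and toy dynamics are illustrative assumptions, not taken from the chapter.

```python
import numpy as np

def locally_weighted_regression(X, y, x_query, bandwidth=0.5):
    """Predict the output at x_query from stored (X, y) experiences
    using Gaussian-kernel-weighted linear regression."""
    # Squared distance from every stored experience to the query point.
    d2 = np.sum((X - x_query) ** 2, axis=1)
    # Gaussian kernel: nearby experiences get weight near 1, distant ones near 0.
    w = np.exp(-d2 / (2.0 * bandwidth ** 2))

    # Fit a local linear model y ~ [1, x] @ beta by weighted least squares.
    Xa = np.hstack([np.ones((X.shape[0], 1)), X])   # prepend an intercept column
    sw = np.sqrt(w)                                  # sqrt-weights for least squares
    beta, *_ = np.linalg.lstsq(Xa * sw[:, None], y * sw, rcond=None)

    # Evaluate the local model at the query point.
    return np.concatenate(([1.0], x_query)) @ beta

# Toy usage: remember (state, action) -> next-state transitions, then query
# the memory as a learned forward model at a new state-action pair.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 2))    # stored (state, action) pairs
y = np.sin(X[:, 0]) + 0.5 * X[:, 1]      # observed outcomes (assumed toy dynamics)
print(locally_weighted_regression(X, y, np.array([0.3, -0.2]), bandwidth=0.3))
```

In a control setting of the kind the chapter surveys, the memory would hold observed transitions, and queries like this would serve as a learned forward model; the chapter itself discusses the design choices (kernels, bandwidths, local model forms) in detail.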




Copyright information

© 1997 Springer Science+Business Media Dordrecht

About this chapter

Cite this chapter

Atkeson, C.G., Moore, A.W., Schaal, S. (1997). Locally Weighted Learning for Control. In: Aha, D.W. (eds) Lazy Learning. Springer, Dordrecht. https://doi.org/10.1007/978-94-017-2053-3_3


  • DOI: https://doi.org/10.1007/978-94-017-2053-3_3

  • Publisher Name: Springer, Dordrecht

  • Print ISBN: 978-90-481-4860-8

  • Online ISBN: 978-94-017-2053-3

  • eBook Packages: Springer Book Archive
