Part of the book series: Studies in Computational Intelligence (SCI, volume 249)

Abstract

In this chapter, we derive three general reinforcement learning methods for projection problems [234]. The three methods share the same structure: the component weight vectors are represented by stochastic units drawn from a Gaussian distribution with mean m and variance β²I, and we create adaptive update rules for these Gaussian parameters so as to maximize the expected long-term reward. We first derive a particular form of immediate reward reinforcement learning that can be applied to both linear and non-linear projection problems. We then show that this reinforcement learning neural network can be implemented for unsupervised projection problems. Lastly, based on temporal difference learning, we investigate two new algorithms for projection problems, one based on Sarsa-learning and one on Q-learning, and show that the latter converges accurately even for non-linear projections.
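As a rough illustration of the first of these methods, the following is a minimal sketch of immediate reward reinforcement learning for a one-dimensional linear projection. Only the overall structure follows the description above (a stochastic unit w ~ N(m, β²I) and a REINFORCE-style update of the Gaussian mean); the normalised squared-projection reward, the running baseline, the learning rates, and the name immediate_reward_projection are illustrative assumptions, not the chapter's exact formulation.

```python
import numpy as np

def immediate_reward_projection(X, n_iters=20000, eta=1e-3, beta=0.3, seed=0):
    """Sketch: immediate-reward RL for a linear projection direction.

    Each step samples a stochastic unit w ~ N(m, beta^2 I) and moves the
    Gaussian mean m along the REINFORCE eligibility (w - m)/beta^2, scaled
    by the baseline-corrected immediate reward.  The reward used here
    (normalised squared projection, maximised by the leading principal
    direction) is an assumed, illustrative choice.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    m = rng.standard_normal(d)
    m /= np.linalg.norm(m)
    baseline = 0.0
    for _ in range(n_iters):
        x = X[rng.integers(n)]                         # one datum per step
        w = m + beta * rng.standard_normal(d)          # stochastic unit ~ N(m, beta^2 I)
        r = (w @ x) ** 2 / (w @ w)                     # assumed immediate reward
        baseline += 0.01 * (r - baseline)              # running-average baseline
        m += eta * (r - baseline) * (w - m) / beta**2  # REINFORCE update of the mean
        m /= np.linalg.norm(m)                         # keep the direction unit length
    return m

# Usage: with zero-mean data, m should align with the top eigenvector
# of the sample covariance matrix.
X = np.random.default_rng(1).multivariate_normal(
    mean=np.zeros(2), cov=[[3.0, 1.0], [1.0, 1.0]], size=500)
m = immediate_reward_projection(X)
```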

It is also frequently important in projection methods to identify multiple components. Although more than one component can be found by deflationary techniques such as Gram-Schmidt orthogonalization, these techniques fall outwith the framework of reinforcement learning. We therefore describe a general method that finds further components by redefining the reward functions: to perform deflationary orthogonalization, we extend the definition of the reward so that it comprises one basic reward function and one extended reward function. Based on this idea, we derive two different ways to identify orthogonal directions, one for linear projection problems and one for kernel methods.
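One way to read the basic-plus-extended reward idea is as a penalised reward for subsequent components. The sketch below reuses the loop from the previous sketch with a reward that adds a penalty on overlap with a previously found direction m1; the penalty form, the weight lam, and the name second_direction are assumptions for illustration, not the chapter's derivation.

```python
import numpy as np

def second_direction(X, m1, n_iters=20000, eta=1e-3, beta=0.3, lam=10.0, seed=1):
    """Sketch: deflationary orthogonalisation via a redefined reward.

    The reward combines the basic term (squared projection) with an
    extended term punishing non-orthogonality to the direction m1 found
    earlier; both the penalty form and lam are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    m = rng.standard_normal(d)
    m /= np.linalg.norm(m)
    baseline = 0.0
    for _ in range(n_iters):
        x = X[rng.integers(n)]
        w = m + beta * rng.standard_normal(d)
        basic = (w @ x) ** 2 / (w @ w)                 # basic reward, as above
        extended = -lam * (w @ m1) ** 2 / (w @ w)      # extended reward: orthogonality penalty
        r = basic + extended
        baseline += 0.01 * (r - baseline)
        m += eta * (r - baseline) * (w - m) / beta**2
        m /= np.linalg.norm(m)
    return m

# Usage (continuing the example above): m2 should approximate the second
# principal direction, roughly orthogonal to m.
m2 = second_direction(X, m1=m)
```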

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Barbakh, W.A., Wu, Y., Fyfe, C. (2009). Reinforcement Learning of Projections. In: Non-Standard Parameter Adaptation for Exploratory Data Analysis. Studies in Computational Intelligence, vol 249. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04005-4_8

  • DOI: https://doi.org/10.1007/978-3-642-04005-4_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-04004-7

  • Online ISBN: 978-3-642-04005-4

  • eBook Packages: Engineering (R0)
