Immediate Reward Reinforcement Learning for Clustering and Topology Preserving Mappings

Part of the Lecture Notes in Computer Science book series (LNCS, volume 5400)


We extend a reinforcement learning algorithm that has previously been shown to cluster data. Our extension creates an underlying latent space with a pre-defined structure, which enables us to create a topology preserving mapping. We investigate different forms of the reward function, all designed to merge local and global information and thus avoid one of the major difficulties of methods such as K-means: convergence to local optima that depend on the initial values of the parameters. We also show that the method is quite general and can be used with the recently developed method of stochastic weight reinforcement learning [14].
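The sensitivity of K-means to its initial parameter values can be illustrated with a minimal sketch. It is not taken from the paper: the function name `lloyd_kmeans`, the four-point data set and both initialisations are assumptions chosen so that the effect is easy to check by hand. The code runs the standard Lloyd iteration from two starting points and compares the resulting sums of squared errors.

```python
def lloyd_kmeans(points, centres, max_iter=100):
    """Plain Lloyd's K-means; returns (centres, sum of squared errors)."""
    centres = [tuple(c) for c in centres]
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster of its nearest centre.
        clusters = [[] for _ in centres]
        for p in points:
            d2 = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centres]
            clusters[d2.index(min(d2))].append(p)
        # Update step: each centre moves to the mean of its cluster
        # (an empty cluster keeps its previous centre).
        new_centres = [
            tuple(sum(coord) / len(cl) for coord in zip(*cl)) if cl else c
            for cl, c in zip(clusters, centres)
        ]
        if new_centres == centres:  # converged: no centre moved
            break
        centres = new_centres
    sse = sum(
        min(sum((a - b) ** 2 for a, b in zip(p, c)) for c in centres)
        for p in points
    )
    return centres, sse


# Hypothetical toy data: the corners of a 4 x 1 rectangle, K = 2.
points = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]

# Initialisation A: one centre near each short side (left / right split).
good_centres, good_sse = lloyd_kmeans(points, [(0.0, 0.5), (4.0, 0.5)])

# Initialisation B: both centres on the vertical mid-line (bottom / top split).
bad_centres, bad_sse = lloyd_kmeans(points, [(2.0, 0.0), (2.0, 1.0)])

print(good_sse, bad_sse)  # 1.0 16.0
```

Both runs converge immediately, because each initialisation is already a fixed point of the iteration, yet the second solution has sixteen times the error of the first. Purely local updates cannot escape it, which is the difficulty that reward functions combining local and global information are intended to address.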


Latent Space; Reinforcement Learning; Reward Function; Exploratory Data Analysis; Reinforcement Learning Algorithm
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.




References

  1. Barbakh, W.: Local versus Global Interactions in Clustering Algorithms. Ph.D. thesis, School of Computing, University of the West of Scotland (2008)
  2. Barbakh, W., Fyfe, C.: Clustering with reinforcement learning. In: Yin, H., Tino, P., Corchado, E., Byrne, W., Yao, X. (eds.) IDEAL 2007. LNCS, vol. 4881, pp. 507–516. Springer, Heidelberg (2007)
  3. Bishop, C.M., Svensen, M., Williams, C.K.I.: GTM: The generative topographic mapping. Neural Computation (1997)
  4. Friedman, J.H.: Exploratory projection pursuit. Journal of the American Statistical Association 82(397), 249–266 (1987)
  5. Friedman, J.H., Tukey, J.W.: A projection pursuit algorithm for exploratory data analysis. IEEE Transactions on Computers C-23(9), 881–889 (1974)
  6. Fyfe, C.: A scale invariant feature map. Network: Computation in Neural Systems 7, 269–275 (1996)
  7. Fyfe, C.: A comparative study of two neural methods of exploratory projection pursuit. Neural Networks 10(2), 257–262 (1997)
  8. Fyfe, C.: Two topographic maps for data visualization. Data Mining and Knowledge Discovery 14, 207–224 (2007)
  9. Intrator, N.: Feature extraction using an unsupervised neural network. Neural Computation 4(1), 98–107 (1992)
  10. Jones, M.C., Sibson, R.: What is projection pursuit? Journal of the Royal Statistical Society, 1–37 (1987)
  11. Kaelbling, L.P., Littman, M.L., Moore, A.W.: Reinforcement learning: A survey. Journal of Artificial Intelligence Research 4, 237–285 (1996)
  12. Kohonen, T.: Self-Organising Maps. Springer, Heidelberg (1995)
  13. Likas, A.: A reinforcement learning approach to on-line clustering. Neural Computation (2000)
  14. Ma, X., Likharev, K.K.: Global reinforcement learning in neural networks with stochastic synapses. IEEE Transactions on Neural Networks 18(2), 573–577 (2007)
  15. Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction. MIT Press, Cambridge (1998)
  16. Williams, R.: Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine Learning 8, 229–256 (1992)
  17. Williams, R.J., Peng, J.: Function optimization using connectionist reinforcement learning networks. Connection Science 3, 241–268 (1991)
  18. Zhang, B.: Generalized k-harmonic means – boosting in unsupervised learning. Technical report, HP Laboratories, Palo Alto (October 2000)
  19. Zhang, B., Hsu, M., Dayal, U.: K-harmonic means – a data clustering algorithm. Technical report, HP Laboratories, Palo Alto (October 1999)

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  1. Applied Computational Intelligence Research Unit, The University of the West of Scotland, Scotland
