Restricted Deep Belief Networks for Multi-view Learning

  • Yoonseop Kang
  • Seungjin Choi
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6912)


A deep belief network (DBN) is a probabilistic generative model with multiple layers of hidden nodes and a layer of visible nodes, where the parameterization between layers obeys harmoniums, also known as restricted Boltzmann machines (RBMs). In this paper we present the restricted deep belief network (RDBN) for multi-view learning, in which each layer of hidden nodes is composed of view-specific and shared hidden nodes, so that individual and shared hidden spaces are learned from multiple views of data. View-specific hidden nodes connect only to the corresponding view-specific hidden nodes in the lower layer, or to the visible nodes of that view, whereas shared hidden nodes follow unrestricted inter-layer connections as in a standard DBN. The RDBN is trained by layer-wise contrastive divergence learning. Numerical experiments on synthetic and real-world datasets demonstrate the useful behavior of the RDBN compared to the multi-wing harmonium (MWH), a two-layer undirected model.
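The abstract's restricted connectivity can be sketched as an RBM whose weight matrix is block-masked: view-specific hidden units see only their own view's visible units, while shared hidden units see both views. Below is a minimal NumPy sketch of one such layer trained with CD-1, assuming binary units and two views; the dimensions, variable names, and toy data are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical dimensions: two views with view-specific and shared hidden units.
D1, D2 = 6, 4          # visible units in view 1 and view 2
H1, HS, H2 = 3, 2, 3   # view-1-specific, shared, view-2-specific hidden units
D, H = D1 + D2, H1 + HS + H2

# Connectivity mask encoding the RDBN restriction: view-specific hidden
# units connect only to their own view; shared hidden units connect to both.
mask = np.zeros((D, H))
mask[:D1, :H1] = 1.0           # view-1 visibles -> view-1-specific hiddens
mask[D1:, H1 + HS:] = 1.0      # view-2 visibles -> view-2-specific hiddens
mask[:, H1:H1 + HS] = 1.0      # both views      -> shared hiddens

W = 0.01 * rng.standard_normal((D, H)) * mask
b_v = np.zeros(D)
b_h = np.zeros(H)

def cd1_step(v0, lr=0.1):
    """One contrastive divergence (CD-1) update on the masked RBM."""
    global W, b_v, b_h
    ph0 = sigmoid(v0 @ W + b_h)                      # positive phase
    h0 = (rng.random(ph0.shape) < ph0).astype(float)  # sample hiddens
    pv1 = sigmoid(h0 @ W.T + b_v)                    # one-step reconstruction
    ph1 = sigmoid(pv1 @ W + b_h)                     # negative phase
    n = len(v0)
    # The mask keeps the forbidden cross-view blocks of W at exactly zero.
    W += lr * ((v0.T @ ph0 - pv1.T @ ph1) / n) * mask
    b_v += lr * (v0 - pv1).mean(axis=0)
    b_h += lr * (ph0 - ph1).mean(axis=0)
    return np.mean((v0 - pv1) ** 2)

# Toy binary data: each row is the concatenation [view 1 | view 2].
data = (rng.random((100, D)) < 0.5).astype(float)
for _ in range(50):
    err = cd1_step(data)
```

Stacking such layers greedily, with the shared hidden units of one layer feeding unrestricted connections to the next, would give the layer-wise training scheme the abstract describes.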




References

  1. Bengio, Y.: Learning deep architectures for AI. Foundations and Trends® in Machine Learning 2(1), 1–127 (2009)
  2. Blum, A., Mitchell, T.: Combining labeled and unlabeled data with co-training. In: Proceedings of the Annual Conference on Learning Theory (COLT), Madison, Wisconsin (1998)
  3. Chen, N., Zhu, J., Xing, E.P.: Predictive subspace learning for multi-view data: A large margin approach. In: Advances in Neural Information Processing Systems (NIPS), vol. 23. MIT Press, Cambridge (2010)
  4. Choi, H., Choi, S., Choe, Y.: Manifold integration with Markov random walk. In: Proceedings of the AAAI National Conference on Artificial Intelligence (AAAI), Chicago, IL (2008)
  5. Hinton, G.E.: Training products of experts by minimizing contrastive divergence. Neural Computation 14(8), 1771–1800 (2002)
  6. Hinton, G.E., Osindero, S., Teh, Y.W.: A fast learning algorithm for deep belief nets. Neural Computation 18(7), 1527–1554 (2006)
  7. Hinton, G.E., Salakhutdinov, R.: Reducing the dimensionality of data with neural networks. Science 313(5786), 504–507 (2006)
  8. Hotelling, H.: Relations between two sets of variates. Biometrika 28, 312–377 (1936)
  9. Lawrence, N.: Gaussian process latent variable models for visualisation of high dimensional data. In: Advances in Neural Information Processing Systems (NIPS), vol. 16. MIT Press, Cambridge (2004)
  10. LeCun, Y., Huang, F.J., Bottou, L.: Learning methods for generic object recognition with invariance to pose and lighting. In: Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), Washington, DC (2004)
  11. Lee, H., Choi, S.: Group nonnegative matrix factorization for EEG classification. In: Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS), Clearwater Beach, Florida (2009)
  12. Nair, V., Hinton, G.E.: Rectified linear units improve restricted Boltzmann machines. In: Proceedings of the International Conference on Machine Learning (ICML), Haifa, Israel (2010)
  13. Salzmann, M., Ek, C.H., Urtasun, R., Darrell, T.: Factorized orthogonal latent spaces. In: Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS), Sardinia, Italy (2010)
  14. Shon, A.P., Grochow, K., Hertzmann, A., Rao, R.P.N.: Learning shared latent structure for image synthesis and robotic imitation. In: Advances in Neural Information Processing Systems (NIPS), vol. 17. MIT Press, Cambridge (2006)
  15. Sinha, P., Jain, R.: Classification and annotation of digital photos using optical context data. In: Proceedings of the ACM International Conference on Image and Video Retrieval (CIVR), Niagara Falls, Canada (2008)
  16. Smolensky, P.: Information processing in dynamical systems: Foundations of harmony theory. In: Rumelhart, D.E., McClelland, J.L. (eds.) Parallel Distributed Processing: Explorations in the Microstructures of Cognition: Foundations, vol. 1, pp. 194–281. MIT Press, Cambridge (1986)
  17. Welling, M., Rosen-Zvi, M., Hinton, G.: Exponential family harmoniums with an application to information retrieval. In: Advances in Neural Information Processing Systems (NIPS), vol. 17. MIT Press, Cambridge (2005)
  18. Xing, E.P., Yan, R., Hauptmann, A.G.: Mining associated text and images with dual-wing harmonium. In: Proceedings of the Annual Conference on Uncertainty in Artificial Intelligence (UAI), Edinburgh, UK (2005)
  19. Yang, J., Liu, Y., Xing, E.P., Hauptmann, A.G.: Harmonium models for semantic video representation and classification. In: Proceedings of the SIAM International Conference on Data Mining (SDM), Minneapolis, MN (2007)

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Yoonseop Kang (1)
  • Seungjin Choi (1, 2)
  1. Department of Computer Science, Pohang University of Science and Technology, Pohang, Korea
  2. Division of IT Convergence Engineering, Pohang University of Science and Technology, Pohang, Korea
