A new variant of restricted Boltzmann machine with horizontal connections

Shi, Guang; Zhang, Jiangshe; Ji, NanNan; Wang, ChangPeng

doi:10.1007/s00521-018-3460-y

Original Article
Published: 16 April 2018

Volume 31, pages 6521–6533 (2019)

Neural Computing and Applications

Guang Shi¹, Jiangshe Zhang¹, NanNan Ji² & ChangPeng Wang³

334 Accesses
1 Citation

Abstract

Restricted Boltzmann machines (RBMs) have been successfully employed to construct deep architectures because of their expressive power and because inference in them is tractable and easy. In this paper, we propose a model named the self-connected restricted Boltzmann machine (SCRBM), which adds horizontal connections to the hidden layer to enable direct information transfer between hidden units. We present a simple and effective training method based on the greedy layer-wise procedure used in deep learning algorithms. Under this algorithm, SCRBM has a three-layer architecture: the first hidden layer extracts features from the data, and the second hidden layer is used to simulate various interactions between units in that layer. Specifically, to simulate the lateral inhibition that exists in sensory systems, a log sparse term is introduced into the second hidden layer of SCRBM. Our experiments show that the features learned by our algorithm are more vivid and clean than those learned by the basic RBM and SparseRBM. Further experiments show that SCRBM outperforms the basic RBM and SparseRBM in terms of accuracy on several widely used datasets.
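
The greedy layer-wise idea mentioned in the abstract can be illustrated with a generic sketch: one RBM is trained on the data, and a second RBM is then trained on the first one's hidden activations. This is not the SCRBM algorithm itself. The log sparse term and the exact handling of the horizontal connections are not specified in this abstract and are omitted here. The `RBM` class, the one-step contrastive divergence (CD-1) update, the layer sizes, the learning rate and the toy data are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    """Plain binary RBM trained with one-step contrastive divergence (CD-1).

    Hypothetical helper for illustration; not the authors' SCRBM.
    """

    def __init__(self, n_visible, n_hidden):
        self.W = 0.01 * rng.standard_normal((n_visible, n_hidden))
        self.b = np.zeros(n_visible)  # visible biases
        self.c = np.zeros(n_hidden)   # hidden biases

    def hidden_probs(self, v):
        return sigmoid(v @ self.W + self.c)

    def visible_probs(self, h):
        return sigmoid(h @ self.W.T + self.b)

    def cd1_step(self, v0, lr=0.1):
        # Positive phase: hidden probabilities and a binary sample given the data.
        ph0 = self.hidden_probs(v0)
        h0 = (rng.random(ph0.shape) < ph0).astype(float)
        # Negative phase: one reconstruction step.
        pv1 = self.visible_probs(h0)
        ph1 = self.hidden_probs(pv1)
        n = v0.shape[0]
        self.W += lr * (v0.T @ ph0 - pv1.T @ ph1) / n
        self.b += lr * (v0 - pv1).mean(axis=0)
        self.c += lr * (ph0 - ph1).mean(axis=0)

def train(rbm, data, epochs=20):
    for _ in range(epochs):
        rbm.cd1_step(data)

# Toy binary data (assumed): 64 samples, 12 visible units.
data = (rng.random((64, 12)) < 0.3).astype(float)

# Stage 1: the first hidden layer learns features from the data.
rbm1 = RBM(n_visible=12, n_hidden=8)
train(rbm1, data)
h1 = rbm1.hidden_probs(data)

# Stage 2: a second RBM is trained greedily on the first layer's activations.
rbm2 = RBM(n_visible=8, n_hidden=8)
train(rbm2, h1)
h2 = rbm2.hidden_probs(h1)
```

In this sketch the second stage sees only the activations of the first stage, which is what makes the procedure greedy: each layer is trained with the layers below it held fixed.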

Notes

The search regions of \(\alpha _0\), \(\lambda\) and \(\alpha _1\) are \([10^{-4}, 10^{-2}]\), \([10^{-3}, 1]\) and \([10^{-3}, 1]\), respectively.
We choose six values of \(\lambda\) from \((10^{-3},1)\), evenly spaced on a log scale, and reduce the number of epochs in the pre-training and fine-tuning stages to 10 in the experiment.
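
As an illustration, a log-spaced grid of six values over the stated range could be built as follows. The use of NumPy and the inclusion of both endpoints are assumptions, since the note gives an open interval and does not name any tooling.

```python
import numpy as np

# Six candidate values for lambda, evenly spaced on a log10 scale
# between 1e-3 and 1 (both endpoints included here by assumption).
lambdas = np.logspace(-3, 0, num=6)
print(lambdas)
```

Consecutive values in such a grid differ by a constant factor rather than a constant step, which covers several orders of magnitude with few candidates.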

Acknowledgements

This work is supported by the National Basic Research Program of China (973 Program, No. 2013CB329404), the National Natural Science Foundation of China (Nos. 61572393, 11501049, 11671317, 11131006) and the Special Program for Applied Research on Super Computation of the NSFC-Guangdong Joint Fund (the second phase).

Author information

Authors and Affiliations

School of Mathematics and Statistics, Xi’an Jiaotong University, Xi’an, 710049, People’s Republic of China
Guang Shi & Jiangshe Zhang
School of Science, Chang’an University, Xi’an, 710064, People’s Republic of China
NanNan Ji
School of Mathematics and Information Science, Chang’an University, Xi’an, 710064, People’s Republic of China
ChangPeng Wang

Corresponding author

Correspondence to Jiangshe Zhang.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Human and animal rights

This article does not contain any studies with human participants or animals performed by any of the authors.

About this article

Cite this article

Shi, G., Zhang, J., Ji, N. et al. A new variant of restricted Boltzmann machine with horizontal connections. Neural Comput & Applic 31, 6521–6533 (2019). https://doi.org/10.1007/s00521-018-3460-y

Received: 06 September 2017
Accepted: 23 March 2018
Published: 16 April 2018
Issue Date: October 2019
DOI: https://doi.org/10.1007/s00521-018-3460-y
