Elastic restricted Boltzmann machines for cancer data analysis
Restricted Boltzmann machines (RBMs) have the universal power to model (binary) joint distributions. Meanwhile, thanks to their restricted network structure, training RBMs raises fewer difficulties with approximation and inference than training general Boltzmann machines. Yet little work has fully exploited the capacity of these models to analyze cancer data, e.g., cancer genomic, transcriptomic, proteomic and epigenomic data. Moreover, in cancer data analysis the number of features/predictors is usually much larger than the sample size, which is known as the “p ≫ N” problem and is ubiquitous in other bioinformatics and computational biology fields as well. The “p ≫ N” problem makes the bias-variance trade-off even more crucial when designing statistical learning methods. To date, however, few RBM models have been designed specifically to address this issue.
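For reference, the standard binary RBM referred to above assigns a joint probability to a visible vector $v \in \{0,1\}^p$ and a hidden vector $h \in \{0,1\}^m$ through an energy function; the bipartite (restricted) structure is what makes the conditionals factorize and inference tractable:

$$E(v,h) = -b^{\top}v - c^{\top}h - v^{\top}Wh, \qquad P(v,h) = \frac{1}{Z}\,e^{-E(v,h)},$$

$$P(h_j = 1 \mid v) = \sigma\Big(c_j + \sum_i W_{ij} v_i\Big), \qquad P(v_i = 1 \mid h) = \sigma\Big(b_i + \sum_j W_{ij} h_j\Big),$$

where $\sigma(x) = 1/(1+e^{-x})$ and $Z$ is the (intractable) partition function.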
We propose a novel RBM model, called elastic restricted Boltzmann machines (eRBMs), which incorporates an elastic regularization term into the likelihood function to balance model complexity and sensitivity. Building on the classic contrastive divergence (CD) algorithm, we develop the elastic contrastive divergence (eCD) algorithm, which trains eRBMs efficiently.
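To make the idea concrete, here is a minimal sketch of one CD-1 step for a binary RBM in which the weight update is penalized by an elastic-net term (a mix of L1 and L2 penalties). This is our own illustration of the general technique, not the authors' published eCD algorithm; the function name `ecd_update` and the hyperparameter names `lam` (overall penalty strength) and `alpha` (L1/L2 mixing ratio) are assumptions for the sake of the example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def ecd_update(W, b, c, v0, lr=0.01, lam=0.01, alpha=0.5, rng=None):
    """One CD-1 step with an elastic-net penalty on the weights (illustrative).

    W: (n_visible, n_hidden) weights; b, c: visible/hidden bias vectors.
    v0: (batch, n_visible) binary data batch.
    lam, alpha: penalty strength and L1/L2 mixing ratio (assumed names).
    """
    rng = rng or np.random.default_rng(0)
    # Positive phase: hidden probabilities and samples given the data.
    ph0 = sigmoid(v0 @ W + c)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    # Negative phase: one Gibbs step down to the visible layer and back up.
    pv1 = sigmoid(h0 @ W.T + b)
    v1 = (rng.random(pv1.shape) < pv1).astype(float)
    ph1 = sigmoid(v1 @ W + c)
    n = v0.shape[0]
    # CD-1 gradient estimate of the log-likelihood w.r.t. W.
    grad_W = (v0.T @ ph0 - v1.T @ ph1) / n
    # Elastic-net subgradient: alpha * L1 part + (1 - alpha) * L2 part.
    penalty = lam * (alpha * np.sign(W) + (1.0 - alpha) * W)
    W += lr * (grad_W - penalty)
    b += lr * (v0 - v1).mean(axis=0)
    c += lr * (ph0 - ph1).mean(axis=0)
    return W, b, c
```

Setting `alpha=1` recovers a pure L1 (sparsity-inducing) penalty and `alpha=0` a pure L2 (weight-decay) penalty; intermediate values trade feature selection against shrinkage, which is the usual motivation for elastic-net regularization in p ≫ N settings.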
We obtain several theoretical results on the rationality and properties of our model. We further evaluate its power on a challenging task: predicting dichotomized survival time from the molecular profiles of tumors. The test results show that the prediction performance of eRBMs is substantially better than that of state-of-the-art methods.
The proposed eRBMs can handle “p ≫ N” problems and offer better modeling performance than traditional methods. Our novel model is a promising method for future cancer data analysis.
Keywords: RBMs, regularization, cancer data analysis, survival time prediction
This work was supported in part by the National Basic Research Program of China (Nos. 2011CBA00300 and 2011CBA00301), the National Natural Science Foundation of China (Nos. 61033001, 61361136003 and 61472205), China’s Youth 1000-Talent Program, and the Beijing Advanced Innovation Center for Structural Biology.