Abstract
Artificial neural networks can be represented by paths. When these paths are generated as random walks on a dense network graph, we find that the resulting sparse networks allow for deterministic initialization and even weights with fixed sign. Such networks can be trained sparse from scratch, avoiding the expensive procedure of training a dense network and compressing it afterwards. Although the networks are sparse, their weights are accessed as contiguous blocks of memory. In addition, enumerating the paths using deterministic low discrepancy sequences, for example variants of the Sobol’ sequence, amounts to connecting the layers of neural units by progressive permutations, which naturally avoids bank conflicts in parallel computer hardware. We demonstrate that the artificial neural networks generated by low discrepancy sequences can achieve an accuracy within reach of their dense counterparts at a much lower computational complexity.
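To illustrate the idea of progressive permutations, the following minimal Python sketch (not taken from the paper; the layer widths `n_in`, `n_out` and the number of paths are illustrative assumptions) derives a sparse connectivity pattern between two layers from the base-2 radical inverse, i.e. the van der Corput sequence that forms the first dimension of the Sobol’ sequence. For a power-of-two layer width, enumerating the first `n_in` indices yields the bit-reversal permutation of the source units, so consecutive edges address distinct units.

```python
# Minimal sketch, not the authors' implementation: connect two layers by
# enumerating edges with the base-2 radical inverse (van der Corput sequence,
# the first dimension of the Sobol' sequence). Layer widths and the number of
# paths are illustrative assumptions.

def radical_inverse_base2(i: int, bits: int = 32) -> float:
    """Return the i-th van der Corput point in [0, 1) by reversing the bits of i."""
    rev = 0
    for _ in range(bits):
        rev = (rev << 1) | (i & 1)
        i >>= 1
    return rev / float(1 << bits)


def sparse_edges(n_in: int, n_out: int, n_paths: int):
    """Enumerate n_paths edges between a layer of n_in units and a layer of n_out units.

    The low discrepancy of the sequence spreads the source units evenly: for
    n_paths == n_in == 2**m, the sources form the bit-reversal permutation of
    0..n_in-1, so each source unit is hit exactly once (a progressive permutation).
    """
    return [
        (int(radical_inverse_base2(i) * n_in), i % n_out)  # (source unit, target unit)
        for i in range(n_paths)
    ]


if __name__ == "__main__":
    # With n_in = n_out = 8 and 8 paths, the source indices 0, 4, 2, 6, 1, 5, 3, 7
    # form the bit-reversal permutation of 0..7.
    print(sparse_edges(n_in=8, n_out=8, n_paths=8))
```

Extending the sketch to deeper networks would use further dimensions of the Sobol’ sequence for the subsequent layer transitions; the code above only illustrates a single pair of layers.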
Acknowledgements
The first author is very thankful to Cédric Villani for a discussion about the structure to be discovered in neural networks during the AI for Good Global Summit 2019 in Geneva. The authors would like to thank Jeff Pool, Nikolaus Binder, and David Luebke for profound discussions, and Noah Gamboa, who helped with early experiments on sparse artificial neural networks. This work has been partially funded by the Federal Ministry of Education and Research (BMBF, Germany) in the project Open Testbed Berlin - 5G and Beyond - OTB-5G+ (Förderkennzeichen 16KIS0980).
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Keller, A., Van keirsbilck, M. (2022). Artificial Neural Networks Generated by Low Discrepancy Sequences. In: Keller, A. (eds) Monte Carlo and Quasi-Monte Carlo Methods. MCQMC 2020. Springer Proceedings in Mathematics & Statistics, vol 387. Springer, Cham. https://doi.org/10.1007/978-3-030-98319-2_15
DOI: https://doi.org/10.1007/978-3-030-98319-2_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-98318-5
Online ISBN: 978-3-030-98319-2
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)