Fully hyperbolic convolutional neural networks

  • Research
  • Published in: Research in the Mathematical Sciences

Abstract

Convolutional neural networks (CNNs) have recently seen tremendous success in various computer vision tasks. However, their application to problems with high-dimensional input and output, such as high-resolution image and video segmentation or 3D medical imaging, has been limited by various factors. Primarily, in the training stage, it is necessary to store network activations for back-propagation. In these settings, the memory required to store activations can exceed what is feasible with current hardware, especially for problems in 3D. Motivated by the propagation of signals over physical networks, which is governed by the hyperbolic telegraph equation, we introduce a fully conservative hyperbolic network for problems with high-dimensional input and output. We introduce a coarsening operation that enables completely reversible CNNs by using a learnable discrete wavelet transform and its inverse both to coarsen and interpolate the network state and to change the number of channels. We show that fully reversible networks achieve results comparable to the state of the art in 4D time-lapse hyperspectral image segmentation and full 3D video segmentation, with a much lower memory footprint that is constant, independent of network depth. We also extend such networks to variational auto-encoders, where optimization begins from an exact recovery and the level of compression is discovered through optimization.
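The reversibility idea in the abstract can be illustrated compactly. A leapfrog discretization of hyperbolic (telegraph-equation-like) dynamics, y_{j+1} = 2y_j − y_{j−1} + h²f(y_j), can be undone exactly, so activations need not be stored during training; likewise, an orthogonal wavelet split coarsens a signal without losing information. The following is a minimal NumPy sketch under these assumptions, not the authors' implementation (see the GitHub link below): the residual function `f`, the fixed Haar transform standing in for the paper's learnable wavelet, and all function names are illustrative.

```python
import numpy as np

def f(y, K):
    """Illustrative residual function; the paper's layers differ."""
    return -K.T @ np.tanh(K @ y)

def leapfrog_forward(y_prev, y_curr, Ks, h=0.1):
    """Hyperbolic (leapfrog) steps: y_{j+1} = 2 y_j - y_{j-1} + h^2 f(y_j).
    Only the final two states are kept; intermediates are discarded."""
    for K in Ks:
        y_prev, y_curr = y_curr, 2 * y_curr - y_prev + h * h * f(y_curr, K)
    return y_prev, y_curr

def leapfrog_backward(y_prev, y_curr, Ks, h=0.1):
    """Exact inverse of the forward pass: y_{j-1} = 2 y_j - y_{j+1} + h^2 f(y_j)."""
    for K in reversed(Ks):
        y_prev, y_curr = 2 * y_prev - y_curr + h * h * f(y_prev, K), y_prev
    return y_prev, y_curr

def haar_coarsen(x):
    """Orthogonal Haar split: an invertible coarsening of a 1D signal."""
    lo = (x[0::2] + x[1::2]) / np.sqrt(2.0)
    hi = (x[0::2] - x[1::2]) / np.sqrt(2.0)
    return lo, hi

def haar_interpolate(lo, hi):
    """Exact inverse of haar_coarsen."""
    x = np.empty(2 * lo.size)
    x[0::2] = (lo + hi) / np.sqrt(2.0)
    x[1::2] = (lo - hi) / np.sqrt(2.0)
    return x
```

Running the backward pass on the output of the forward pass recovers the input states to machine precision; this exact invertibility is what allows activations to be recomputed rather than stored, giving a memory cost independent of depth.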

Availability of data and materials

All data used in this manuscript have been cited and are publicly available. The hyperspectral example used in Sect. 5.1.1 is publicly available at [22]. The bear video used in Sect. 5.1.2 is publicly available at [47]. The MNIST dataset used in Sect. 5.2.1 is publicly available at [34]. The CelebA dataset used in Sect. 5.2.2 is publicly available at [42].

Code availability

An open source implementation of the proposed network is available on Github: https://github.com/klensink/HyperNet.

References

  1. Ardizzone, L., Lüth, C., Kruse, J., Rother, C., Köthe, U.: Guided image generation with conditional invertible neural networks (2019)

  2. Avendi, M., Kheradvar, A., Jafarkhani, H.: A combined deep-learning and deformable-model approach to fully automatic segmentation of the left ventricle in cardiac MRI. Med. Image Anal. 30, 108–119 (2016)

  3. Bengio, Y.: Learning deep architectures for AI. Found. Trends® Mach. Learn. 2(1), 1–127 (2009)

  4. Brock, A., Donahue, J., Simonyan, K.: Large scale GAN training for high fidelity natural image synthesis (2018)

  5. Bruna, J., Mallat, S.: Invariant scattering convolution networks. https://doi.org/10.48550/ARXIV.1203.1513. arXiv:1203.1513 (2012)

  6. Caelles, S., Pumarola, A., Moreno-Noguer, F., Sanfeliu, A., Van Gool, L.: Fast video object segmentation with spatio-temporal GANs. arXiv preprint arXiv:1903.12161 (2019)

  7. Chang, B., Meng, L., Haber, E., Ruthotto, L., Begert, D., Holtham, E.: Reversible architectures for arbitrarily deep residual neural networks. In: AAAI Conference on AI (2018)

  8. Chen, T.Q., Rubanova, Y., Bettencourt, J., Duvenaud, D.K.: Neural ordinary differential equations. CoRR arXiv:1806.07366 (2018)

  9. Çiçek, Ö., Abdulkadir, A., Lienkamp, S.S., Brox, T., Ronneberger, O.: 3D U-Net: learning dense volumetric segmentation from sparse annotation. In: Ourselin, S., Joskowicz, L., Sabuncu, M.R., Unal, G., Wells, W. (eds.) Medical Image Computing and Computer-Assisted Intervention—MICCAI 2016, pp. 424–432. Springer International Publishing, Cham (2016)

  10. Çiçek, Ö., Abdulkadir, A., Lienkamp, S.S., Brox, T., Ronneberger, O.: 3D U-Net: learning dense volumetric segmentation from sparse annotation. CoRR arXiv:1606.06650 (2016)

  11. Dai, B., Seljak, U.: Sliced iterative normalizing flows. https://doi.org/10.48550/ARXIV.2007.00674. arXiv:2007.00674 (2020)

  12. Daubechies, I.: Orthonormal bases of compactly supported wavelets. Commun. Pure Appl. Math. 41(3), 909–996 (1988). https://doi.org/10.1002/cpa.3160410705

  13. Dieng, A.B., Ruiz, F.J.R., Blei, D.M., Titsias, M.K.: Prescribed generative adversarial networks. https://doi.org/10.48550/ARXIV.1910.04302. arXiv:1910.04302 (2019)

  14. Dinh, L., Sohl-Dickstein, J., Bengio, S.: Density estimation using real NVP. CoRR arXiv:1605.08803 (2016)

  15. Etmann, C., Ke, R., Schönlieb, C.B.: iUNets: fully invertible U-Nets with learnable up- and downsampling (2020)

  16. Fujieda, S., Takayama, K., Hachisuka, T.: Wavelet convolutional neural networks for texture classification. arXiv preprint arXiv:1707.07394 (2017)

  17. Gholami, A., Keutzer, K., Biros, G.: ANODE: unconditionally accurate memory-efficient gradients for neural ODEs (2019)

  18. Gomez, A.N., Ren, M., Urtasun, R., Grosse, R.B.: The reversible residual network: backpropagation without storing activations. In: Advances in Neural Information Processing Systems, pp. 2211–2221 (2017)

  19. Goodfellow, I., Bengio, Y., Courville, A.: Deep learning. MIT Press, Cambridge (2016)

  20. Hammernik, K., Klatzer, T., Kobler, E., Recht, M.P., Sodickson, D.K., Pock, T., Knoll, F.: Learning a variational network for reconstruction of accelerated MRI data. Magn. Reson. Med. 79(6), 3055–3071 (2017). https://doi.org/10.1002/mrm.26977

  21. Hanin, B.: Universal function approximation by deep neural nets with bounded width and ReLU activations. arXiv preprint arXiv:1708.02691v3 (2017)

  22. Hasanlou, M., Seydi, S.T.: Hyperspectral change detection: an experimental comparative study. Int. J. Remote Sens. 39(20), 7029–7083 (2018). https://doi.org/10.1080/01431161.2018.1466079

  23. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)

  24. He, M., Li, B., Chen, H.: Multi-scale 3D deep convolutional neural network for hyperspectral image classification. In: 2017 IEEE International Conference on Image Processing (ICIP), pp. 3904–3908 (2017). https://doi.org/10.1109/ICIP.2017.8297014

  25. Heusel, M., Ramsauer, H., Unterthiner, T., Nessler, B., Hochreiter, S.: GANs trained by a two time-scale update rule converge to a local Nash equilibrium. https://doi.org/10.48550/ARXIV.1706.08500. arXiv:1706.08500 (2017)

  26. Hou, R., Chen, C., Shah, M.: An end-to-end 3D convolutional neural network for action detection and segmentation in videos. arXiv preprint arXiv:1712.01111 (2017)

  27. Jacobsen, J., Smeulders, A.W.M., Oyallon, E.: i-RevNet: deep invertible networks. CoRR arXiv:1802.07088 (2018)

  28. Karras, T., Aila, T., Laine, S., Lehtinen, J.: Progressive growing of GANs for improved quality, stability, and variation (2017)

  29. Karras, T., Laine, S., Aila, T.: A style-based generator architecture for generative adversarial networks (2018)

  30. Kingma, D.P., Welling, M.: Auto-encoding variational Bayes (2013)

  31. Kolouri, S., Pope, P.E., Martin, C.E., Rohde, G.K.: Sliced-Wasserstein autoencoder: an embarrassingly simple generative model. https://doi.org/10.48550/ARXIV.1804.01947. arXiv:1804.01947 (2018)

  32. Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems, pp. 1097–1105 (2012)

  33. LeCun, Y., Bengio, Y., Hinton, G.: Deep learning. Nature 521(7553), 436–444 (2015)

  34. Lecun, Y., Cortes, C.: The MNIST database of handwritten digits. http://yann.lecun.com/exdb/mnist/

  35. LeCun, Y., Cortes, C.: MNIST handwritten digit database. http://yann.lecun.com/exdb/mnist/ (2010)

  36. Lee, H., Kwon, H.: Contextual deep CNN based hyperspectral classification. In: 2016 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), pp. 3322–3325 (2016). https://doi.org/10.1109/IGARSS.2016.7729859

  37. Li, J., Liang, B., Wang, Y.: A hybrid neural network for hyperspectral image classification. Remote Sens. Lett. 11(1), 96–105 (2020). https://doi.org/10.1080/2150704X.2019.1686780

  38. Li, Y., Zhang, H., Shen, Q.: Spectral-spatial classification of hyperspectral imagery with 3D convolutional neural network. Remote Sens. 9(1), 67 (2017). https://doi.org/10.3390/rs9010067

  39. Lin, Z., Khetan, A., Fanti, G., Oh, S.: PacGAN: the power of two samples in generative adversarial networks. https://doi.org/10.48550/ARXIV.1712.04086. arXiv:1712.04086 (2017)

  40. Liu, P., Zhang, H., Zhang, K., Lin, L., Zuo, W.: Multi-level wavelet-CNN for image restoration. In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pp. 886–88609 (2018). https://doi.org/10.1109/CVPRW.2018.00121

  41. Liu, S., Zhong, G., De Mello, S., Gu, J., Jampani, V., Yang, M.H., Kautz, J.: Switchable temporal propagation network. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) Computer Vision—ECCV 2018, pp. 89–104. Springer International Publishing, Cham (2018)

  42. Liu, Z., Luo, P., Wang, X., Tang, X.: Deep learning face attributes in the wild. In: Proceedings of International Conference on Computer Vision (ICCV) (2015)

  43. Luo, W., Li, Y., Urtasun, R., Zemel, R.: Understanding the effective receptive field in deep convolutional neural networks. In: Lee, D.D., Sugiyama, M., Luxburg, U.V., Guyon, I., Garnett, R. (eds.) Advances in Neural Information Processing Systems 29, pp. 4898–4906. Curran Associates Inc., Red Hook (2016)

  44. Mallat, S.: Group invariant scattering. https://doi.org/10.48550/ARXIV.1101.2286. arXiv:1101.2286 (2011)

  45. Oh, S.W., Lee, J., Sunkavalli, K., Kim, S.J.: Fast video object segmentation by reference-guided mask propagation. In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 7376–7385. https://doi.org/10.1109/CVPR.2018.00770 (2018)

  46. Oyallon, E., Belilovsky, E., Zagoruyko, S.: Scaling the scattering transform: deep hybrid networks. https://doi.org/10.48550/ARXIV.1703.08961. arXiv:1703.08961 (2017)

  47. Perazzi, F., Pont-Tuset, J., McWilliams, B., Gool, L.V., Gross, M., Sorkine-Hornung, A.: A benchmark dataset and evaluation methodology for video object segmentation. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 724–732 (2016). https://doi.org/10.1109/CVPR.2016.85

  48. Peters, B., Granek, J., Haber, E.: Multiresolution neural networks for tracking seismic horizons from few training images. Interpretation 7(3), SE201–SE213 (2019). https://doi.org/10.1190/INT-2018-0225.1

  49. Press, W., Teukolsky, S., Vetterling, W., Flannery, B.: Numerical Recipes in C: The Art of Scientific Computing, 2nd edn. Cambridge University Press, Cambridge (1992)

  50. Radford, A., Metz, L., Chintala, S.: Unsupervised representation learning with deep convolutional generative adversarial networks. https://doi.org/10.48550/ARXIV.1511.06434. arXiv:1511.06434 (2015)

  51. Ronneberger, O., Fischer, P., Brox, T.: U-Net: convolutional networks for biomedical image segmentation. CoRR arXiv:1505.04597 (2015)

  52. Ruthotto, L., Haber, E.: Deep neural networks motivated by partial differential equations. arXiv preprint arXiv:1804.04272 (2018)

  53. Salimans, T., Goodfellow, I.J., Zaremba, W., Cheung, V., Radford, A., Chen, X.: Improved techniques for training GANs. CoRR arXiv:1606.03498 (2016)

  54. Seitzer, M.: pytorch-fid: FID Score for PyTorch. https://github.com/mseitzer/pytorch-fid (2020). Version 0.2.1

  55. Shah, S., Ghosh, P., Davis, L.S., Goldstein, T.: Stacked U-Nets: a no-frills approach to natural image segmentation. CoRR arXiv:1804.10343 (2018)

  56. Shelhamer, E., Long, J., Darrell, T.: Fully convolutional networks for semantic segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 39(4), 640–651 (2017). https://doi.org/10.1109/TPAMI.2016.2572683

  57. Srivastava, A., Valkov, L., Russell, C., Gutmann, M.U., Sutton, C.: VEEGAN: reducing mode collapse in GANs using implicit variational learning. https://doi.org/10.48550/ARXIV.1705.07761. arXiv:1705.07761 (2017)

  58. Székely, G.J., Rizzo, M.L.: Testing for Equal Distributions in High Dimensions. InterStat, Durban (2004)

  59. Tao, X., Gao, H., Wang, Y., Shen, X., Wang, J., Jia, J.: Scale-recurrent network for deep image deblurring. CoRR arXiv:1802.01770 (2018)

  60. Tran, D., Bourdev, L., Fergus, R., Torresani, L., Paluri, M.: Deep end2end voxel2voxel prediction. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pp. 402–409 (2016). https://doi.org/10.1109/CVPRW.2016.57

  61. Truchetet, F., Laligant, O.: Wavelets in industrial applications: a review. In: Wavelet Applications in Industrial Processing II, vol. 5607 (2004). https://doi.org/10.1117/12.580395

  62. Xu, Q., Xiao, Y., Wang, D., Luo, B.: CSA-MSO3DCNN: multiscale octave 3D CNN with channel and spatial attention for hyperspectral image classification. Remote Sens. 12(1), 188 (2020). https://doi.org/10.3390/rs12010188

  63. Xue, Z.: A general generative adversarial capsule network for hyperspectral image spectral–spatial classification. Remote Sens. Lett. 11(1), 19–28 (2020). https://doi.org/10.1080/2150704X.2019.1681598

  64. Yu, J.J., Derpanis, K.G., Brubaker, M.A.: Wavelet flow: fast training of high resolution normalizing flows (2020)

  65. Zhao, H., Shi, J., Qi, X., Wang, X., Jia, J.: Pyramid scene parsing network. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2017). https://doi.org/10.1109/cvpr.2017.660

  66. Zhou, Y., Luo, Z.: A Crank–Nicolson collocation spectral method for the two-dimensional telegraph equations. J. Inequal. Appl. 2018, 137 (2018). https://doi.org/10.1186/s13660-018-1728-5

Acknowledgements

K.L. and E.H. acknowledge the support of the Natural Sciences and Engineering Research Council of Canada (NSERC).

Author information

Corresponding author

Correspondence to Keegan Lensink.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Lensink, K., Peters, B. & Haber, E. Fully hyperbolic convolutional neural networks. Res Math Sci 9, 60 (2022). https://doi.org/10.1007/s40687-022-00343-1
