
Quantized convolutional neural networks through the lens of partial differential equations

Research, published in Research in the Mathematical Sciences


Quantization of convolutional neural networks (CNNs) is a common approach to ease the computational burden involved in the deployment of CNNs, especially on low-resource edge devices. However, fixed-point arithmetic is not natural to the kind of computation involved in neural networks. In this work, we explore ways to improve quantized CNNs using a PDE-based perspective and analysis. First, we harness the total variation (TV) approach to apply edge-aware smoothing to the feature maps throughout the network. This aims to reduce outliers in the distribution of values and to promote piecewise constant maps, which are more suitable for quantization. Second, we consider symmetric and stable variants of common CNNs for image classification and of graph convolutional networks for graph node classification. We demonstrate through several experiments that the property of forward stability preserves the action of a network under different quantization rates. As a result, stable quantized networks behave similarly to their non-quantized counterparts even though they rely on fewer parameters. We also find that, at times, stability even aids in improving accuracy. These properties are of particular interest for sensitive, resource-constrained, low-power or real-time applications such as autonomous driving.
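To make the idea of edge-aware TV smoothing concrete, the following is a minimal NumPy sketch of one explicit gradient-descent step on a smoothed total variation energy, applied to a single 2D feature map. It is an illustration under stated assumptions, not the authors' implementation: the function names `forward_grad`, `smoothed_tv` and `tv_smoothing_step`, the step size, the smoothing parameter `eps` and the boundary handling are all hypothetical choices.

```python
import numpy as np


def forward_grad(x):
    """Forward differences of a 2D map; the last column/row of each is zero."""
    gx = np.diff(x, axis=1, append=x[:, -1:])
    gy = np.diff(x, axis=0, append=x[-1:, :])
    return gx, gy


def smoothed_tv(x, eps=0.1):
    """Smoothed isotropic TV energy: sum of sqrt(|grad x|^2 + eps^2)."""
    gx, gy = forward_grad(x)
    return np.sum(np.sqrt(gx**2 + gy**2 + eps**2))


def tv_smoothing_step(x, step=0.02, eps=0.1):
    """One explicit descent step on the smoothed TV energy (hypothetical helper).

    Pixels across a strong edge get a normalized flux of magnitude close to 1,
    while small fluctuations are diffused at a rate of about 1/eps. This is
    what makes the smoothing edge-aware.
    """
    gx, gy = forward_grad(x)
    mag = np.sqrt(gx**2 + gy**2 + eps**2)
    px, py = gx / mag, gy / mag
    # Divergence as backward differences (negative adjoint of forward_grad).
    div = np.diff(px, axis=1, prepend=0.0) + np.diff(py, axis=0, prepend=0.0)
    return x + step * div


# Usage: a flat feature map with a single outlier value.
feature_map = np.zeros((8, 8))
feature_map[4, 4] = 10.0
smoothed = tv_smoothing_step(feature_map)
print(feature_map.max(), smoothed.max())
```

In this sketch the step size is kept below roughly `eps / 4`, which is the range in which an explicit step of this kind can be expected to decrease the smoothed energy. The divergence is built so that the mean of the map is left unchanged, and only the spread of values is reduced.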


Data availability

Data sharing is not applicable to this article as all the data sets that are used during the current study are publicly available online. The code for reproducing the results is available at


  1. We assume that the ReLU activation function is applied between any two convolution operators, so that the resulting activation maps are non-negative and can be quantized using an unsigned scheme. If a different activation function is used whose output can be negative, such as \(\tanh\), signed quantization should be used instead.
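The distinction between unsigned and signed schemes can be sketched with a generic uniform quantizer. This is an illustrative example only: the function `quantize_uniform`, its `clip` parameter and the symmetric signed grid are assumptions made here, not the specific quantizer used in the article.

```python
import numpy as np


def quantize_uniform(x, bits, signed, clip):
    """Uniform quantization sketch: clip, round to an integer grid, rescale.

    Hypothetical helper. `clip` is the largest magnitude that is represented.
    """
    if signed:
        # Symmetric grid, e.g. -7..7 for 4 bits.
        qmax = 2 ** (bits - 1) - 1
        qmin = -qmax
    else:
        # Non-negative grid, e.g. 0..15 for 4 bits.
        qmax = 2 ** bits - 1
        qmin = 0
    scale = clip / qmax
    q = np.clip(np.round(x / scale), qmin, qmax)
    return q * scale


pre_activation = np.linspace(-1.0, 1.0, 9)

# ReLU output is non-negative, so all unsigned levels are usable.
relu_out = np.maximum(pre_activation, 0.0)
print(quantize_uniform(relu_out, bits=4, signed=False, clip=1.0))

# tanh output takes both signs; an unsigned grid would zero the negative part.
tanh_out = np.tanh(pre_activation)
print(quantize_uniform(tanh_out, bits=4, signed=True, clip=1.0))
```

For the same bit width, the unsigned grid spends every level on the range [0, clip], which is why it is the natural choice after ReLU, whereas the signed grid splits its levels between positive and negative values.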




Author information


Corresponding author

Correspondence to Ido Ben-Yair.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Additional information


The research reported in this paper was supported by the Israel Innovation Authority through the Avatar consortium, and by Grant No. 2018209 from the United States-Israel Binational Science Foundation (BSF), Jerusalem, Israel. ME is supported by the Kreitman High-Tech scholarship. The authors thank the Lynn and William Frankel Center for Computer Science at BGU.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Ben-Yair, I., Ben Shalom, G., Eliasof, M. et al. Quantized convolutional neural networks through the lens of partial differential equations. Res Math Sci 9, 58 (2022).

