Exponential Discretization of Weights of Neural Network Connections in Pre-Trained Neural Network. Part II: Correlation Maximization

Abstract

In this article, we develop a method of linear and exponential quantization of neural network weights. We improve it by maximizing the correlation between the initial and quantized weights, taking into account the weight density distribution in each layer. We perform the quantization after the neural network has been trained, without any subsequent retraining, and compare our algorithm with linear and exponential quantization. The quality of the VGG-16 neural network is already satisfactory (top-5 accuracy of 76%) with 3-bit exponential quantization. With 4-bit quantization, the ResNet50 and Xception neural networks reach top-5 accuracies of 79% and 61%, respectively.
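
To illustrate the two baseline schemes named in the abstract, the sketch below builds a linear and an exponential grid of quantization levels and measures the correlation between the original and the quantized weights. The grid definitions (symmetric range, ratio 0.5 between neighboring exponential levels), the nearest-level rounding, the Laplace-distributed synthetic weights, and all function names are assumptions made for illustration only; the definitions used in the paper may differ.

```python
import numpy as np

def linear_levels(w_max, bits):
    # Assumed grid: 2**bits levels evenly spaced over [-w_max, w_max]
    return np.linspace(-w_max, w_max, 2 ** bits)

def exponential_levels(w_max, bits, ratio=0.5):
    # Assumed grid: 2**(bits-1) magnitudes w_max * ratio**k, mirrored around zero
    k = np.arange(2 ** (bits - 1))
    mags = w_max * ratio ** k
    return np.sort(np.concatenate([-mags, mags]))

def quantize(w, levels):
    # Replace every weight by the nearest level
    idx = np.abs(w[:, None] - levels[None, :]).argmin(axis=1)
    return levels[idx]

def correlation(w, q):
    # Pearson correlation between original and quantized weights
    return np.corrcoef(w, q)[0, 1]

rng = np.random.default_rng(0)
w = rng.laplace(scale=0.05, size=10_000)  # synthetic stand-in for one layer
w_max = np.abs(w).max()

for bits in (2, 3, 4):
    lin = quantize(w, linear_levels(w_max, bits))
    exp = quantize(w, exponential_levels(w_max, bits))
    print(bits, correlation(w, lin), correlation(w, exp))
```

The printed correlations depend on the synthetic distribution and on the assumed grids, so they should not be compared with the accuracies reported above.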


CONFLICT OF INTEREST

The authors declare that they have no conflicts of interest.

Funding

The work was financially supported by the Russian Foundation for Basic Research (no. 19-29-03030).

Author information

Corresponding authors

Correspondence to M. M. Pushkareva or I. M. Karandashev.

APPENDIX A

# imports required by the listing
import numpy as np
from scipy.optimize import minimize

# accessory functions

def f(x, func, kde, X, x_min, x_max):
    # objective value: the covariance-type measure returned by func
    y, px, cov, p = func(x, kde, X, x_min, x_max)
    return cov

def grad(x, func, kde, X, x_min, x_max, alpha=10):
    # ascent step for the cell boundaries x, scaled by alpha
    y, px, cov, p = func(x, kde, X, x_min, x_max)
    step = alpha * px * (y[1:] - y[:-1]) * (y[1:] + y[:-1] - 2 * x) / 2
    return step

def cov_kde(x0, kde, X, x_min, x_max):
    '''
    calculate the distribution function, quantized values and covariance on the set x0
    X - weights in this layer,
    x_min and x_max - minimal and maximal weight values in the layer,
    x0 - current set of cell boundaries (variable values only)
    '''
    p = np.zeros(len(x0) + 1)  # number of weights in each cell
    C = np.zeros(len(x0) + 1)  # sum of the weights in each cell
    x_ext = np.sort(np.append(x0, [x_min, x_max]))  # boundaries plus both end points
    for i in range(len(x_ext) - 1):
        # note: a weight exactly equal to x_min falls into no cell
        mask = np.logical_and(x_ext[i] < X, X <= x_ext[i + 1])
        p[i] = np.count_nonzero(mask)
        C[i] = np.sum(X[mask])
        if p[i] == 0:
            # empty cell: avoid division by zero below
            C[i] = 0
            p[i] = 1
    y = C / p  # quantized value of a cell = mean of its weights
    px = kde.evaluate(x0)  # estimated weight density at the boundaries
    cov = np.linalg.norm(C / np.sqrt(p))  # / sigma_kde
    return y, px, cov, p

def results(kde, w, x0, x_min, x_max, func, bits, kde_std, ans_case='CG'):
    '''
    correlation maximization procedure for the initial set x0 (variable values only),
    w - layer weights,
    kde - kernel density estimate built on a random sample of the weights,
    x_min and x_max - minimal and maximal weight values
    (kde_std and ans_case are not used in this listing)
    '''
    n_d = 2 ** bits
    alpha = 10
    tol_curr = 1e-4
    # minimize() searches for a minimum, so the objective and its gradient are negated
    fx = lambda x: -f(x, func, kde, w, x_min, x_max)
    gradx = lambda x: -grad(x, func, kde, w, x_min, x_max, alpha)
    ans = minimize(fun=fx, x0=x0, jac=gradx, method='CG', tol=tol_curr)
    solutions = ans['x']
    correlations = -ans['fun']
    gradients = np.linalg.norm(gradx(ans['x'])) / alpha / n_d
    return solutions, correlations, gradients
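
One way to read the quantity cov computed in the listing, norm(C / sqrt(p)), is as an unnormalized correlation. If every weight is replaced by the mean of its cell and the layer weights have zero mean (an assumption the listing does not state), then dividing by norm(X) gives the Pearson correlation between the original and the quantized weights. The sketch below checks this numerically. The helper cell_statistics, the synthetic Gaussian weights, and the quantile-based boundaries are hypothetical choices for illustration and are not part of the original code.

```python
import numpy as np

def cell_statistics(X, boundaries, x_min, x_max):
    """Counts p and sums C of the weights X in the cells cut by the boundaries."""
    edges = np.sort(np.concatenate(([x_min], boundaries, [x_max])))
    # right=True puts x into cell i when edges[i] < x <= edges[i + 1];
    # unlike the appendix loop, a weight equal to x_min lands in cell 0
    idx = np.digitize(X, edges[1:-1], right=True)
    n_cells = len(edges) - 1
    p = np.bincount(idx, minlength=n_cells).astype(float)
    C = np.bincount(idx, weights=X, minlength=n_cells)
    return idx, p, C

rng = np.random.default_rng(0)
X = rng.normal(scale=0.05, size=20_000)  # synthetic stand-in for layer weights
X -= X.mean()                            # the identity below needs zero mean

# arbitrary boundaries; quantiles guarantee that no cell is empty
boundaries = np.quantile(X, [0.1, 0.3, 0.5, 0.7, 0.9])
idx, p, C = cell_statistics(X, boundaries, X.min(), X.max())

Q = (C / p)[idx]  # each weight replaced by the mean of its cell
pearson = np.corrcoef(X, Q)[0, 1]
from_cov = np.linalg.norm(C / np.sqrt(p)) / np.linalg.norm(X)
print(pearson, from_cov)  # the two values agree up to rounding error
```

Under these assumptions, maximizing cov over the cell boundaries is equivalent to maximizing the correlation, since norm(X) does not depend on the boundaries.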

About this article

Cite this article

Pushkareva, M.M., Karandashev, I.M. Exponential Discretization of Weights of Neural Network Connections in Pre-Trained Neural Network. Part II: Correlation Maximization. Opt. Mem. Neural Networks 29, 179–186 (2020). https://doi.org/10.3103/S1060992X20030042

Keywords:

  • weight quantization
  • correlation maximization
  • exponential quantization
  • neural network
  • neural network compression
  • reduction of bit depth of weights