Bio-inspired machine learning: programmed death and replication

Abstract

We analyze algorithmic and computational aspects of biological phenomena, such as replication and programmed death, in the context of machine learning. We use two different measures of neuron efficiency to develop machine learning algorithms for adding neurons to the system (i.e., a replication algorithm) and removing neurons from the system (i.e., a programmed death algorithm). We argue that the programmed death algorithm can be used for compression of neural networks and that the replication algorithm can be used for improving the performance of already trained neural networks. We also show that a combined algorithm of programmed death and replication can improve the learning efficiency of arbitrary machine learning systems. The computational advantages of the bio-inspired algorithms are demonstrated by training feedforward neural networks on the MNIST dataset of handwritten digits.

Data availability

The MNIST dataset [24] analyzed during the current study is available in the MNIST database, http://yann.lecun.com/exdb/mnist/.

References

  1. Galushkin AI (2007) Neural networks theory. Springer, Berlin, p 396

  2. Schmidhuber J (2015) Deep learning in neural networks: an overview. Neural Netw 61:85–117

  3. Haykin S (1999) Neural networks: a comprehensive foundation. Prentice Hall, Hoboken

  4. Vapnik VN (2000) The nature of statistical learning theory. Springer, New York

  5. Hopfield JJ (1982) Neural networks and physical systems with emergent collective computational abilities. Proc Natl Acad Sci USA 79(8):2554–2558

  6. Shwartz-Ziv R, Tishby N (2017) Opening the black box of deep neural networks via information. arXiv:1703.00810 [cs.LG]

  7. Roberts D, Yaida S, Hanin B (2022) The principles of deep learning theory: an effective theory approach to understanding neural networks. Cambridge University Press, Cambridge

  8. Vanchurin V (2021) Toward a theory of machine learning. Mach Learn: Sci Technol 2:035012

  9. Vanchurin V, Wolf YI, Katsnelson MI, Koonin EV (2022) Towards a theory of evolution as multilevel learning. Proc Natl Acad Sci USA 119:e2120037119

  10. Vanchurin V, Wolf YI, Koonin EV, Katsnelson MI (2022) Thermodynamics of evolution and the origin of life. Proc Natl Acad Sci USA 119:e2120042119

  11. Katsnelson MI, Vanchurin V (2021) Emergent quantumness in neural networks. Found Phys 51(5):1–20

  12. Katsnelson MI, Vanchurin V, Westerhout T (2021) Self-organized criticality in neural networks. arXiv:2107.03402

  13. Vanchurin V (2022) Towards a theory of quantum gravity from neural networks. Entropy 24:7

  14. Vanchurin V (2020) The world as a neural network. Entropy 22:1210

  15. Hassibi B, Stork DG (1992) Second order derivatives for network pruning: optimal brain surgeon. Adv Neural Inf Process Syst 5

  16. Medeiros CMS, Barreto GA (2013) A novel weight pruning method for MLP classifiers on the MAXCORE principle. Neural Comput Appl 22:71–84

  17. Thomas P, Suhner M-C (2015) A new multilayer perceptron pruning algorithm for classification and regression applications. Neural Process Lett 42(2):437–458

  18. Augasta MG, Kathirvalavakumar T (2011) A novel pruning algorithm for optimizing feedforward neural network of classification problems. Neural Process Lett 34:241–258

  19. Zeng X, Yeung DS (2006) Hidden neuron pruning of multilayer perceptrons using a quantified sensitivity measure. Neurocomputing 69:825–837

  20. Kwok TY, Yeung DY (1997) Constructive algorithms for structure learning in feedforward neural networks for regression problems. IEEE Trans Neural Netw 8(3):630–645

  21. Parekh R, Yang J, Honavar V (2000) Constructive neural-network learning algorithms for pattern classification. IEEE Trans Neural Netw 11(2):436–451

  22. Islam MM, Sattar MA, Amin MF, Yao X, Murase K (2009) A new constructive algorithm for architectural and functional adaptation of artificial neural networks. IEEE Trans Syst Man Cybern B Cybern 39(6):1590–1605

  23. Puma-Villanueva WJ, dos Santos EP, Von Zuben FJ (2012) A constructive algorithm to synthesize arbitrarily connected feedforward neural networks. Neurocomputing 75(1):14–32

  24. LeCun Y, Bottou L, Bengio Y, Haffner P (1998) Gradient-based learning applied to document recognition. Proc IEEE 86(11):2278–2324

Acknowledgements

V.V. was supported in part by the Foundational Questions Institute (FQXi) and the Oak Ridge Institute for Science and Education (ORISE).

Author information

Corresponding author

Correspondence to Andrey Grabovsky.

Ethics declarations

Conflict of interest

The authors have no competing interests to declare that are relevant to the content of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

A Appendix

Here, we list the pruning algorithms customized for a feedforward neural network with n neurons \(\{x_{1}(t),\ldots ,x_{n}(t)\}=\{x_{1},\ldots ,x_{n}\}\) in a hidden layer t such that

$$\begin{aligned} x_{i}=f\left(\sum _kw_{ik}^{t-1}x_{k}(t-1)+b_{i}^{t-1}\right),\quad x_{j}(t+1)=f\left(\sum _kw_{jk}^{t} x_{k}+b_{j}^{t}\right). \end{aligned}$$
(70)
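
The code fragments after each algorithm below use the following minimal NumPy sketch of the setup in Eq. (70). It is an illustration only: the sigmoid activation, the helper names (f, f_prime, layer_forward, layer_stats) and the convention that activations are stored as a batch matrix (rows are samples) are our assumptions, not the authors' implementation.

```python
import numpy as np

def f(z):
    # activation function; a sigmoid is assumed here purely for illustration
    return 1.0 / (1.0 + np.exp(-z))

def f_prime(z):
    # derivative of the sigmoid above
    s = f(z)
    return s * (1.0 - s)

def layer_forward(W, b, X_prev):
    """x_i = f(sum_k W_ik x_k(t-1) + b_i), applied to a batch X_prev (rows = samples)."""
    return f(X_prev @ W.T + b)

def layer_stats(X):
    """Batch means <x_k> and variances C_kk = <(Delta x_k)^2> of the activations X."""
    return X.mean(axis=0), X.var(axis=0)
```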

1.1 A.1 Connection cut algorithm

  1. Measure variances of neurons on level t and level \(t+1\)

    $$\begin{aligned} C^t_{kk}=\langle \Delta x_k(t)^2\rangle ,\quad C^{t+1}_{ii}=\langle \Delta x_i(t+1)^2\rangle . \end{aligned}$$
    (71)
  2. Find the neuron l with minimal efficiency (28)

    $$\begin{aligned} E_l=\min _k E_k =\min _k C^t_{kk} \sum _i \frac{(w_{ik}^t)^2}{C_{ii}^{t+1}}f' \left(\sum _jw_{ij}^{t} \langle x_{j}\rangle +b_{i}^{t} \right)^2. \end{aligned}$$
    (72)
  3. Use \(x_l=\langle x_l\rangle \) as the linear dependence equation (39) with

    $$\begin{aligned} a_{k\ne l}=0,\quad a_l=1,\quad a_0=\langle x_l\rangle . \end{aligned}$$
    (73)
  4. Remove neuron l from the net according to (41)

    $$\begin{aligned} \sum _kw_{jk}^{t}x_{k}+b_{j}^{t}&\simeq \sum _{k\ne l} w_{jk}^{t}x_{k}+{\tilde{b}}_{j}^{t},\quad {\tilde{b}}_{j}^{t}=w_{jl}^{t} \langle x_l\rangle +b_{j}^{t}. \end{aligned}$$
    (74)
  5. Repeat while there are neurons with efficiency below the cutoff, or while the accuracy or loss remains acceptable (a code sketch of one such step follows).
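
Below is a minimal sketch of one connection-cut step, assuming the helpers defined above and downstream weights W_t, b_t that map layer t to layer \(t+1\); the array names and the small eps regularizer are illustrative assumptions rather than the authors' code.

```python
def connection_cut_prune(W_t, b_t, X_t, X_t1, eps=1e-12):
    """One pruning step of Eqs. (71)-(74): drop the least efficient neuron of layer t.

    X_t, X_t1 : batches of activations on layers t and t+1.
    Returns the pruned weights, the adjusted biases and the removed index l.
    """
    mean_t, var_t = layer_stats(X_t)                   # <x_k>, C^t_kk      (Eq. 71)
    _, var_t1 = layer_stats(X_t1)                      # C^{t+1}_ii
    pre = W_t @ mean_t + b_t                           # sum_j w_ij <x_j> + b_i
    g = f_prime(pre) ** 2 / (var_t1 + eps)             # f'(...)^2 / C^{t+1}_ii
    E = var_t * ((W_t ** 2) * g[:, None]).sum(axis=0)  # efficiencies E_k   (Eq. 72)
    l = int(np.argmin(E))
    # Eq. (74): absorb w_jl <x_l> into the biases and drop column l
    b_new = b_t + W_t[:, l] * mean_t[l]
    W_new = np.delete(W_t, l, axis=1)
    return W_new, b_new, l
```

In practice the step would be applied repeatedly, re-measuring the activation statistics after each removal, until the cutoff or accuracy condition of step 5 is no longer met.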

1.2 A.2 Probability algorithm

  1. Measure variances of neurons on level t and level \(t+1\)

    $$\begin{aligned} C^t_{kk}=\langle \Delta x_k(t)^2\rangle ,\quad C^{t+1}_{ii}=\langle \Delta x_i(t+1)^2\rangle . \end{aligned}$$
    (75)
  2. Find the neuron l with minimal efficiency (28)

    $$\begin{aligned} E_l=\min _k E_k =\min _k C^t_{kk} \sum _i \frac{(w_{ik}^t)^2}{C_{ii}^{t+1}}f' \left(\sum _jw_{ij}^{t} \langle x_{j}\rangle +b_{i}^{t} \right)^2. \end{aligned}$$
    (76)
  3. Use the linear dependence equation (39), \(\sum _j a_j x_j=a_0\), with

    $$\begin{aligned} a_j=\sum _i \frac{w_{il}^tw_{ij}^t}{C_{ii}^{t+1}}f' \left(\sum _kw_{ik}^{t} \langle x_{k}\rangle +b_{i}^{t} \right)^2,\quad a_0=\sum _ja_j\langle x_j\rangle . \end{aligned}$$
    (77)
  4. Remove neuron l from the net according to (41)

    $$\begin{aligned} \sum _kw_{jk}^{t}x_{k}+b_{j}^{t}&\simeq \sum _{k\ne l} {\tilde{w}}_{jk}^{t}x_{k}+{\tilde{b}}_{j}^{t}, \end{aligned}$$
    (78)

    where

    $$\begin{aligned} {\tilde{w}}_{jk}^{t}=w_{jk}^{t}-w_{jl}^{t}\frac{a_{k}}{a_{l}},\quad \tilde{b}_{j}^{t}=w_{jl}^{t}\frac{ a_0}{a_{l}}+b_{j}^{t}. \end{aligned}$$
    (79)
  5. Repeat while there are neurons with efficiency below the cutoff, or while the accuracy or loss remains acceptable (see the sketch below).
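
Below is a sketch of one step of the probability algorithm under the same assumptions; the only difference from the connection cut is that the removed neuron is redistributed over the remaining weights through the coefficients of Eq. (77).

```python
def probability_prune(W_t, b_t, X_t, X_t1, eps=1e-12):
    """One pruning step of Eqs. (75)-(79): drop the least efficient neuron of layer t."""
    mean_t, var_t = layer_stats(X_t)                   # <x_k>, C^t_kk      (Eq. 75)
    _, var_t1 = layer_stats(X_t1)                      # C^{t+1}_ii
    pre = W_t @ mean_t + b_t
    g = f_prime(pre) ** 2 / (var_t1 + eps)             # f'(...)^2 / C^{t+1}_ii
    E = var_t * ((W_t ** 2) * g[:, None]).sum(axis=0)  # efficiencies E_k   (Eq. 76)
    l = int(np.argmin(E))
    a = (W_t * W_t[:, [l]] * g[:, None]).sum(axis=0)   # coefficients a_j   (Eq. 77)
    a0 = a @ mean_t                                    # a_0
    # Eqs. (78)-(79): fold the linear dependence into the remaining weights and biases
    W_new = W_t - np.outer(W_t[:, l], a / a[l])
    b_new = b_t + W_t[:, l] * (a0 / a[l])
    return np.delete(W_new, l, axis=1), b_new, l
```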

1.3 A.3 Covariance algorithm

  1. Measure the covariance matrix \(C^t_{kj}=\langle \Delta x_k\Delta x_j\rangle \) (10) of the neurons on hidden layer t and find its eigenvectors \(\textbf{v}\) and eigenvalues \(\lambda \) (11).

  2. Find the neuron l and the eigenvalue \(\lambda _p\) with minimal efficiency (17)

    $$\begin{aligned} E'_l = \min _k E'_k = \min _{i,k} \frac{ \lambda _i }{\left( \textbf{v}^{(i)}_k\right) ^2}=\frac{ \lambda _p }{\left( \textbf{v}^{(p)}_l\right) ^2}. \end{aligned}$$
    (80)
  3. Use \(\sum _{k} \textbf{v}^{(p)}_k x_{k}=\lambda _{p}\) as the linear dependence equation (39) with

    $$\begin{aligned} a_k=\textbf{v}^{(p)}_k,\quad a_0=\lambda _{p}. \end{aligned}$$
    (81)
  4. Remove neuron l from the net according to (41)

    $$\begin{aligned} \sum _kw_{jk}^{t}x_{k}+b_{j}^{t}&\simeq \sum _{k\ne l} {\tilde{w}}_{jk}^{t}x_{k}+{\tilde{b}}_{j}^{t}, \end{aligned}$$
    (82)

    where

    $$\begin{aligned} {\tilde{w}}_{jk}^{t}=w_{jk}^{t}-w_{jl}^{t}\frac{a_{k}}{a_{l}},\quad \tilde{b}_{j}^{t}=w_{jl}^{t}\frac{ a_0}{a_{l}}+b_{j}^{t}. \end{aligned}$$
    (83)
  5. Repeat while there are neurons with efficiency below the cutoff, or while the accuracy or loss remains acceptable (see the sketch below).
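
Below is a sketch of one step of the covariance algorithm, again under the assumptions stated above; it needs only the activations of layer t and the downstream weights, and follows Eqs. (80)-(83) as written.

```python
def covariance_prune(W_t, b_t, X_t, eps=1e-12):
    """One pruning step of Eqs. (80)-(83): drop the least efficient neuron of layer t."""
    C = np.cov(X_t, rowvar=False)                      # C^t_kj = <Delta x_k Delta x_j>
    lam, V = np.linalg.eigh(C)                         # eigenvalues and eigenvectors (columns)
    # E'_{i,k} = lambda_i / (v^(i)_k)^2; take the global minimum   (Eq. 80)
    Eprime = lam[None, :] / (V ** 2 + eps)             # entry [k, i]
    k_idx, p_idx = np.unravel_index(np.argmin(Eprime), Eprime.shape)
    l, p = int(k_idx), int(p_idx)
    a = V[:, p]                                        # a_k = v^(p)_k      (Eq. 81)
    a0 = lam[p]                                        # a_0 = lambda_p
    # Eqs. (82)-(83): fold the linear dependence into the remaining weights and biases
    W_new = W_t - np.outer(W_t[:, l], a / a[l])
    b_new = b_t + W_t[:, l] * (a0 / a[l])
    return np.delete(W_new, l, axis=1), b_new, l
```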

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Grabovsky, A., Vanchurin, V. Bio-inspired machine learning: programmed death and replication. Neural Comput & Applic 35, 20273–20298 (2023). https://doi.org/10.1007/s00521-023-08806-4
