Architecture evolution of convolutional neural network using monarch butterfly optimization

  • Original Research
Journal of Ambient Intelligence and Humanized Computing

Abstract

Designing suitable convolutional neural networks (CNNs) for different image data requires considerable human effort and expertise. In recent years, this process has been greatly accelerated by automatic architecture design methods. However, existing work rarely integrates the macro-architecture search space with the depth search space, which usually leads to suboptimal architectures. In addition, the adopted search strategy often needs to be specially customized to be compatible with the architecture encoding. This paper therefore proposes an automatic architecture design method based on monarch butterfly optimization (MBO). Specifically, an expressive architecture representation based on the Neural Function Unit (NFU) is designed, which integrates promising architectures from GoogLeNet, ResNet and DenseNet to facilitate the joint search of the macro-architecture and depth of CNNs. Furthermore, a direct architecture encoding is designed to take advantage of the fast-converging MBO, whose evolutionary operators involve no complex computations and continuously improve the architecture population by optimizing the encoding. Extensive experiments on eight benchmark image datasets demonstrate that our method consistently achieves competitive performance with much less time and computational overhead.
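
To illustrate why MBO-style operators are computationally cheap, the following is a minimal sketch of one generation of a generic MBO loop over a directly encoded population. It is not the paper's implementation. Several things are assumptions made only for illustration: each architecture is a fixed-length list of integer genes, fitness is maximized, the parameter values `P = 5/12` and `PERI = 1.2` are commonly used MBO defaults, the Lévy-flight perturbation of the adjusting operator is omitted, and all function names (`migrate`, `adjust`, `mbo_step`) are hypothetical. The population is assumed to hold at least two individuals.

```python
import math
import random

P = 5 / 12   # assumed share of the population living in Land 1
PERI = 1.2   # assumed migration period


def migrate(land1, land2, rng):
    """Migration operator: copy each gene from a random Land 1 or Land 2 individual."""
    children = []
    for _ in land1:
        child = []
        for k in range(len(land1[0])):
            # The scaled random number decides which land donates gene k.
            pool = land1 if rng.random() * PERI <= P else land2
            child.append(rng.choice(pool)[k])
        children.append(child)
    return children


def adjust(land2, best, rng):
    """Adjusting operator without the Levy-flight perturbation (simplification)."""
    children = []
    for _ in land2:
        child = []
        for k in range(len(best)):
            if rng.random() <= P:
                child.append(best[k])                # follow the current best
            else:
                child.append(rng.choice(land2)[k])   # follow a random Land 2 peer
        children.append(child)
    return children


def mbo_step(population, fitness, rng):
    """One generation: rank, split into two lands, apply both operators, keep the best."""
    ranked = sorted(population, key=fitness, reverse=True)
    n1 = math.ceil(P * len(ranked))
    land1, land2 = ranked[:n1], ranked[n1:]
    best = ranked[0]
    offspring = migrate(land1, land2, rng) + adjust(land2, best, rng)
    # Elitism: the worst offspring is replaced by the previous best individual.
    offspring.sort(key=fitness, reverse=True)
    offspring[-1] = list(best)
    return offspring
```

Both operators only copy genes between individuals, so the cost per generation is linear in the population size times the encoding length. In an architecture search setting, the expensive part would be the fitness evaluation (training and validating each decoded CNN), which this sketch leaves to the caller-supplied `fitness` function.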

Author information

Corresponding author

Correspondence to Gai-Ge Wang.

About this article

Cite this article

Wang, Y., Qiao, X. & Wang, GG. Architecture evolution of convolutional neural network using monarch butterfly optimization. J Ambient Intell Human Comput 14, 12257–12271 (2023). https://doi.org/10.1007/s12652-022-03766-4

  • DOI: https://doi.org/10.1007/s12652-022-03766-4
