MobileACNet: ACNet-Based Lightweight Model for Image Classification

  • Conference paper
Image and Vision Computing (IVCNZ 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13836)


Abstract

Lightweight CNN models aim to extend the application of deep learning from conventional image classification to image classification on mobile edge devices. However, the accuracy of current lightweight CNN models still falls short of that of traditional large CNN models. To improve the accuracy of mobile-platform-based image classification, we propose MobileACNet, a novel ACNet-based lightweight model built on MobileNetV3 (a popular lightweight CNN for image classification on mobile platforms). Our model adopts a similar idea to ACNet: it adaptively combines global inference and local inference to improve classification accuracy. We improve MobileNetV3 by replacing its inverted residual block with our proposed adaptive inverted residual (AIR) module. Experimental results on three public datasets, i.e., CIFAR-100, Tiny ImageNet, and the large-scale ImageNet, show that MobileACNet effectively improves image classification accuracy for mobile-platform-based image classification by providing additional adaptive global inference.
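The abstract describes the AIR module as a MobileNetV3-style inverted residual block augmented with adaptive global inference in the spirit of ACNet. The full design is not reproduced on this page, so the following is only a minimal, hypothetical PyTorch sketch of how such a block might mix a local depthwise-convolution branch with a global branch via a learned gate; the class name AIRBlock, the gate parameter alpha, and the exact branch designs are assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn

class AIRBlock(nn.Module):
    """Hypothetical sketch of an adaptive inverted residual (AIR) block.

    Mixes a local branch (MobileNetV3-style inverted residual with a
    depthwise convolution) and a global branch (inference over globally
    pooled features, in the spirit of ACNet) via a learned adaptive gate.
    """

    def __init__(self, in_ch, out_ch, expand_ratio=4, stride=1):
        super().__init__()
        hidden = in_ch * expand_ratio
        # Local inference: expand -> depthwise conv -> project.
        self.local = nn.Sequential(
            nn.Conv2d(in_ch, hidden, 1, bias=False),
            nn.BatchNorm2d(hidden),
            nn.Hardswish(),
            nn.Conv2d(hidden, hidden, 3, stride=stride, padding=1,
                      groups=hidden, bias=False),  # depthwise
            nn.BatchNorm2d(hidden),
            nn.Hardswish(),
            nn.Conv2d(hidden, out_ch, 1, bias=False),
            nn.BatchNorm2d(out_ch),
        )
        # Global inference: squeeze spatial dims, transform channel-wise.
        self.global_branch = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(in_ch, out_ch, 1),
            nn.Hardswish(),
        )
        # Learned scalar gate deciding how much global inference to mix in.
        self.alpha = nn.Parameter(torch.zeros(1))
        self.use_residual = stride == 1 and in_ch == out_ch

    def forward(self, x):
        local = self.local(x)
        glob = self.global_branch(x)            # shape (N, out_ch, 1, 1)
        gate = torch.sigmoid(self.alpha)        # adaptive weight in (0, 1)
        out = (1 - gate) * local + gate * glob  # broadcast global over H x W
        if self.use_residual:
            out = out + x
        return out
```

For example, AIRBlock(16, 16) applied to a tensor of shape (1, 16, 32, 32) returns a tensor of the same shape, with the sigmoid gate learning per-block how much global context to blend with local features.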



Author information

Correspondence to Ming Zong or Feng Hou.


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Jiang, T., Zong, M., Ma, Y., Hou, F., Wang, R. (2023). MobileACNet: ACNet-Based Lightweight Model for Image Classification. In: Yan, W.Q., Nguyen, M., Stommel, M. (eds) Image and Vision Computing. IVCNZ 2022. Lecture Notes in Computer Science, vol 13836. Springer, Cham. https://doi.org/10.1007/978-3-031-25825-1_26

  • DOI: https://doi.org/10.1007/978-3-031-25825-1_26

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-25824-4

  • Online ISBN: 978-3-031-25825-1

  • eBook Packages: Computer Science, Computer Science (R0)
