Abstract
Background
The use of complex artificial intelligence (AI) models demands expensive computational resources. While currently available high-performance computing environments can support such complexity, deploying AI models on mobile devices, which is an increasing trend, remains challenging. Environments with low computational resources constrain the design decisions made during the AI-enabled software engineering lifecycle, which must balance the trade-off between the accuracy and the complexity of mobile applications.
Objective
Our objective is to systematically assess the trade-off between accuracy and complexity when deploying complex AI models (e.g. neural networks) to mobile devices in pursuit of greener AI solutions. We aim to cover (i) the impact of the design decisions on the achievement of high-accuracy and low resource-consumption implementations; and (ii) the validation of profiling tools for systematically promoting greener AI.
Method
We implement neural networks in mobile applications to solve multiple image and text classification problems on a variety of benchmark datasets. We then profile and model the accuracy, storage weight, and time of CPU usage of the AI-enabled applications in operation with respect to their design decisions. Finally, we provide an open-source data repository following the EMSE open science practices and containing all the experimentation, analysis, and reports in our study.
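The profiling step described above can be illustrated with a minimal sketch. It is not the study's tooling: it runs on a desktop Python interpreter rather than on a mobile device, and `profile_classifier` and `toy_predict` are hypothetical names introduced only for illustration. It measures accuracy and process CPU time for one pass over an evaluation set.

```python
import time
from typing import Callable, Sequence, Tuple


def profile_classifier(
    predict: Callable[[float], int],
    inputs: Sequence[float],
    labels: Sequence[int],
) -> Tuple[float, float]:
    """Return (accuracy, cpu_seconds) for one pass over the evaluation set."""
    # process_time() counts CPU time of this process, not wall-clock time.
    start = time.process_time()
    predictions = [predict(x) for x in inputs]
    cpu_seconds = time.process_time() - start

    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels), cpu_seconds


def toy_predict(x: float) -> int:
    """Hypothetical stand-in for a trained model: a fixed threshold rule."""
    return int(x > 0.5)


# Tiny made-up evaluation set, used only to exercise the function.
inputs = [0.1, 0.7, 0.4, 0.9]
labels = [0, 1, 1, 1]

accuracy, cpu_seconds = profile_classifier(toy_predict, inputs, labels)
print(f"accuracy={accuracy:.2f}, cpu_seconds={cpu_seconds:.6f}")
```

In a real setting, the stand-in predictor would be replaced by the deployed model's inference call, and the measurement would be repeated per design decision so the metrics can be compared.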
Results
We find that the number of parameters in the AI models makes the time of CPU usage scale exponentially in convolutional neural networks and logarithmically in fully-connected layers. We also see that the storage weight scales linearly with the number of parameters, while the accuracy does not. For this reason, we argue that a good practice for practitioners is to start small and increase the size of the AI models only when their accuracy is too low. We also find that Residual Networks (ResNets) and Transformers have a higher baseline cost in time of CPU usage than simple convolutional and recurrent neural networks. Finally, we find that the dataset used for experimentation affects both the scaling properties and the accuracy of the AI models, which shows that researchers must study the presented set of design decisions in each specific problem context.
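The linear relationship between storage weight and parameter count can be illustrated with a back-of-the-envelope sketch. It assumes a plain fully-connected network with weights stored as 32-bit floats (4 bytes per parameter) and ignores file-format overhead; both assumptions, the layer sizes, and the function names `dense_param_count` and `storage_bytes` are illustrative and not taken from the study.

```python
from typing import Sequence


def dense_param_count(layer_sizes: Sequence[int]) -> int:
    """Count weights and biases of a fully-connected network.

    layer_sizes lists the units per layer, input first, output last.
    """
    return sum(
        n_in * n_out + n_out  # weight matrix plus one bias per output unit
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


def storage_bytes(n_params: int, bytes_per_param: int = 4) -> int:
    """Estimate raw weight storage, assuming float32 parameters by default."""
    return n_params * bytes_per_param


# Two hypothetical design choices that differ only in hidden-layer width.
small = dense_param_count([784, 32, 10])
large = dense_param_count([784, 64, 10])

print(small, storage_bytes(small))
print(large, storage_bytes(large))
```

Under these assumptions, doubling the hidden layer roughly doubles both the parameter count and the estimated storage, whereas nothing in this calculation says the accuracy doubles, which is the motivation for starting with the smaller model.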
Conclusions
We have shown that a relationship exists between the design of AI models and the performance of the applications that integrate them, and we motivate further work and extensions to better characterize this complex relationship.
Data Availability
We provide a public repository that can be found at https://github.com/roger-creus/Which-Design-Decisions-in-AI-enabled-MobileApplications-Contribute-to-Greener-AI. The repository contains (i) the source code to train all the AI models in the study, which includes the training datasets; (ii) the source code of the AI-enabled mobile applications; (iii) the evaluation datasets used to profile the metrics in the study during operation; (iv) the profiled datasets containing the values of the profiled metrics (i.e. accuracy, time of CPU usage, and storage weight profiled during operation); and (v) the source code to carry out the statistical analysis of the profiled metrics.
Notes
Note that in this work, whenever we refer to AI models or AI components we mean neural networks (NNs), since these are the ones that we study, develop, and test in this paper.
Acknowledgements
This work has been partially supported by the GAISSA project (TED2021-130923B-I00, which is funded by MCIN/AEI/10.13039/501100011033 and by the European Union “NextGenerationEU”/PRTR); and, by the “Beatriz Galindo” Spanish Program (BEAGAL18/00064).
Ethics declarations
Conflicts of Interest
The authors declared that they have no conflict of interest.
Additional information
Communicated by: Terese Baldassarre and Mike Papadakis
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This article belongs to the Topical Collection: Special Issue on Registered Reports.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Castanyer, R.C., Martínez-Fernández, S. & Franch, X. Which design decisions in AI-enabled mobile applications contribute to greener AI?. Empir Software Eng 29, 2 (2024). https://doi.org/10.1007/s10664-023-10407-7
DOI: https://doi.org/10.1007/s10664-023-10407-7