
GreenHub: a large-scale collaborative dataset to battery consumption analysis of android devices



Efforts to improve battery life in Android smartphones, and the energy efficiency of the apps running on them, are hindered by diversity. There are more than 24k Android smartphone models in the world, multiple active operating system versions, and a myriad of application usage profiles.


In such a high-diversity scenario, profiling for energy has only limited applicability. One would need to obtain information about energy use in real usage scenarios to make informed, effective decisions about energy optimization. The goal of our work is to understand how Android usage, apps, operating systems, hardware, and user habits influence battery lifespan.


We leverage crowdsourcing to collect information about energy use in real-world usage scenarios. The data is collected by a mobile app, which we developed and made available to the public through the Google Play store, and is periodically uploaded to a centralized server. It is then made publicly available to researchers, app developers, and smartphone manufacturers through multiple channels (SQL, REST API, and zipped CSV/Parquet dumps).


This paper presents a broad analysis of how several smartphone characteristics, such as model, brand, network, settings, applications, and even country, influence the battery charge/discharge rate. The analysis was performed over the crowdsourced data. Our findings include which applications tend to be running when battery consumption is highest and whether users from different countries exhibit the same battery usage; we also showcase methods to help developers find and improve energy-inefficient processes.

The dataset we considered is sizable: it comprises 23+ million (anonymous) data samples stemming from a large number of installations of the mobile app, and 700+ million data points pertaining to processes running on these devices. It is also diverse, covering 1.6k+ device brands, 11.8k+ smartphone models, and more than 50 Android versions. We have been using this dataset to perform multiple analyses. For example, we studied which apps most commonly run on these smartphones and related the presence of those apps in memory to the battery discharge rate of the devices. We have also used the dataset in teaching, with students practicing data analysis and machine learning techniques to relate energy consumption and charging rates to many other hardware and software attributes and user behaviors.
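As an illustration of the kind of analysis described above, relating the apps present in memory to the battery discharge rate, the following is a minimal sketch and not the pipeline used in the paper. The record layout (device identifier, timestamp, battery level in percent, set of apps in memory), the sample values, and the function names are all assumptions made for this example; they are not the dataset's actual schema.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical samples (not real GreenHub data):
# (device_id, unix_time_s, battery_level_percent, apps_in_memory)
samples = [
    ("d1", 0, 90, {"app.a", "app.b"}),
    ("d1", 600, 88, {"app.a"}),
    ("d1", 1200, 87, {"app.a"}),
    ("d2", 0, 50, {"app.b"}),
    ("d2", 900, 44, {"app.b"}),
]


def discharge_rates(samples):
    """Yield (apps, percent_per_hour) for each pair of consecutive samples
    of the same device during which the battery level dropped."""
    by_device = defaultdict(list)
    for device, t, level, apps in samples:
        by_device[device].append((t, level, apps))
    for rows in by_device.values():
        rows.sort(key=lambda r: r[0])  # order each device's samples by time
        for (t0, l0, apps0), (t1, l1, _) in zip(rows, rows[1:]):
            hours = (t1 - t0) / 3600
            if hours > 0 and l1 < l0:  # keep discharging intervals only
                yield apps0, (l0 - l1) / hours


def mean_rate_by_app(samples):
    """Average discharge rate over the intervals that start with each app in memory."""
    rates = defaultdict(list)
    for apps, rate in discharge_rates(samples):
        for app in apps:
            rates[app].append(rate)
    return {app: mean(values) for app, values in rates.items()}


if __name__ == "__main__":
    for app, rate in sorted(mean_rate_by_app(samples).items()):
        print(f"{app}: {rate:.1f} %/h")
```

Such a figure only captures co-occurrence: an app that is in memory while the battery drains quickly is not necessarily the cause of the drain.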


The dataset we considered can support studies with a wide range of research goals, whether or not they concern energy efficiency. It opens the opportunity to inform and reshape user habits, and even to influence the development of both hardware (manufacturers) and software (developers) for mobile devices. Our analysis also yields results that run counter to the common perception of what impacts battery consumption in real-world usage, while exposing new, varied, complex, and promising research avenues.










  7. The remainder accounts for battery changing by 2% or more






  13. The dataset does not include any data collected during development and testing of the BatteryHub application




  17. The dataset includes samples collected prior to that date, but they correspond to the infrastructure testing period.





  22. In fact, as we will discuss in detail in Section 6, this number includes installations of a clone of BatteryHub which, at least up to a certain version, contributed data to the dataset.

  23. Google app:

  24. Facebook app:

  25. Messenger app:

  26. AndroidRank:

  27. Instagram app:

  28. Our GreenHub Pipeline notebooks are detailed in Section 7. The notebooks can be found at




  32. When such components have a low data throughput, the device commonly puts them automatically into a low-usage or even idle state, limiting their capacity but saving resources.

  33. Can be one of the other 3, but the Android API could not identify which one







  40. Battery Double:

  41. Fabric is a usage and crashlytics reporter:

  42. DownloadAPK:

  43. APKPure:

  44. AppBrain:

  45. Jadx:

  46. APKtool:

  47. Number of samples required to obtain a confidence level of 95% and a confidence interval of ± 3, considering each dataset as the population.

  48. Vegan:

  49. Dataset-Converter Tool:

  50. Pandas:

  51. Project Jupyter:

  52. Apache Parquet:
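Footnote 47 refers to a sample size for a 95% confidence level and a ± 3 confidence interval, with each dataset treated as the population. The source does not say which formula was used. The sketch below assumes the common Cochran formula with a finite population correction, reads ± 3 as 3 percentage points, and takes the most conservative proportion p = 0.5; the function name and its parameters are illustrative.

```python
import math
from statistics import NormalDist


def required_sample_size(population, confidence=0.95, margin=0.03, p=0.5):
    """Cochran's sample size with finite population correction.

    Assumptions (not stated in the source): p = 0.5 as the worst case,
    and `margin` expressed as a proportion (0.03 means 3 percentage points).
    """
    # Two-sided z score for the requested confidence level (about 1.96 for 95%).
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Sample size for an infinite population.
    n0 = (z ** 2) * p * (1 - p) / margin ** 2
    # Correct for the finite population and round up to a whole sample.
    return math.ceil(n0 / (1 + (n0 - 1) / population))


if __name__ == "__main__":
    for size in (10_000, 1_000_000, 23_000_000):
        print(size, required_sample_size(size))
```

Under these assumptions the required sample size flattens out at a little over a thousand samples as the population grows, so even subsets with millions of records need only a small random sample for this level of precision.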




This research and work is funded: by national funds through the FCT - Foundation for Science and Technology, I.P., within the scope of the project CISUC - UID/CEC/00326/2020, project UIDB/50014/2020, and by European Social Fund, through the Regional Operational Program Centro 2020; by operation Centro-01-0145-FEDER-000019 - C4 - Centro de Competências em Cloud Computing, co-financed by the European Regional Development Fund (ERDF) through the Programa Operacional Regional do Centro (Centro 2020), in the scope of the Sistema de Apoio à Investigação Científica e Tecnológica - Programas Integrados de IC&DT; by CNPq/Brazil (304755/2014-1, 406308/2016-0), FACEPE/Brazil (APQ-0839-1.03/14), and INES 2.0, FACEPE grants PRONEX APQ 0388-1.03/14 and APQ-0399-1.03/17, and CNPq grant 465614/2014-0; by NOVA LINCS (UIDB/04516/2020) with the financial support of FCT - Foundation for Science and Technology; The first author was financed by post-doc grant reference C4_SMDS_L1-1_D and the third author financed by FCT grant SFRH/BD/132485/2017; Additionally, this paper acknowledges the support of the Erasmus+ Key Action 2 (Strategic partnership for higher education) project No. 2020-1-PT01-KA203-078646: “SusTrainable - Promoting Sustainability as a Fundamental Driver in Software Development Training and Education”. The information and views set out in this paper are those of the author(s) and do not necessarily reflect the official opinion of the European Union. Neither the European Union institutions and bodies nor any person acting on their behalf may be held responsible for the use which may be made of the information contained therein.

Author information

Authors and Affiliations


Corresponding author

Correspondence to Rui Pereira.

Additional information

Communicated by: Yasutaka Kamei, Andy Zaidman

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article belongs to the Topical Collection: Mining Software Repositories (MSR)


About this article


Cite this article

Pereira, R., Matalonga, H., Couto, M. et al. GreenHub: a large-scale collaborative dataset to battery consumption analysis of android devices. Empir Software Eng 26, 38 (2021).


  • Accepted:

  • Published:

  • DOI:


  • Green software
  • Green mining
  • Android
  • Battery consumption analysis