
Enhancing energy efficiency in the residential sector with smart meter data analytics


Tailored energy efficiency campaigns that make use of household-specific information can trigger substantial energy savings in the residential sector. The information required for such campaigns, however, is often missing. We show that utility companies can extract that information from smart meter data using machine learning. We derive 133 features from smart meter and weather data and use a Random Forest classifier to recognize 19 household classes related to 11 household characteristics (e.g., electric heating, size of dwelling) with an accuracy of up to 95% (69% on average). The results indicate that even datasets with an hourly or daily resolution are sufficient to impute key household characteristics with decent accuracy, and that the season from which the data are drawn does not considerably influence classification performance. Furthermore, we demonstrate that a small training data set of only 200 households already achieves good performance. Our work may serve as a benchmark for upcoming, similar research on smart meter data and may guide practitioners in estimating the effort required to implement such analytics solutions.
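The feature-derivation step mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation (the study used R with the 'randomForest' package), and the five features below are hypothetical stand-ins chosen for illustration, not entries from the paper's list of 133 features. The sketch assumes hourly readings in kWh that start at midnight and cover whole days; the function name and the night/day boundaries are likewise assumptions.

```python
from statistics import mean, pstdev


def extract_features(hourly_kwh):
    """Compute a few illustrative consumption features from hourly readings.

    hourly_kwh: list of floats whose length is a multiple of 24,
    with the first value covering 00:00-01:00 (an assumption of this sketch).
    """
    if not hourly_kwh or len(hourly_kwh) % 24 != 0:
        raise ValueError("expected whole days of hourly readings")

    # Split the series into one list of 24 values per day.
    days = [hourly_kwh[i:i + 24] for i in range(0, len(hourly_kwh), 24)]

    # Hypothetical time windows: night = 00:00-06:00, day = 06:00-22:00.
    night = [v for day in days for v in day[0:6]]
    daytime = [v for day in days for v in day[6:22]]
    day_mean = mean(daytime)

    return {
        "mean_kwh": mean(hourly_kwh),
        "max_kwh": max(hourly_kwh),
        "std_kwh": pstdev(hourly_kwh),
        # A high night/day ratio might hint at night-time loads.
        "night_day_ratio": mean(night) / day_mean if day_mean > 0 else 0.0,
        "daily_total_mean": mean(sum(day) for day in days),
    }


# Usage: two days of made-up readings, lower at night than during the day.
one_day = [0.2] * 6 + [0.6] * 16 + [0.3] * 2
features = extract_features(one_day * 2)
```

A feature dictionary of this kind, computed per household, would form one row of the table that a classifier such as Random Forest is trained on.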




  1. The municipality where the utility is located had approximately 14,000 inhabitants in 2015; the average municipality in Switzerland in the same year had M = 3638 (SD = 12,016) inhabitants (Swiss Federal Statistical Office 2017).

  2. R version: 3.4.2; ‘randomForest’ package version 4.6–12.

  3. We tested four other classifiers that are based on complementary model types and found that Random Forest outperforms the other algorithms. The differences in AUC results were significant for kNN (paired t-test, t(30) = 2.683, p < 0.01), Naïve Bayes (paired t-test, t(30) = 2.2125, p < 0.05), and SVM (t(30) = 1.6048, p < 0.1), but not for AdaBoost.
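The paired t-test used for the classifier comparison in the note above can be sketched in a few lines of Python. The function name and the example scores are hypothetical; the paper's underlying AUC values are not given here, so nothing in the sketch reproduces the reported statistics. It only shows how the t statistic and its degrees of freedom are obtained from paired scores: with n pairs the test has n − 1 degrees of freedom, so a value reported as t(30) corresponds to 31 pairs.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(scores_a, scores_b):
    """Return (t, degrees_of_freedom) for a paired t-test on two score lists."""
    if len(scores_a) != len(scores_b):
        raise ValueError("paired samples must have equal length")
    n = len(scores_a)
    if n < 2:
        raise ValueError("need at least two pairs")

    # Work on the per-pair differences, e.g. AUC of model A minus AUC of model B.
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    sd = stdev(diffs)  # sample standard deviation (n - 1 in the denominator)
    if sd == 0:
        raise ValueError("differences have zero variance; t is undefined")

    # t = mean difference divided by its standard error.
    t = mean(diffs) / (sd / sqrt(n))
    return t, n - 1


# Usage with made-up scores for two classifiers on the same four tasks.
auc_model_a = [0.81, 0.74, 0.90, 0.68]
auc_model_b = [0.78, 0.75, 0.85, 0.66]
t_value, dof = paired_t_statistic(auc_model_a, auc_model_b)
```

Turning the statistic into a p-value additionally requires the cumulative distribution function of Student's t distribution, which the Python standard library does not provide; a statistics package would be needed for that step.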


  1. Albert, A., & Rajagopal, R. (2013). Smart meter driven segmentation: What your consumption says about you. IEEE Transactions on Power Systems, 28(4), 4019–4030.

  2. Albert, A., & Rajagopal, R. (2014). Cost-of-service segmentation of energy consumers. IEEE Transactions on Power Systems, 29(6), 2795–2803. https://doi.org/10.1109/TPWRS.2014.2312721.

  3. Allcott, H. (2011). Social norms and energy conservation. Journal of Public Economics, 95(9–10), 1082–1095. https://doi.org/10.1016/j.jpubeco.2011.03.003.

  4. Allcott, H., & Mullainathan, S. (2010). Behavior and energy policy. Science, 327(5970), 1204–1205.

  5. Al-Otaibi, R., Jin, N., Wilcox, T., & Flach, P. (2016). Feature construction and calibration for clustering daily load curves from smart-meter data. IEEE Transactions on Industrial Informatics, 12(2), 645–654. https://doi.org/10.1109/TII.2016.2528819.

  6. Armel, K. C., Gupta, A., Shrimali, G., & Albert, A. (2013). Is disaggregation the holy grail of energy efficiency? The case of electricity. Energy Policy, 52(Supplement C), 213–234. https://doi.org/10.1016/j.enpol.2012.08.062.

  7. Beckel, C., Sadamori, L., & Santini, S. (2012). Towards automatic classification of private households using electricity consumption data. In G. J. Pappas (Ed.), Proceedings of the fourth ACM workshop on embedded sensing systems for energy-efficiency in buildings (pp. 169–176). Toronto: ACM.

  8. Beckel, C., Sadamori, L., & Santini, S. (2013). Automatic socio-economic classification of households using electricity consumption data. In D. Culler & C. Rosenberg (Eds.), Proceedings of the fourth international conference on future energy systems (pp. 75–86). Berkeley: ACM.

  9. Beckel, C., Sadamori, L., Staake, T., & Santini, S. (2014). Revealing household characteristics from smart meter data. Energy, 78, 397–410.

  10. Birt, B. J., Newsham, G. R., Beausoleil-Morrison, I., Armstrong, M. M., Saldanha, N., & Rowlands, I. H. (2012). Disaggregating categories of electrical energy end-use from whole-house hourly data. Energy and Buildings, 50, 93–102.

  11. Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5–32.

  12. Buchanan, K., Banks, N., Preston, I., & Russo, R. (2016). The British public’s perception of the UK smart metering initiative: Threats and opportunities. Energy Policy, 91, 87–97. https://doi.org/10.1016/j.enpol.2016.01.003.

  13. Chang, H. H., Wong, K. H., & Fang, P. W. (2014). The effects of customer relationship management relational information processes on customer-based performance. Decision Support Systems, 66(Supplement C), 146–159. https://doi.org/10.1016/j.dss.2014.06.010.

  14. Chapman, P., Clinton, J., Kerber, R., Khabaza, T., Reinartz, T., Shearer, C., & Wirth, R. (2000). CRISP-DM 1.0. SPSS. Retrieved from ftp://ftp.software.ibm.com/software/analytics/spss/support/Modeler/Documentation/14/UserManual/CRISP-DM.pdf.

  15. Chicco, G. (2012). Overview and performance assessment of the clustering methods for electrical load pattern grouping. Energy, 42(1), 68–80.

  16. Coltman, T. (2007). Why build a customer relationship management capability? The Journal of Strategic Information Systems, 16(3), 301–320. https://doi.org/10.1016/j.jsis.2007.05.001.

  17. Constantiou, I. D., & Kallinikos, J. (2015). New games, new rules: Big data and the changing context of strategy. Journal of Information Technology, 30(1), 44–57. https://doi.org/10.1057/jit.2014.17.

  18. Cramer, H. (1946). Mathematical methods of statistics. Princeton: Princeton University Press.

  19. Darby, S. (2006). The effectiveness of feedback on energy consumption. University of Oxford. Retrieved from http://www.usclcorp.com/news/DEFRA-report-with-appendix.pdf.

  20. de Silva, D., Xinghuo, Y., Alahakoon, D., & Holmes, G. (2011). A data mining framework for electricity consumption analysis from meter data. IEEE Transactions on Industrial Informatics, 7(3), 399–407.

  21. Dietterich, T. G. (2000). Ensemble methods in machine learning. In: International workshop on multiple classifier systems (pp. 1–15). Springer. https://doi.org/10.1007/3-540-45014-9_1.

  22. Ecoplan. (2015). Smart Metering Roll Out – Kosten und Nutzen: Aktualisierung des Smart Metering Impact Assessments 2012 (Final Report). Bern: Bundesamt für Energie. Retrieved from http://www.bfe.admin.ch/php/modules/publikationen/stream.php?extlang=de&name=de_678554277.pdf&endung=Smart%20Metering%20Roll%20Out%20%96%20Kosten%20und%20Nutzen.

  23. European Commission. (2012). Commission recommendation of 9 March 2012 on preparations for the roll-out of smart metering systems. Official Journal of the European Union. Retrieved from http://eur-lex.europa.eu/legal-content/EN/ALL/?uri=CELEX:32012H0148.

  24. European Commission. (2014). Cost-benefit analyses & state of play of smart metering deployment in the EU-27. Accompanying the document: Report from the Commission, Benchmarking smart metering deployment in the EU-27 with a focus on electricity (Commission Staff Working Document no. SWD/2014/0189). Brussels: European Commission.

  25. Fawcett, T. (2006). An introduction to ROC analysis. Pattern Recognition Letters, 27(8), 861–874.

  26. Fei, H., Kim, Y., Sahu, S., Naphade, M., Mamidipalli, S. K., & Hutchinson, J. (2013). Heat pump detection from coarse grained smart meter data with positive and unlabeled learning. In Proceedings of the 19th ACM SIGKDD international conference on knowledge discovery and data mining (pp. 1330–1338). New York: ACM. https://doi.org/10.1145/2487575.2488203.

  27. Fernández-Delgado, M., Cernadas, E., Barro, S., & Amorim, D. (2014). Do we need hundreds of classifiers to solve real world classification problems? Journal of Machine Learning Research, 15(1), 3133–3181.

  28. Flath, C., Nicolay, D., Conte, T., van Dinther, C., & Filipova-Neumann, L. (2012). Cluster analysis of smart metering data. Business & Information Systems Engineering, 4(1), 31–39. https://doi.org/10.1007/s12599-011-0201-5.

  29. Gorodkin, J. (2004). Comparing two K-category assignments by a K-category correlation coefficient. Computational Biology and Chemistry, 28(5), 367–374.

  30. Graml, T., Loock, C.-M., Baeriswyl, M., & Staake, T. (2011). Improving Residential Energy Consumption at Large Using Persuasive Systems. Presented at European Conference on Information Systems (ECIS). In: ECIS 2011 Proceedings. Helsinki, Finland: AIS electronic library. http://aisel.aisnet.org/ecis2011/184/.

  31. Hart, G. W. (1992). Nonintrusive appliance load monitoring. Proceedings of the IEEE, 80(12), 1870–1891. https://doi.org/10.1109/5.192069.

  32. Hopf, K., Sodenkamp, M., Kozlovskiy, I., & Staake, T. (2014). Feature extraction and filtering for household classification based on smart electricity meter data. Computer Science-Research and Development, 31(3), 141–148. Zürich: Springer Berlin Heidelberg. https://doi.org/10.1007/s00450-014-0294-4.

  33. Hopf, K., Sodenkamp, M., & Kozlovskiy, I. (2016). Energy data analytics for improved residential service quality and energy efficiency. Presented at 24. European Conference on Information Systems (ECIS), Istanbul: Turkey, June 12-15, In: ECIS 2016 Proceedings, AIS electronic library. http://aisel.aisnet.org/ecis2016_rip/73/.

  34. Hopf, K., Riechel, S., Sodenkamp, M., & Staake, T. (2017). Predictive customer data analytics – the value of public statistical data and the geographic model transferability. Presented at 38. International Conference on Information Systems (ICIS), Seoul: South Korea 2017, Dec 10-13. In: ICIS 2017 Proceedings, AIS electronic library. http://aisel.aisnet.org/icis2017/DataScience/Presentations/9/.

  35. Jurman, G., Riccadonna, S., & Furlanello, C. (2012). A comparison of MCC and CEN error measures in multi-class prediction. PLoS One, 7(8), e41882. https://doi.org/10.1371/journal.pone.0041882.

  36. Keogh, E., & Mueen, A. (2011). Curse of dimensionality. In: C. Sammut & G. I. Webb (Eds.), Encyclopedia of machine learning (pp. 257–258). Springer US. https://doi.org/10.1007/978-0-387-30164-8_192.

  37. Kim, H., Marwah, M., Arlitt, M., Lyon, G., & Han, J. (2011). Unsupervised disaggregation of low frequency power measurements. In: Proceedings of the 2011 SIAM International Conference on Data Mining (Vols. 1–0, pp. 747–758). Society for Industrial and Applied Mathematics. https://doi.org/10.1137/1.9781611972818.64.

  38. Kozlovskiy, I., Sodenkamp, M., Hopf, K., & Staake, T. (2016). Energy informatics for environmental, economic and social sustainability: A case of the large-scale detection of households with old heating systems. Presented at 24. European Conference on Information Systems (ECIS), Istanbul: Turkey, June 12-15, In: ECIS 2016 Proceedings, AIS electronic library. https://aisel.aisnet.org/ecis2016_rp/37.

  39. Kwac, J., Tan, C.-W., Sintov, N., Flora, J., & Rajagopal, R. (2013). Utility customer segmentation based on smart meter data: Empirical study. In: Smart Grid Communications (SmartGridComm), 2013 IEEE International Conference on (pp. 720–725).

  40. Lewington, J., De Chernatony, L., & Brown, A. (1996). Harnessing the power of database marketing. Journal of Marketing Management, 12(4), 329–346.

  41. Li, X., Bowers, C. P., & Schnier, T. (2010). Classification of energy consumption in buildings with outlier detection. IEEE Transactions on Industrial Electronics, 57(11), 3639–3644. https://doi.org/10.1109/TIE.2009.2027926.

  42. Liaw, A., & Wiener, M. (2015). randomForest: Breiman and Cutler’s Random Forests for Classification and Regression (Version 4.6–12). Retrieved from https://cran.r-project.org/web/packages/randomForest/index.html.

  43. Loock, C.-M., Staake, T., & Thiesse, F. (2013). Motivating energy-efficient behavior with green IS: An investigation of goal setting and the role of defaults. MIS Quarterly, 37(4), 1313–1332.

  44. McKenna, E., Richardson, I., & Thomson, M. (2012). Smart meter data: Balancing consumer privacy concerns with legitimate applications. Energy Policy, 41, 807–814.

  45. McKerracher, C., & Torriti, J. (2013). Energy consumption feedback in perspective: Integrating Australian data to meta-analyses on in-home displays. Energy Efficiency, 6(2), 387–405. https://doi.org/10.1007/s12053-012-9169-3.

  46. McLoughlin, F., Duffy, A., & Conlon, M. (2012). Characterising domestic electricity consumption patterns by dwelling and occupant socio-economic variables: An Irish case study. Energy and Buildings, 48, 240–248.

  47. Müller, O., Junglas, I., Brocke, J. v., & Debortoli, S. (2016). Utilizing big data analytics for information systems research: Challenges, promises and guidelines. European Journal of Information Systems, 25(4), 289–302. https://doi.org/10.1057/ejis.2016.2.

  48. Otim, S., & Grover, V. (2006). An empirical study on web-based services and customer loyalty. European Journal of Information Systems, 15(6), 527–541. https://doi.org/10.1057/palgrave.ejis.3000652.

  49. Romanski, P., & Kotthoff, L. (2014). FSelector: Selecting attributes. Retrieved from http://CRAN.R-project.org/package=FSelector.

  50. Sodenkamp, M., Kozlovskiy, I., Hopf, K., & Staake, T. (2017). Smart Meter Data Analytics for Enhanced Energy Efficiency in the Residential Sector. In: Wirtschaftsinformatik 2017 Proceedings. St. Gallen: AIS electronic library.

  51. Sokolova, M., & Lapalme, G. (2009). A systematic analysis of performance measures for classification tasks. Information Processing and Management, 45(4), 427–437.

  52. Swiss Federal Statistical Office. (2017). Sustainable development, regional and international disparities / Statistical basis and overviews (dataset no. FSO: Je-d-21.03.01). Retrieved from https://www.bfs.admin.ch/bfs/en/home/statistics/regional-statistics/regional-portraits-key-figures/communes.assetdetail.2422865.html.

  53. Synnott, W. R. (1978). Total customer relationship. MIS Quarterly, 2(3), 15–24.

  54. Tiefenbeck, V. (2017). Bring behaviour into the digital transformation. Nature Energy, 2(6), 17085. https://doi.org/10.1038/nenergy.2017.85.

  55. Tiefenbeck, V., Goette, L., Degen, K., Tasic, V., Fleisch, E., Lalive, R., & Staake, T. (2016). Overcoming salience bias: How real-time feedback fosters resource conservation. Management Science. https://doi.org/10.1287/mnsc.2016.2646.

  56. U.S. Energy Information Administration. (2017). How many smart meters are installed in the United States, and who has them? Retrieved January 18, 2018, from https://www.eia.gov/tools/faqs/faq.php?id=108&t=3.

  57. U.S. National Centers for Environmental Information. (2016). Climate Data Online. Retrieved January 2, 2016, from http://www.ncdc.noaa.gov/cdo-web/.

  58. Verma, A., Asadi, A., Yang, K., & Tyagi, S. (2015). A data-driven approach to identify households with plug-in electrical vehicles (PEVs). Applied Energy, 160(Supplement C), 71–79. https://doi.org/10.1016/j.apenergy.2015.09.013.

  59. Vihinen, M. (2012). How to evaluate performance of prediction methods? Measures and their interpretation in variation effect analysis. BMC Genomics, 13(4), S2. https://doi.org/10.1186/1471-2164-13-S4-S2.

  60. Watson, R. T., Boudreau, M.-C., & Chen, A. J. (2010). Information systems and environmentally sustainable development: Energy informatics and new directions for the IS community (Essay). MIS Quarterly, 34(1), 23.

  61. Watson, R. T., Howells, J., & Boudreau, M.-C. (2012). Energy informatics: Initial thoughts on data and process management. In J. vom Brocke, S. Seidel, & J. Recker (Eds.), Green business process management (pp. 147–159). Berlin Heidelberg: Springer. https://doi.org/10.1007/978-3-642-27488-6_9.

  62. Wattal, S., Telang, R., Mukhopadhyay, T., & Boatwright, P. (2011). What’s in a “name”? Impact of use of customer information in E-mail advertisements. Information Systems Research, 23(3-part-1), 679–697. https://doi.org/10.1287/isre.1110.0384.

  63. Xu, M., & Walton, J. (2005). Gaining customer knowledge through analytical CRM. Industrial Management & Data Systems, 105(7), 955–971. https://doi.org/10.1108/02635570510616139.

  64. Yoo, Y. (2015). It is not about size: A further thought on big data. Journal of Information Technology, 30(1), 63–65. https://doi.org/10.1057/jit.2014.30.

  65. Zhang, T. C., Agarwal, R., Lucas, J., & Henry, C. (2011). The value of IT-enabled retailer learning: Personalized product recommendations and customer store loyalty in electronic markets. MIS Quarterly, 35(4), 859–8A7.



We thank Ilya Kozlovskiy for his contribution to the data analysis in this study. We gratefully acknowledge financial support from the Swiss Federal Office of Energy (Grant numbers SI/501053-01, SI/501202-01) and thank Michael Moser and Roland Brüniger for their very helpful comments during the research project.

Author information

Correspondence to Konstantin Hopf.

Additional information

Responsible Editor: Jan Krämer

Appendix: full list of features


Table 4 Full list of features used in this study with references to earlier works that mention the feature definition and data resolution for which the feature is used


About this article


Cite this article

Hopf, K., Sodenkamp, M. & Staake, T. Enhancing energy efficiency in the residential sector with smart meter data analytics. Electron Markets 28, 453–473 (2018). https://doi.org/10.1007/s12525-018-0290-9

Keywords


  • Green information systems
  • Decision support systems
  • Data analytics
  • Energy efficiency
  • Sustainability
  • Classification

JEL classification

  • C80
  • D10
  • M310
  • Q20
  • R20