A time-series compression technique and its application to the smart grid

  • Regular Paper
The VLDB Journal

Abstract

Time-series data is increasingly collected in many domains. One example is the smart electricity infrastructure, which generates huge volumes of such data from sources such as smart electricity meters. Although this data is used today for visualization and billing mostly at 15-min resolution, its original temporal resolution is frequently more fine-grained, e.g., seconds. Fine-grained data is useful for various analytical applications such as short-term forecasting, disaggregation and visualization. However, transmitting and storing huge amounts of such fine-grained data is in many cases prohibitively expensive in terms of storage space. In this article, we present a compression technique based on piecewise regression and two methods that describe the performance of the compression. Although our technique is a general approach for time-series compression, smart grids serve as our running example and as our evaluation scenario. Depending on the data and the use-case scenario, the technique compresses data by up to a factor of 5,000 while maintaining its usefulness for analytics. The proposed technique outperforms related work and has been applied to three real-world energy datasets in different scenarios. Finally, we show that the proposed compression technique can be implemented in a state-of-the-art database management system.
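
To make the idea of piecewise compression concrete, the following is a minimal sketch and not the technique of the article, whose details are not part of this preview. It assumes the simplest possible regression function (a constant per segment) and a user-chosen maximum deviation `eps`; the names `compress`, `decompress` and `eps` are hypothetical and chosen for illustration only.

```python
from typing import List, Tuple

# (start index, end index inclusive, constant value) -- illustrative layout
Segment = Tuple[int, int, float]


def compress(values: List[float], eps: float) -> List[Segment]:
    """Greedy piecewise-constant approximation.

    A segment is extended as long as the spread (max - min) of its
    values stays within 2 * eps, so that the midrange value deviates
    from every original point by at most eps.
    """
    segments: List[Segment] = []
    if not values:
        return segments
    start = 0
    lo = hi = values[0]
    for i in range(1, len(values)):
        v = values[i]
        new_lo, new_hi = min(lo, v), max(hi, v)
        if new_hi - new_lo > 2 * eps:
            # Adding v would break the error bound: close the segment.
            segments.append((start, i - 1, (lo + hi) / 2))
            start, lo, hi = i, v, v
        else:
            lo, hi = new_lo, new_hi
    segments.append((start, len(values) - 1, (lo + hi) / 2))
    return segments


def decompress(segments: List[Segment]) -> List[float]:
    """Rebuild an approximate series from the stored segments."""
    out: List[float] = []
    for start, end, value in segments:
        out.extend([value] * (end - start + 1))
    return out


# Made-up meter readings in watts, used only to exercise the sketch.
readings = [100.0, 101.0, 99.0, 100.5, 250.0, 251.0, 249.5, 100.0]
segments = compress(readings, eps=1.0)
restored = decompress(segments)
```

In this sketch the compression ratio depends on how long the series stays within a band of width `2 * eps`; richer regression functions, such as straight lines or higher-degree polynomials, would typically allow longer segments at the cost of storing more coefficients per segment.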

Notes

  1. SQLScript is a procedural programming language in SAP HANA.

  2. L is a programming language similar to C used in SAP HANA.

Acknowledgments

We thank L. Neumann and D. Kurfiss, who have helped us with the database implementation and respective experiments.

Author information

Corresponding author

Correspondence to Frank Eichinger.

Additional information

Work partly done while F. Eichinger and P. Efros were with SAP AG.

About this article

Cite this article

Eichinger, F., Efros, P., Karnouskos, S. et al. A time-series compression technique and its application to the smart grid. The VLDB Journal 24, 193–218 (2015). https://doi.org/10.1007/s00778-014-0368-8

  • DOI: https://doi.org/10.1007/s00778-014-0368-8
