Datenbank-Spektrum, Volume 13, Issue 1, pp 45–53

Towards Integrated Data Analytics: Time Series Forecasting in DBMS

  • Ulrike Fischer
  • Lars Dannecker
  • Laurynas Siksnys
  • Frank Rosenthal
  • Matthias Boehm
  • Wolfgang Lehner
Technical Contribution

Abstract

Integrating sophisticated statistical methods into database management systems is attracting growing attention in research and industry as a way to cope with increasing data volumes and increasingly complex analytical algorithms. One important statistical method is time series forecasting, which is crucial for decision-making processes in many domains. Deeply integrating time series forecasting adds advanced functionality to a DBMS. More importantly, it allows for optimizations that improve the efficiency, consistency, and transparency of the overall forecasting process. To enable efficient integrated forecasting, we propose to enhance the traditional 3-layer ANSI/SPARC architecture of a DBMS with forecasting functionality. This article gives a general overview of our proposed enhancements and shows how forecast queries can be processed, using an example from the energy data management domain. We conclude with open research topics and challenges that arise in this area.

Keywords

Time series forecasting, DBMS, Architecture, Challenges

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Ulrike Fischer (1)
  • Lars Dannecker (2)
  • Laurynas Siksnys (3)
  • Frank Rosenthal (1)
  • Matthias Boehm (1)
  • Wolfgang Lehner (1)

  1. Technische Universität Dresden, Dresden, Germany
  2. SAP Research Dresden, Dresden, Germany
  3. Center for Data-intensive Systems, Aalborg University, Aalborg, Denmark