RTDW-bench: Benchmark for Testing Refreshing Performance of Real-Time Data Warehouse

  • Jacek Jedrzejczak
  • Tomasz Koszlajda
  • Robert Wrembel
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7447)

Abstract

In this paper we propose a benchmark, called RTDW-bench, for testing the performance of a real-time data warehouse (RTDW). The benchmark is based on TPC-H. In particular, RTDW-bench makes it possible to verify whether an already deployed RTDW can handle a transaction stream of a given arrival rate without delays. The benchmark also includes an algorithm for finding the maximum stream arrival rate that an RTDW can handle without delays. The applicability of the proposed benchmark was verified on an RTDW implemented in Oracle 11g.
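
The abstract does not describe how the maximum arrival rate is found. As an illustration only, and not the authors' algorithm, the sketch below shows one plausible approach: a binary search over candidate arrival rates. It assumes the property "the warehouse keeps up" is monotone in the rate. The names `keeps_up` and `max_sustainable_rate` are hypothetical. `keeps_up` is a toy single-server simulation with a fixed per-transaction refresh cost; it stands in for an actual benchmark run against a deployed RTDW.

```python
from typing import Callable


def keeps_up(rate_per_s: int, service_us: int, n_tx: int = 1000) -> bool:
    """Toy stand-in for a benchmark run (an assumption, not RTDW-bench itself).

    Transactions arrive at evenly spaced times and are applied one at a
    time, each taking `service_us` microseconds. Returns True if no
    transaction ever has to wait for the previous one to finish.
    """
    free_at = 0  # time (in microseconds) at which the warehouse is idle again
    for i in range(n_tx):
        arrival = i * 1_000_000 // rate_per_s
        if free_at > arrival:
            return False  # the transaction would be delayed
        free_at = arrival + service_us
    return True


def max_sustainable_rate(handles: Callable[[int], bool], lo: int, hi: int) -> int:
    """Largest rate in [lo, hi] for which `handles(rate)` is True, or 0 if none.

    Assumes monotonicity: if a rate is handled, every lower rate is too.
    """
    best = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        if handles(mid):
            best = mid      # mid is sustainable; try higher rates
            lo = mid + 1
        else:
            hi = mid - 1    # mid causes delays; try lower rates
    return best


# Hypothetical setting: each refresh costs 10 ms, so the limit is 100 tx/s.
rate = max_sustainable_rate(lambda r: keeps_up(r, service_us=10_000), 1, 10_000)
print(rate)  # prints 100
```

In a real setting the predicate would replay a transaction stream against the warehouse at the given rate and report whether refresh delays were observed. Measurement noise may break strict monotonicity, in which case repeated runs per candidate rate would be needed.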

Keywords

Arrival Rate, Continuous Query, Very Large Data Base, Average Arrival Rate, Approximate Query
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Jacek Jedrzejczak (1)
  • Tomasz Koszlajda (1)
  • Robert Wrembel (1)
  1. Institute of Computing Science, Poznań University of Technology, Poznań, Poland
