Abstract
The demand for so-called living or real-time data warehouses is increasing in many application areas, including manufacturing, event monitoring, and telecommunications. In such fields, users expect both short response times for their queries and high freshness of the requested data. Meeting both requirements at the same time is challenging, however, because of the continuous flow of write-only updates and read-only queries and the latency caused by arbitrarily complex ETL processes. To optimize the update flow with respect to maximizing data freshness and minimizing load, we propose two algorithms, local and global scheduling, that operate on different system information. We discuss the benefits and drawbacks of both approaches in detail and derive recommendations regarding the optimal scheduling strategy for a given system setup and workload.
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Thiele, M., Lehner, W. (2010). Evaluation of Load Scheduling Strategies for Real-Time Data Warehouse Environments. In: Castellanos, M., Dayal, U., Miller, R.J. (eds) Enabling Real-Time Business Intelligence. BIRTE 2009. Lecture Notes in Business Information Processing, vol 41. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14559-9_6
DOI: https://doi.org/10.1007/978-3-642-14559-9_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-14558-2
Online ISBN: 978-3-642-14559-9
eBook Packages: Computer Science, Computer Science (R0)