Abstract
The growing importance of data warehousing [1-3] and the need to provide up-to-date information have changed data processing procedures [4-8]. Classic data warehouses based on a traditional ETL process proved ineffective and limited further development, because access time has to be shared between updates and analysis [9,10]. The introduction of the zero-latency data warehouse solved the problem of the time limit on data mining; however, it requires more computing power to process updates and queries in the ETL process. This article presents two ETL systems for zero-latency data warehouses that implement the WINE-HYBRIS algorithm. The first ETL system processes tasks on CUDA and CPU architectures, while the second uses cloud computing. The purpose of the article is to describe the advantages and disadvantages of each solution.
References
Kozielski, S., Wrembel, R. (eds.): New Trends in Data Warehousing and Data Analysis. Annals of Information Systems, vol. 3. Springer (2009)
Wrembel, R.: A Survey of Managing the Evolution of Data Warehouses. IJDWM 5(2), 24–56 (2009)
Gorawski, M., Morzy, T., Wrembel, R.: Special Issue on: Techniques of Advanced Data Processing and Analysis. Introduction. Control and Cybernetics 38(1), 5–8 (2009)
Andrzejewski, W., Wrembel, R.: GPU-WAH: Applying GPUs to Compressing Bitmap Indexes with Word Aligned Hybrid. In: Bringas, P.G., Hameurlain, A., Quirchmayr, G. (eds.) DEXA 2010, Part II. LNCS, vol. 6262, pp. 315–329. Springer, Heidelberg (2010)
Gorawski, M., Marks, P., Gorawski, M.: Collecting data streams from a distributed radio-based measurement system. In: Haritsa, J.R., Kotagiri, R., Pudi, V. (eds.) DASFAA 2008. LNCS, vol. 4947, pp. 702–705. Springer, Heidelberg (2008)
Gorawski, M., Bańkowski, S., Gorawski, M.: Selection of Structures with Grid Optimization, in Multiagent Data Warehouse. In: Fyfe, C., Tino, P., Charles, D., Garcia-Osorio, C., Yin, H. (eds.) IDEAL 2010. LNCS, vol. 6283, pp. 292–299. Springer, Heidelberg (2010)
Morzy, T.: OLAP with a Database Cluster. In: Wrembel, R., Koncilia, C. (eds.) Data Warehouses and OLAP: Concepts, Architectures and Solutions, pp. 230–252 (2007)
Cichon, P., Huzar, Z., Mazur, Z., Mrozowski, A.: Managing Adaptive Information Projects in the Context of a Software Developer Organizational Structure. In: B.L. (ed.) BIS. LNI, vol. 85 GI, pp. 242–255 (2006)
Gorawski, M., Gorawski, M.: Modified R-MVB tree and BTV algorithm used in a distributed spatio-temporal data warehouse. In: Wyrzykowski, R., Dongarra, J., Karczewski, K., Wasniewski, J. (eds.) PPAM 2007. LNCS, vol. 4967, pp. 199–208. Springer, Heidelberg (2008)
Gorawski, M., Gorawski, M.: Balanced spatio-temporal data warehouse with R-MVB, STCAT and BITMAP indexes. In: PARELEC 2006, pp. 43–48 (2006)
Bruckner, R., Min Tjoa, A.: Capturing Delays and Valid Times in Data Warehouses – Towards Timely Consistent Analyses. J. Intell. Inf. Syst. 19(2), 169–190 (2002)
Bruckner, R.M., List, B., Schiefer, J.: Striving towards Near Real-Time Data Integration for Data Warehouses. In: Kambayashi, Y., Winiwarter, W., Arikawa, M. (eds.) DaWaK 2002. LNCS, vol. 2454, pp. 317–326. Springer, Heidelberg (2002)
Gorawski, M., Marks, P.: Towards automated analysis of connections network in distributed stream processing system. In: Haritsa, J.R., Kotagiri, R., Pudi, V. (eds.) DASFAA 2008. LNCS, vol. 4947, pp. 670–677. Springer, Heidelberg (2008)
Gorawski, M., Marks, P.: Towards reliability and fault-tolerance of distributed stream processing system. In: DepCoS – RELCOMEX 2007, pp. 246–253 (2007)
Gorawski, M.: Architecture of Parallel Spatial Data Warehouse: Balancing Algorithm and Resumption of Data Extraction. In: Software Engineering: Evolution And Emerging Technologies, FAIA, vol. 130, pp. 49–59 (2005)
Gorawski, M.: Extended Cascaded Star Schema and ECOLAP Operations for Spatial Data Warehouse. In: Corchado, E., Yin, H. (eds.) IDEAL 2009. LNCS, vol. 5788, pp. 251–259. Springer, Heidelberg (2009)
Gorawski, M., Marks, P.: Resumption of data extraction process in parallel data warehouses. In: Wyrzykowski, R., Dongarra, J., Meyer, N., Waśniewski, J. (eds.) PPAM 2005. LNCS, vol. 3911, pp. 478–485. Springer, Heidelberg (2006)
Gorawski, M., Marks, P.: Checkpoint-based resumption in data warehouses. In: Sacha, K. (ed.) IFIP International Federation For Information Processing. Software Engineering Techniques: Design for Quality, vol. 227, pp. 313–323. Springer, Boston (2006)
Rahm, E., Hai Do, H.: Data Cleaning: Problems and Current Approaches. Bulletin of the Technical Committee on Data Engineering 23 (2000)
Waas, F., Wrembel, R., Freudenreich, T., Thiele, M., Koncilia, C., Furtado, P.: On-Demand ELT Architecture for Right-Time BI: Extending the Vision. International Journal on Data Warehousing and Mining (to appear, 2013)
Thiele, M., Fischer, U., Lehner, W.: Partition-based Workload Scheduling in Living Data Warehouse Environments. In: DOLAP 2007. ACM, Portugal (2007)
Gorawski, M., Lis, D.: Architektura CUDA w bezopoznieniowych hurtowniach danych. Studia Informatica 32, 157–167 (2011)
CUDA in research, http://www.nvidia.pl/object/cuda_home_new_pl.html
Windows Azure, SDK for Java, http://www.windowsazure4j.org/
Jestratjew, A., Kwiecien, A.: Performance of HTTP Protocol in Networked Control Systems. IEEE Trans. Industrial Informatics 9(1), 271–276 (2013)
Jestratjew, A., Kwiecień, A.: Using Cloud Storage in Production Monitoring Systems. In: Kwiecień, A., Gaj, P., Stera, P. (eds.) CN 2010. CCIS, vol. 79, pp. 226–235. Springer, Heidelberg (2010)
Skrzewski, M.: Monitoring Malware Activity on the LAN Network. In: Kwiecień, A., Gaj, P., Stera, P. (eds.) CN 2010. CCIS, vol. 79, pp. 253–262. Springer, Heidelberg (2010)
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Gorawski, M., Lis, D., Gorawski, M. (2013). The Use of a Cloud Computing and the CUDA Architecture in Zero-Latency Data Warehouses. In: Kwiecień, A., Gaj, P., Stera, P. (eds) Computer Networks. CN 2013. Communications in Computer and Information Science, vol 370. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38865-1_32
DOI: https://doi.org/10.1007/978-3-642-38865-1_32
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-38864-4
Online ISBN: 978-3-642-38865-1
eBook Packages: Computer Science, Computer Science (R0)