The Use of a Cloud Computing and the CUDA Architecture in Zero-Latency Data Warehouses

  • Conference paper
Computer Networks (CN 2013)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 370)

Abstract

The growing importance of data warehousing [1-3] and the need to provide up-to-date information have changed data processing procedures [4-8]. Classic data warehouses, which are based on a traditional ETL process, proved ineffective and limited further development because access time had to be shared between updates and analysis [9,10]. The introduction of the zero-latency data warehouse solved the problem of the time limit on data mining; however, it requires greater computing power to process updates and queries in the ETL process. The article presents two ETL systems for zero-latency data warehouses that implement the WINE-HYBRIS algorithm. The first ETL system processes tasks on CUDA and CPU architectures, while the second uses cloud computing. The purpose of the article is to describe the advantages and disadvantages of each solution.

References

  1. Kozielski, S., Wrembel, R. (eds.): New Trends in Data Warehousing and Data Analysis. Annals of Information Systems, vol. 3. Springer (2009)

  2. Wrembel, R.: A Survey of Managing the Evolution of Data Warehouses. IJDWM 5(2), 24–56 (2009)

  3. Gorawski, M., Morzy, T., Wrembel, R.: Special Issue on: Techniques of Advanced Data Processing and Analysis. Introduction. Control and Cybernetics 38(1), 5–8 (2009)

  4. Andrzejewski, W., Wrembel, R.: GPU-WAH: Applying GPUs to Compressing Bitmap Indexes with Word Aligned Hybrid. In: Bringas, P.G., Hameurlain, A., Quirchmayr, G. (eds.) DEXA 2010, Part II. LNCS, vol. 6262, pp. 315–329. Springer, Heidelberg (2010)

  5. Gorawski, M., Marks, P., Gorawski, M.: Collecting data streams from a distributed radio-based measurement system. In: Haritsa, J.R., Kotagiri, R., Pudi, V. (eds.) DASFAA 2008. LNCS, vol. 4947, pp. 702–705. Springer, Heidelberg (2008)

  6. Gorawski, M., Bańkowski, S., Gorawski, M.: Selection of Structures with Grid Optimization, in Multiagent Data Warehouse. In: Fyfe, C., Tino, P., Charles, D., Garcia-Osorio, C., Yin, H. (eds.) IDEAL 2010. LNCS, vol. 6283, pp. 292–299. Springer, Heidelberg (2010)

  7. Morzy, T.: OLAP with a Database Cluster. In: Wrembel, R., Koncilia, C. (eds.) Data Warehouses and OLAP: Concepts, Architectures and Solutions, pp. 230–252 (2007)

  8. Cichon, P., Huzar, Z., Mazur, Z., Mrozowski, A.: Managing Adaptive Information Projects in the Context of a Software Developer Organizational Structure. In: B.L. (ed.) BIS. LNI, vol. 85 GI, pp. 242–255 (2006)

  9. Gorawski, M., Gorawski, M.: Modified R-MVB tree and BTV algorithm used in a distributed spatio-temporal data warehouse. In: Wyrzykowski, R., Dongarra, J., Karczewski, K., Wasniewski, J. (eds.) PPAM 2007. LNCS, vol. 4967, pp. 199–208. Springer, Heidelberg (2008)

  10. Gorawski, M., Gorawski, M.: Balanced spatio-temporal data warehouse with R-MVB, STCAT and BITMAP indexes. In: PARELEC 2006, pp. 43–48 (2006)

  11. Bruckner, R., Min Tjoa, A.: Capturing Delays and Valid Times in Data Warehouses – Towards Timely Consistent Analyses. J. Intell. Inf. Syst. 19(2), 169–190 (2002)

  12. Bruckner, R.M., List, B., Schiefer, J.: Striving towards Near Real-Time Data Integration for Data Warehouses. In: Kambayashi, Y., Winiwarter, W., Arikawa, M. (eds.) DaWaK 2002. LNCS, vol. 2454, pp. 317–326. Springer, Heidelberg (2002)

  13. Gorawski, M., Marks, P.: Towards automated analysis of connections network in distributed stream processing system. In: Haritsa, J.R., Kotagiri, R., Pudi, V. (eds.) DASFAA 2008. LNCS, vol. 4947, pp. 670–677. Springer, Heidelberg (2008)

  14. Gorawski, M., Marks, P.: Towards reliability and fault-tolerance of distributed stream processing system. In: DepCoS – RELCOMEX 2007, pp. 246–253 (2007)

  15. Gorawski, M.: Architecture of Parallel Spatial Data Warehouse: Balancing Algorithm and Resumption of Data Extraction. In: Software Engineering: Evolution And Emerging Technologies, FAIA, vol. 130, pp. 49–59 (2005)

  16. Gorawski, M.: Extended Cascaded Star Schema and ECOLAP Operations for Spatial Data Warehouse. In: Corchado, E., Yin, H. (eds.) IDEAL 2009. LNCS, vol. 5788, pp. 251–259. Springer, Heidelberg (2009)

  17. Gorawski, M., Marks, P.: Resumption of data extraction process in parallel data warehouses. In: Wyrzykowski, R., Dongarra, J., Meyer, N., Waśniewski, J. (eds.) PPAM 2005. LNCS, vol. 3911, pp. 478–485. Springer, Heidelberg (2006)

  18. Gorawski, M., Marks, P.: Checkpoint-based resumption in data warehouses. In: Sacha, K. (ed.) IFIP International Federation For Information Processing. Software Engineering Techniques: Design for Quality, vol. 227, pp. 313–323. Springer, Boston (2006)

  19. Rahm, E., Hai Do, H.: Data Cleaning: Problems and Current Approaches. Bulletin of the Technical Committee on Data Engineering 23 (2000)

  20. Waas, F., Wrembel, R., Freudenreich, T., Thiele, M., Koncilia, C., Furtado, P.: On-Demand ELT Architecture for Right-Time BI: Extending the Vision. International Journal on Data Warehousing and Mining (to appear, 2013)

  21. Thiele, M., Fischer, U., Lehner, W.: Partition-based Workload Scheduling in Living Data Warehouse Environments. In: DOLAP 2007. ACM, Portugal (2007)

  22. Gorawski, M., Lis, D.: Architektura CUDA w bezopoznieniowych hurtowniach danych. Studia Informatica 32, 157–167 (2011)

  23. CUDA in research, http://www.nvidia.pl/object/cuda_home_new_pl.html

  24. Windows Azure, SDK for Java, http://www.windowsazure4j.org/

  25. Jestratjew, A., Kwiecien, A.: Performance of HTTP Protocol in Networked Control Systems. IEEE Trans. Industrial Informatics 9(1), 271–276 (2013)

  26. Jestratjew, A., Kwiecień, A.: Using Cloud Storage in Production Monitoring Systems. In: Kwiecień, A., Gaj, P., Stera, P. (eds.) CN 2010. CCIS, vol. 79, pp. 226–235. Springer, Heidelberg (2010)

  27. Skrzewski, M.: Monitoring Malware Activity on the LAN Network. In: Kwiecień, A., Gaj, P., Stera, P. (eds.) CN 2010. CCIS, vol. 79, pp. 253–262. Springer, Heidelberg (2010)

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Gorawski, M., Lis, D., Gorawski, M. (2013). The Use of a Cloud Computing and the CUDA Architecture in Zero-Latency Data Warehouses. In: Kwiecień, A., Gaj, P., Stera, P. (eds) Computer Networks. CN 2013. Communications in Computer and Information Science, vol 370. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38865-1_32

  • DOI: https://doi.org/10.1007/978-3-642-38865-1_32

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-38864-4

  • Online ISBN: 978-3-642-38865-1

  • eBook Packages: Computer Science, Computer Science (R0)
