Resumption of Data Extraction Process in Parallel Data Warehouses
ETL processes are sometimes interrupted by a failure. In such a case, one of the algorithms for resuming an interrupted extraction is usually applied. In this paper we present a modified Design-Resume (DR) algorithm, extended to handle ETL processes that contain many loading nodes. We use the DR algorithm to resume a parallel data warehouse load process. The key feature of this algorithm is that it imposes no additional overhead on the normal ETL process. We modify the algorithm to work with more than one loading node, which increases the efficiency of the resumption process. We discuss the benefits of our improvements based on the results of the tests we performed.
Keywords: Data Warehouse, Total Processing Time, Fact Table, Data Warehouse System, Spatial Data Warehouse