Striving towards Near Real-Time Data Integration for Data Warehouses

Part of the Lecture Notes in Computer Science book series (LNCS, volume 2454)

Abstract

The amount of information available to large-scale enterprises is growing rapidly. While operational systems are designed to meet well-specified (short) response time requirements, data warehouses generally focus on the strategic analysis of business data integrated from heterogeneous source systems. In traditional data warehouse environments, decision making is often delayed because data cannot be propagated from the source systems to the data warehouse in time. A real-time data warehouse aims to shorten the time it takes to make business decisions and tries to attain zero latency between the cause and effect of a business decision. In this paper we present an architecture of an ETL (extraction, transformation, loading) environment for real-time data warehouses that supports continual near real-time data propagation. The architecture takes full advantage of existing J2EE (Java 2 Platform, Enterprise Edition) technology and enables the implementation of a distributed, scalable, near real-time ETL environment. Instead of vendor-proprietary ETL solutions, which are often hard to scale and often do not support optimizing the time frames allocated for data extracts, our approach uses ETLets (pronounced “et-lets”) and Enterprise JavaBeans (EJB) for the ETL processing tasks.

Keywords

  • Business Intelligence
  • Source System
  • Business Data
  • Enterprise Edition
  • Make Business Decision

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Bruckner, R.M., List, B., Schiefer, J. (2002). Striving towards Near Real-Time Data Integration for Data Warehouses. In: Kambayashi, Y., Winiwarter, W., Arikawa, M. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2002. Lecture Notes in Computer Science, vol 2454. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-46145-0_31

  • DOI: https://doi.org/10.1007/3-540-46145-0_31

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-44123-6

  • Online ISBN: 978-3-540-46145-6

  • eBook Packages: Springer Book Archive