Real Time Data Warehouse Updates Through Extraction-Transformation-Loading Process Using Change Data Capture Method

  • Conference paper
Second International Conference on Computer Networks and Communication Technologies (ICCNCT 2019)

Abstract

Big data has become a business-critical component of enterprise resource planning (ERP) systems and business intelligence. An ERP system runs long jobs over large data volumes and holds resource locks, which directly block users from running queries on the database. Users also require updates that reflect data changes in real time. Complete data loads make the process expensive, because shortening the loading cycle requires more computational resources. An extraction-transformation-loading (ETL) technique with change data capture (CDC) resolves these problems through periodic updates of the changed data only. CDC is a process that identifies changed records in order to reduce the extract volume. This paper proposes a structure capable of performing CDC by means of timestamps and a replication tool designed for spontaneous synchronization between two databases. The overall performance of the CDC technique on the ERP system is compared. The approach was employed in a real-world project, where a transition to near real-time ETL and a performance improvement were observed.
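The timestamp-based CDC idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `orders` table, the `updated_at` column, the watermark handling, and the use of in-memory SQLite databases are all assumptions made for illustration. Each cycle extracts only the rows modified after the last recorded watermark and applies them to the target.

```python
import sqlite3

# Hypothetical schema: every row carries a last-modified timestamp (ISO 8601 text).
SCHEMA = "CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT, updated_at TEXT)"


def extract_changed_rows(source, watermark):
    """Extract only the rows modified after the previous cycle's watermark."""
    return source.execute(
        "SELECT id, status, updated_at FROM orders "
        "WHERE updated_at > ? ORDER BY updated_at",
        (watermark,),
    ).fetchall()


def load_rows(target, rows):
    """Apply changed rows to the target; existing ids are overwritten."""
    target.executemany(
        "INSERT OR REPLACE INTO orders (id, status, updated_at) VALUES (?, ?, ?)",
        rows,
    )
    target.commit()


def sync(source, target, watermark):
    """Run one CDC cycle and return the new watermark."""
    rows = extract_changed_rows(source, watermark)
    load_rows(target, rows)
    # Keep the old watermark when nothing changed in this cycle.
    return max((row[2] for row in rows), default=watermark)


# Two in-memory databases stand in for the ERP source and the warehouse (assumption).
source = sqlite3.connect(":memory:")
target = sqlite3.connect(":memory:")
source.execute(SCHEMA)
target.execute(SCHEMA)

source.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "new", "2019-01-01T10:00:00"), (2, "new", "2019-01-01T10:05:00")],
)
source.commit()

# First cycle: the initial watermark predates all rows, so everything is copied.
watermark = sync(source, target, "1970-01-01T00:00:00")

# A single row changes on the source before the next cycle.
source.execute(
    "UPDATE orders SET status = 'shipped', updated_at = '2019-01-01T11:00:00' "
    "WHERE id = 1"
)
source.commit()

# Second cycle: only the changed row is extracted and loaded.
changed = extract_changed_rows(source, watermark)
watermark = sync(source, target, watermark)
```

In this sketch the second cycle moves one row instead of the full table, which is the reduction in extract volume that CDC aims for. A timestamp column alone does not reveal deleted rows, so a scheme like this one would need a separate mechanism for deletes.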

Author information

Corresponding author

Correspondence to Sunaadh Thulasiram.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Thulasiram, S., Ramaiah, N. (2020). Real Time Data Warehouse Updates Through Extraction-Transformation-Loading Process Using Change Data Capture Method. In: Smys, S., Senjyu, T., Lafata, P. (eds) Second International Conference on Computer Networks and Communication Technologies. ICCNCT 2019. Lecture Notes on Data Engineering and Communications Technologies, vol 44. Springer, Cham. https://doi.org/10.1007/978-3-030-37051-0_62

  • DOI: https://doi.org/10.1007/978-3-030-37051-0_62

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-37050-3

  • Online ISBN: 978-3-030-37051-0

  • eBook Packages: Engineering, Engineering (R0)