Abstract
Learning to tackle and optimize data engineering problems can be challenging because each problem has so many dimensions. At the outset of each new problem, you must think about data discovery, wrangling, ingestion, transformation, and data accountability. Data accountability is an umbrella term covering data contracts (strictly defined data definitions) as well as the need to optimize the data ingestion footprint, since data at scale can easily eat into operating costs. Additional concerns relating to data access, lineage, and governance need to be kept in mind as well. Knowing how to draw on your collective knowledge to quickly form a plan of attack for a data problem is a skill that will take you far as a modern data engineer.
Copyright information
© 2022 The Author(s), under exclusive license to APress Media, LLC, part of Springer Nature
About this chapter
Cite this chapter
Haines, S. (2022). A Gentle Introduction to Stream Processing. In: Modern Data Engineering with Apache Spark. Apress, Berkeley, CA. https://doi.org/10.1007/978-1-4842-7452-1_9
DOI: https://doi.org/10.1007/978-1-4842-7452-1_9
Published:
Publisher Name: Apress, Berkeley, CA
Print ISBN: 978-1-4842-7451-4
Online ISBN: 978-1-4842-7452-1
eBook Packages: Professional and Applied Computing, Apress Access Books, Professional and Applied Computing (R0)