Abstract
In the last chapter, you learned to use Apache Spark’s aggregation and analytics functions, from the agg operator, which performs columnar aggregations directly on a grouped dataset, to the analytical window functions, which let you partition a dataset and analyze it one window at a time. Window functions gave you the ability to look back (lag) or forward (lead) across rows relative to your current position in an iteration. You learned to use lag over a window to create row-by-row average deltas, and similar techniques to create running cumulative totals.
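The lag, lead, and running-total ideas recapped above can be illustrated with a minimal sketch in plain Python. This is not Spark code and not the chapter's own example: the sample rows, the column meanings (store, day, sales), and the helper names `lag`, `lead`, `average_delta`, and `window_metrics` are all assumptions made for illustration. It only mimics what a window partitioned by one key and ordered by another would compute.

```python
from itertools import accumulate, groupby
from operator import itemgetter
from statistics import mean

# Hypothetical sample rows: (store, day, sales). Not taken from the book.
rows = [
    ("a", 1, 10.0),
    ("a", 2, 14.0),
    ("a", 3, 11.0),
    ("b", 1, 5.0),
    ("b", 2, 9.0),
]


def lag(values, offset=1, default=None):
    """For each position, return the value `offset` rows earlier, else `default`."""
    return ([default] * offset + values)[: len(values)]


def lead(values, offset=1, default=None):
    """For each position, return the value `offset` rows later, else `default`."""
    return (values + [default] * offset)[offset:]


def average_delta(deltas):
    """Mean of the row-by-row deltas, ignoring rows with no previous value."""
    known = [d for d in deltas if d is not None]
    return mean(known) if known else None


def window_metrics(rows):
    """Emulate a window partitioned by store and ordered by day.

    Returns (store, day, sales, delta_from_previous_row, running_total) tuples.
    """
    out = []
    ordered = sorted(rows, key=itemgetter(0, 1))  # partition key, then order key
    for store, part in groupby(ordered, key=itemgetter(0)):
        part = list(part)
        sales = [r[2] for r in part]
        previous = lag(sales)  # look back one row within the partition
        deltas = [None if p is None else s - p for s, p in zip(sales, previous)]
        running = list(accumulate(sales))  # cumulative total within the partition
        for (_, day, s), d, total in zip(part, deltas, running):
            out.append((store, day, s, d, total))
    return out


result = window_metrics(rows)
```

Each partition restarts both the lag and the running total, which is the behavior a partitioned window gives you: the first row of every store has no previous value, so its delta is left empty rather than borrowed from another store.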
Copyright information
© 2022 The Author(s), under exclusive license to APress Media, LLC, part of Springer Nature
About this chapter
Cite this chapter
Haines, S. (2022). Advanced Analytics with Spark Stateful Structured Streaming. In: Modern Data Engineering with Apache Spark. Apress, Berkeley, CA. https://doi.org/10.1007/978-1-4842-7452-1_13
DOI: https://doi.org/10.1007/978-1-4842-7452-1_13
Publisher Name: Apress, Berkeley, CA
Print ISBN: 978-1-4842-7451-4
Online ISBN: 978-1-4842-7452-1
eBook Packages: Professional and Applied Computing, Apress Access Books, Professional and Applied Computing (R0)
