Semantic Foundations for Deterministic Dataflow and Stream Processing

We propose a denotational semantic framework for deterministic dataflow and stream processing that encompasses a variety of existing streaming models. Our proposal is based on the idea that data streams, stream transformations, and stream-processing programs should be classified using types. The type of a data stream is captured formally by a monoid, an algebraic structure with a distinguished binary operation and a unit. The elements of a monoid model the finite fragments of a stream, the binary operation represents the concatenation of stream fragments, and the unit is the empty fragment. Stream transformations are modeled using monotone functions on streams, which we call stream transductions. These functions can be implemented using abstract machines with a potentially infinite state space, which we call stream transducers. This abstract typed framework of stream transductions and transducers can be used to (1) verify the correctness of streaming computations, that is, that an implementation adheres to the desired behavior, (2) prove the soundness of optimizing transformations, e.g. for parallelization and distribution, and (3) inform the design of programming models and query languages for stream processing. In particular, we show that several useful combinators can be supported by the full class of stream transductions and transducers: serial composition, parallel composition, and feedback composition.


Introduction
Stream processing is the computational paradigm where the input is not presented in its entirety at the beginning of the computation, but instead it is given in an incremental fashion as a potentially unbounded sequence of elements or data items. This paradigm is appropriate in settings where data is created continually in real-time and has to be processed immediately in order to extract actionable insights and enable timely decision-making. Examples of such datasets are streams of business events in an enterprise setting [26], streams of packets that flow through computer networks [37], time-series data that is captured by sensors in healthcare applications [33], etc.
Due to the great variety of streaming applications, there are various proposals for specialized languages, compilers, and runtime systems that deal with the processing of streaming data. Relational database systems and SQL-based languages have been adapted to the streaming setting [1,2,15,16,18,19,32,37,57,91]. Recently, several systems have been developed for the distributed processing of data streams that are based on the distributed dataflow model of computation [6,7,70,86,92,94,108,112,113]. Languages for detecting complex events in distributed systems, which draw on the theory of regular expressions and finite-state automata, have also been proposed [29,40,41,50,53,88,99,111]. The synchronous dataflow formalisms [20,24,28,51,73,107] are based on Kahn's seminal work [59], and they have been used for exposing and exploiting task-level and pipeline parallelism within streaming computations in the context of embedded systems. Several formalisms for the runtime verification of reactive systems have been proposed, many of which are based on variants of Temporal Logic and its timed/quantitative extensions [39,43,52,74,105]. Finally, there is a large collection of languages and systems for reactive programming [34,36,38,46,47,55,68,69,77,89,93,103], which focus on the development of event-driven and interactive applications such as GUIs and web programming.
The aforementioned languages and systems have been successfully used in the application domains for which they were developed. However, each one of them typically introduces a unique variant of the streaming model in terms of: (1) the form of the input and output data, (2) the class of expressible stream-processing computations, and (3) the syntax employed to describe these computations. This has resulted in an enormous proliferation of semantic models for stream processing that are difficult to compare. For this reason, we are interested in identifying a semantic unification of several existing streaming models. This paper introduces a typed semantic framework for reasoning about languages and systems for stream processing. Three key questions are tackled:
1. How do we model streams and what is the form of the data that they carry?
2. How do we capture mathematically the notion of a stream transformation?
3. What is a general programming model for specifying streaming computations?
The first two questions concern the discovery of an appropriate denotational model for streaming computation. The third question concerns the design of programming and query languages, where a key requirement is that the behavior of a streaming program/query admits a precise mathematical description. Existing works have addressed these questions in the context of specific classes of applications.
Here are examples of various perspectives:
− Transductions of strings [8,100,104,110]: A stream is viewed as an unbounded sequence of letters, and a stream transformation is a translation from input sequences to output sequences, which is typically called string/word transduction. These translations are commonly described using finite-state transducers, a class of automata that extend acceptors with output.
− The streaming dataflow model of Gilles Kahn [59,60]: The input and output consist of multiple independent channels that carry unbounded sequences of elements. A transformation is a function from a tuple of input sequences to a tuple of output sequences. Such transformations are specified with dataflow graphs whose nodes describe single-process computations.
− Relational transformations [71]: A stream is an unbounded multiset (bag) of tuples, and a stream transformation is a monotone operator (w.r.t. multiset containment) on multisets. This can be generalized to consider more than one input stream. An interesting subclass of these operators can be described syntactically using monotone relational algebra.
− Processing of time-varying relations [16,17]: A stream is a time-varying finite multiset of tuples, i.e. an unbounded sequence of finite multisets of tuples. In this setting, a stream transformation processes the input in a way that preserves the notion of time: after processing t input multisets (i.e., t time units) the output consists of t output multisets. The query language CQL [16] defines a class of such computations that involve relational and windowing operators.
− Transformations of continuous-time signals [27]: An input stream is a continuous-time signal, that is, a function from the real numbers R to an n-dimensional space Rⁿ. A stream transformation is a mapping from input signals to output signals that is causal, which means that the value of the output at time t depends on the values of the input signal up to (and including) time t. Systems of differential equations can be used to describe classes of such transformations.
We are interested here in a unifying framework that encompasses all the aforementioned concrete instances of streaming models and enables formal reasoning about the composition of streaming computations from different models. In order to achieve this we take an abstract algebraic approach that retains only the essential aspects of stream processing without any unnecessary specialization. The rest of the section outlines our proposal.
At the most fundamental level, stream processing is computation over input that is not given at the beginning in full, but rather is presented incrementally as the computation evolves. Since the input is presented piece by piece, the basic concepts that need to be captured mathematically are: (1) what is a piece or fragment of the input, and (2) how do we extend the input. The most general class of algebraic structures that model these notions is the class of monoids, the collection of algebras that have a distinguished binary associative multiplication operation · and an identity element 1 for this operation. A monoid (A, ·, 1) then constitutes a type of data streams, where the elements of the monoid are all the possible finite stream fragments, the identity 1 ∈ A is the empty stream fragment, and the multiplication operation · : A × A → A models the concatenation of stream fragments. Using monoids, we can organize several notions of data streams using types that describe the form of the data, as well as any invariants or assumptions about them. Monoids encompass the kinds of data streams that we mentioned earlier and many more: strings of letters, linear sequences of data items, tuples of sequences, multisets (bags) of data items, sets of data items, time-varying relations/multisets, (potentially disordered) timestamped sequences of data items, continuous-time signals, and so on.
Stream transformations can be classified according to the type of their input and output streams, which we call a transduction type. They are modeled using monotone functions that map an input stream history (i.e., the fragment of the input stream that has been received from the beginning of the computation until now) to an output stream history (i.e., the fragment of the output stream produced so far). The monotonicity requirement captures the idea that a stream transformation cannot retract the output that has already been emitted. We call such functions stream transductions, and we propose them as a denotational semantic model for stream processing. This model encompasses string transductions, non-diverging Kahn-computable [59] functions on streams, monotone relational transformations [71], the CQL-definable [16] transformations on time-varying relations, and transformations of continuous-time signals [27].
We also introduce an abstract model of computation for stream processing. The considered programs or abstract machines are called stream transducers, and they are organized using transducer types that specify the input and output stream types. A stream transducer processes the input stream in an incremental fashion, by consuming it fragment by fragment. The consumption of an input fragment results in the emission of an output fragment. Our algebraic setting brings in an unavoidable complication compared to the classical theory of word transducers: not all stream transducers describe a stream transduction. This phenomenon has to do with the generalization of the input and output data streams from sequences of atomic data items to elements of arbitrary monoids. A stream transducer has to respect its input/output type, which means that the way in which the input stream is fragmented into pieces and fed to the transducer does not affect the cumulative output. More concisely, this says that the cumulative output is independent from the fragmentation of the input. In order to formalize this notion, we say that a factorization of an input history u is a sequence of stream fragments u₁, u₂, . . . , uₙ whose concatenation is equal to the input history, i.e. u₁ · u₂ · · · uₙ = u. Now, the desired restriction can be described as follows: for every input history w and any two factorizations u₁, . . . , uₘ and v₁, . . . , vₙ of w, the cumulative output that the transducer emits when consuming the fragments u₁, . . . , uₘ in sequence is equal to the cumulative output when consuming the fragments v₁, . . . , vₙ. Fortunately, this complex property can be distilled into an equivalent property on the structure of the stream transducer that we call the coherence property. Every stream transducer that is coherent has a well-defined semantics or denotation in terms of a stream transduction.
We have already outlined the basics of our general framework for streaming computation, which includes: (1) a classification of streams using monoids as types, (2) a denotational semantic model that employs monotone functions from input histories to output histories, and (3) a programming model that generalizes transducers to compute meaningfully on elements of arbitrary monoids. This already allows us to address important questions about specific computations:
− Does a streaming program (transducer) behave as intended? This amounts to checking whether the denotation of the transducer is the desired function.
− Are two streaming programs (transducers) equivalent? This means that their denotations in terms of stream transductions are the same.
The first question is a correctness property. The second question is relevant for semantics-preserving program optimization. We will turn now to the issue of how to modularly specify complex stream transductions and transducers.
One of the most common ways to conceptually organize complex streaming computations is to view the overall computation as the composition of several processes that run independently and are connected with directed communication channels on which streams of data flow. This way of structuring computations is called the dataflow programming model. The simple deterministic parallel model of Karp and Miller [61] is one of the first variants of dataflow, and other notable early works on dataflow models include Dennis's parallel language of actors and links [42] and Kahn's networks [59] of computing stations and communication lines. We investigate three key dataflow combinators for composing stream transductions (i.e., semantic-level) and stream transducers (i.e., program-level): serial composition, parallel composition, and feedback composition. Serial composition is useful for describing pipelines of processing stages, where the output of one stage is streamed as input into the next stage. Parallel composition describes the independent and concurrent computation of two or more components. Feedback composition supports computations whose current output depends on previously produced outputs. We show that our framework supports all these combinators, which facilitate the modular description of complex computations and expose pipeline and task-based parallelism.
Outline of paper. In Sect. 2 we introduce the idea that data streams can be classified using monoids as their types, and in Sect. 3 we propose the semantic model of stream transductions. Sect. 4 is devoted to the description of an abstract model of streaming computation, called a stream transducer, and the main properties that it satisfies. In Sect. 5 we show that our abstract model is closed under a fundamental set of dataflow combinators: serial, parallel, and feedback composition. In Sect. 6 we prove the soundness of a streaming optimizing transformation using denotational arguments and algebraic rewriting. Sect. 7 contains related work, and Sect. 8 concludes with a brief summary of our proposal.

Monoids as Types for Streams
Data streams are typically viewed as unbounded linear sequences of data items, where a data item can be thought of as a small indivisible piece of data. This viewpoint is sufficient for describing many useful semantic and programming models, but it is too concrete and unnecessarily restricts the notion of a data stream. In order to see this, consider a computation where the specific order in which the data items arrive is not relevant. Counting is a trivial example of such a computation, and it can be described operationally as follows: every time a new data item arrives, the counting stream algorithm emits the total number of items that have been seen so far. This can be described mathematically by the function β, given by β(⟨x₁, x₂, . . . , xₙ⟩) = ⟨1, 2, . . . , n⟩, where ⟨x₁, x₂, . . . , xₙ⟩ is the input and ⟨1, 2, . . . , n⟩ is the cumulative output of the computation. For this computation, the input can be meaningfully viewed as a multiset (or bag) instead of a sequence, since the ordering of the data items is irrelevant. This means that multisets can also be viewed as data streams, and in some cases this viewpoint is preferable to the traditional one of "streams = sequences".
The example of the previous paragraph raises an obvious question: What class of mathematical objects can meaningfully serve as data streams? Linear sequences and multisets should certainly be included, but it would be desirable to generalize the notion of streams as much as possible. Recent works explore the idea of generalizing streams to encompass a large class of partial orders [13,85], but we will see later that this approach excludes many useful instances. Stream processing is the computational paradigm where the input is not presented in full at the beginning of the computation, but instead it is given in an incremental fashion or piece by piece. For this reason, there are just three notions that need to be modeled mathematically: (1) a fragment or piece of a data stream, (2) the extension of data with an additional fragment of data, and (3) the empty data stream, i.e. the data seen at the very beginning of the computation. This leads us to consider a kind or type of a data stream as an algebraic structure that satisfies the following: (1) its elements model data stream fragments, (2) it has a distinguished associative operation · for the concatenation of stream fragments, and (3) it has a distinguished element 1 that represents the empty fragment so that 1 is a unit for concatenation. The class of monoids is the largest class of algebraic structures that fulfill these requirements.
More formally, a monoid is an algebraic structure (A, ·, 1), where · : A×A → A is a binary operation called multiplication and 1 ∈ A is a constant called unit, that satisfies the following two axioms: (I) (x · y) · z = x · (y · z) for all x, y, z ∈ A, and (II) 1 · x = x · 1 = x for all x ∈ A. The first axiom says that · is associative, and the second axiom says that 1 is a left and right identity for the · operation. For brevity, we will sometimes write xy to denote x · y.
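To make the algebraic definitions concrete, here is a minimal sketch in Haskell (which we use for all code illustrations below; the representation choices are ours, not part of the formal development) of two stream types as monoids: finite sequences, and finite multisets represented as count maps.

```haskell
import qualified Data.Map.Strict as Map

-- FSeq(A): finite sequences under concatenation; Haskell lists form
-- this monoid, with mempty = [] as the unit and (<>) = (++) as (·).
type FSeq a = [a]

-- FBag(A): finite multisets, represented as maps from elements to
-- positive multiplicities.
newtype FBag a = FBag (Map.Map a Int) deriving (Eq, Show)

instance Ord a => Semigroup (FBag a) where
  FBag m <> FBag n = FBag (Map.unionWith (+) m n)  -- multiset union

instance Ord a => Monoid (FBag a) where
  mempty = FBag Map.empty                          -- the empty multiset
```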
Suppose that A is a monoid. We write A* for the set of all finite sequences of elements of A and ε for the empty sequence. The finite multiplication function π : A* → A is given by π(ε) = 1 and π(x̄ · ⟨y⟩) = π(x̄) · y for x̄ ∈ A* and y ∈ A. For sequences x̄, ȳ ∈ A*, it holds that π(x̄ · ȳ) = π(x̄) · π(ȳ). So, π generalizes the binary multiplication · to a finite but arbitrary number of arguments.
Let (A, ·_A, 1_A) and (B, ·_B, 1_B) be monoids. Their product is the monoid (A × B, ·, 1), where the multiplication operation is given by (x, y) · (x′, y′) = (x ·_A x′, y ·_B y′) for x, x′ ∈ A and y, y′ ∈ B, and the identity is 1 = (1_A, 1_B).
A monoid homomorphism from a monoid (A, ·, 1) to a monoid (B, ·, 1) is a function h : A → B that commutes with the monoid operations, that is, h(1) = 1 and h(x · y) = h(x) · h(y) for all x, y ∈ A.
As we discussed earlier, we can think of a monoid as a type of data streams. The elements of the monoid represent finite stream fragments. The multiplication operation · models the concatenation of stream fragments, and the unit of the monoid is the empty stream fragment.
For a monoid (A, ·, 1) we define the binary relation ⊑ as follows: for all x, y ∈ A, we put x ⊑ y if and only if xz = y for some z ∈ A. Since the relation ⊑ is reflexive and transitive, we call it the prefix preorder for the monoid A. The unit 1 is a minimal element w.r.t. the ⊑ relation: 1 · x = x and hence 1 ⊑ x for every x ∈ A. Define the function prefix : A × A → P(A) as follows: prefix(x, y) = {z ∈ A | xz = y} for all x, y ∈ A. This implies that x ⊑ y iff prefix(x, y) ≠ ∅. In other words, prefix(x, y) is the set of all witnesses for x ⊑ y. A partial function ∂ : A × A ⇀ A is said to be a prefix witness function (or simply a witness function) for the monoid A if its domain is equal to ⊑ and it satisfies: ∂(x, y) ∈ prefix(x, y) for every x, y ∈ A with x ⊑ y. We can express this equivalently by requiring that the type of the function ∂ is ∏_{(x,y)∈⊑} prefix(x, y).
We say that a monoid A satisfies the left cancellation property if xy = xz implies y = z for all x, y, z ∈ A. In this case we say that A is left-cancellative. If A is left-cancellative, then it has a unique prefix witness function, because x y implies that there is a unique z with xz = y.
Example 1 (Finite Sequences). Consider the algebra (FSeq(A), ·, ε), where FSeq(A) is the set A* of all finite words (strings) over a set A, · is word concatenation, and ε is the empty word. This algebra is a monoid. In fact, it is the free monoid with generators A. For u, v ∈ A*, u ⊑ v iff the word u is a prefix of the word v. There is a unique prefix witness function, because for every x, y ∈ A* with x ⊑ y there is a unique z ∈ A* such that xz = y.
Example 2 (Finite Multisets). Consider the algebra (FBag(A), ∪, ∅), where FBag(A) is the set of all finite multisets (bags) over a set A, ∪ is multiset union, and ∅ is the empty multiset. This algebra is a monoid. In fact, it is the free commutative monoid with generators A. It is also left-cancellative. For x, y ∈ FBag(A), x ⊑ y iff x is contained in y. So, we also use the notation ⊆ instead of ⊑. There is a unique prefix witness function, because for every x, y ∈ FBag(A) with x ⊆ y there is a unique z ∈ FBag(A) such that x ∪ z = y.
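As an illustration of unique prefix witnesses, the following hedged sketch computes the witness z for x ⊑ y in the monoids of Examples 1 and 2 (bags again as count maps); both computations succeed exactly when x is a prefix of y, and left-cancellativity makes the result unique.

```haskell
import Data.List (stripPrefix)
import qualified Data.Map.Strict as Map

-- FSeq(A): the unique z with x ++ z == y, if it exists.
seqWitness :: Eq a => [a] -> [a] -> Maybe [a]
seqWitness = stripPrefix

-- FBag(A): the unique z with x ∪ z == y, i.e. multiset difference,
-- defined when every multiplicity in x is bounded by the one in y.
bagWitness :: Ord a => Map.Map a Int -> Map.Map a Int -> Maybe (Map.Map a Int)
bagWitness x y
  | Map.isSubmapOfBy (<=) x y =
      Just (Map.filter (> 0) (Map.unionWith subtract x y))
  | otherwise = Nothing
```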
Example 3 (Finite Sets). Let A be a set. Consider the algebra (FSet(A), ∪, ∅), where FSet(A) is the set of all finite subsets of A, ∪ is set union, and ∅ is the empty set. This algebra is a monoid. In fact, it is the free commutative idempotent monoid with generators A. For x, y ∈ FSet(A), x ⊑ y iff x is contained in y. So, we also use the notation ⊆ instead of ⊑. Since union is idempotent, a prefix x ⊆ y generally has several witnesses: for x = {a} and y = {a, b}, both z = {b} and z = {a, b} satisfy x ∪ z = y. This means that FSet(A) has several distinct prefix witness functions.

Example 4 (Finite Maps). Let K be a set of keys, and V be a set of values. Consider the algebra (FMap(K, V), ·, ∅), where FMap(K, V) is the set of all partial maps K ⇀ V with a finite domain, ∅ is the partial map with empty domain, and · is defined as follows: (f · g)(k) = g(k) if g(k) is defined; (f · g)(k) = f(k) if g(k) is undefined and f(k) is defined; and (f · g)(k) is undefined, otherwise, for every f, g ∈ FMap(K, V) and k ∈ K. We leave it to the reader to check that this algebra is a monoid. Since later entries overwrite earlier ones, a prefix f ⊑ g generally has several witnesses. This means that FMap(K, V) has several distinct prefix witness functions.

Example 5 (Bounded-Domain Continuous-Time Signals). Let A be an arbitrary set, and R be the set of real numbers. A bounded-domain continuous-time signal with values in A is a function f : [0, d) → A, where d ≥ 0 is a real number that we call the duration of the signal.
We define the concatenation operation · for such signals as follows: for f : [0, d) → A and g : [0, e) → A, the signal f · g : [0, d + e) → A is given by (f · g)(t) = f(t) for t ∈ [0, d) and (f · g)(t) = g(t − d) for t ∈ [d, d + e). We write BSig(A) for the set of all these bounded-domain continuous-time signals. The unit signal is the unique function of type [0, 0) → A, whose domain of definition is empty. Observe that BSig(A) is a monoid. For signals f : [0, d) → A and g : [0, e) → A, f ⊑ g iff d ≤ e and f agrees with g on [0, d). There is a unique prefix witness function, because for every f, g ∈ BSig(A) with f ⊑ g there is a unique h ∈ BSig(A) such that f · h = g.
Example 6 (Timed Finite Sequences). We write N to denote the set of natural numbers (non-negative integers). A timed sequence over A is an alternating sequence s₀ a₁ s₁ a₂ . . . aₙ sₙ, where sᵢ ∈ N and aᵢ ∈ A for every i. The occurrences s₀, s₁, . . . are called time punctuations and indicate the passage of time. So, the set of all timed sequences over A is equal to TFSeq(A) = N · (A · N)*. We define the fusion product ⋄ of timed sequences as follows: the product x ⋄ y concatenates x and y and merges the adjacent time punctuations by adding them, i.e. (s₀ a₁ . . . aₘ sₘ) ⋄ (t₀ b₁ . . . bₙ tₙ) = s₀ a₁ . . . aₘ (sₘ + t₀) b₁ . . . bₙ tₙ. The unit timed sequence is the singleton sequence 0. The algebra (TFSeq(A), ⋄, 0) is easily shown to be a monoid. There is a unique prefix witness function, because for all x, y ∈ TFSeq(A) with x ⊑ y there is a unique z ∈ TFSeq(A) s.t. x ⋄ z = y.

Example 7 (Finite Time-Varying Multisets).
A finite time-varying multiset over A is a partial function f : N ⇀ FBag(A) whose domain is equal to [0..n] = {0, . . . , n} for some integer n ≥ 0. We also use the notation f : [0..n] → FBag(A) to convey this information regarding the domain of f. We define the concatenation operation · for finite time-varying multisets as follows: for f : [0..m] → FBag(A) and g : [0..n] → FBag(A), the concatenation f · g : [0..m+n] → FBag(A) is given by (f · g)(i) = f(i) for i < m, (f · g)(m) = f(m) ∪ g(0), and (f · g)(m + i) = g(i) for 1 ≤ i ≤ n. We write TFBag(A) to denote the set of all finite time-varying multisets over A.
The unit time-varying multiset Id : [0..0] → FBag(A) is given by Id(0) = ∅. We leave it to the reader to also verify that (f · g) · h = f · (g · h) for finite time-varying multisets f, g and h. So, the set TFBag(A) together with · and Id is a monoid. It is not difficult to show that it is left-cancellative. Let us consider now the prefix preorder on finite time-varying multisets: for f : [0..m] → FBag(A) and g : [0..n] → FBag(A), we have f ⊑ g iff m ≤ n, f(i) = g(i) for every i < m, and f(m) ⊆ g(m).
The examples above highlight the variety of mathematical objects that can be meaningfully viewed as streams. These streams can be organized elegantly using the structure of monoids. The sequences of Example 1, the multisets of Example 2, and the finite time-varying multisets of Example 7 can be described equivalently in terms of the partial orders of [13,85], which have also been suggested as an approach to unify notions of streams. Using partial orders it is also possible to model the timed finite sequences of Example 6, but only with a non-succinct encoding: every time punctuation t ∈ N is encoded with a sequence 11 . . . 1 of t punctuations, one for each time unit. Partial orders cannot encode the sets of Example 3, the maps of Example 4, or the signals of Example 5. Informally, the reason for this is that partial orders can only encode commutation equations, which are insufficient for objects such as sets and maps.

Stream Transductions
In this section we will introduce stream transductions as semantic denotational models of stream transformations. At any given point in a streaming computation, we have seen an input history (the part of the stream from the beginning of the computation until now) and we have produced an output history (the cumulative output that has been emitted from the beginning until now). As a first approximation, a streaming computation can be described mathematically by a function β : A → B, where A and B are monoids that describe the input and output type respectively, which maps an input history x ∈ A to an output history β(x) ∈ B. The function β has to be monotone because the output is cumulative, which means that it can only be extended with more output items as the computation proceeds. An equivalent way to understand the monotonicity property is that it captures the idea that any output that has already been emitted cannot be retracted. Since β takes an entire input history as its argument, it can describe stateful computations, where the output that is emitted at every step potentially depends on the entire input history.

Definition 8 (Stream Transduction & Incremental Form).
Let A and B be monoids. A function β : A → B is said to be monotone (with respect to the prefix preorder) if x ⊑ y implies β(x) ⊑ β(y) for all x, y ∈ A. For a monotone β : A → B, we say that the partial function µ is a monotonicity witness function if it maps elements x, y ∈ A and z ∈ prefix(x, y) witnessing that x ⊑ y to a witness µ(x, y, z) ∈ prefix(β(x), β(y)) for β(x) ⊑ β(y). That is, we require that the type of µ is ∏_{x,y∈A} prefix(x, y) → prefix(β(x), β(y)). So, the defining property of µ is that for all x, y, z ∈ A with xz = y it holds that β(x) · µ(x, y, z) = β(y). For brevity, we will sometimes write µ(x, z) to denote µ(x, xz, z). The defining property of µ is then written as β(x) · µ(x, z) = β(xz) for all x, z ∈ A.
A stream transduction from A to B is a function β : A → B that is monotone with respect to the prefix preorder, together with a monotonicity witness function µ : ∏_{x,y∈A} prefix(x, y) → prefix(β(x), β(y)). We write STrans(A, B) to denote the set of all stream transductions from A to B.
The incremental form of a stream transduction ⟨β, µ⟩ : STrans(A, B) is the function F(β, µ) : A* → B* that maps a factorization of the input history to the initial output followed by the corresponding output increments: F(β, µ)(⟨x₁, x₂, . . . , xₙ⟩) = ⟨β(1), µ(1, x₁), µ(x₁, x₂), . . . , µ(x₁ · · · xₙ₋₁, xₙ)⟩; in particular, F(β, µ)(ε) = ⟨β(1)⟩. Consider the stream transduction ⟨β, µ⟩ : STrans(A, B) and the input fragments x, y ∈ A. Notice that µ(x, y) gives the output increment that the streaming computation generates when the input history x is extended into xy. For an arbitrary output monoid B, the output increment µ(x, y) is generally not uniquely determined by β(x) and β(xy). This means that the monotonicity witness function µ generally provides some additional information about the streaming computation that cannot be obtained purely from β. However, if the output monoid B is left-cancellative then there is a unique function µ that witnesses the monotonicity of β.
Suppose that ⟨β, µ⟩ : STrans(A, B) is a stream transduction. The incremental form F(β, µ) of the transduction ⟨β, µ⟩ describes the stream transformation in explicit input/output increments. For example, F(β, µ)(⟨x, y⟩) = ⟨β(1), µ(1, x), µ(x, y)⟩.

Example 9 (Counting). Let A be an arbitrary set. We will describe a streaming computation whose input type is the monoid FBag(A) and whose output type is the monoid FSeq(N). The informal operational description is as follows: there is no initial output, and every time a new data item arrives the computation emits the total number of items seen so far. The formal description is given by the stream transduction β : FBag(A) → FSeq(N), defined by β(∅) = ε and β(x) = ⟨1, 2, . . . , |x|⟩ for every non-empty x ∈ FBag(A), where |x| denotes the size of the multiset x. It is easy to see that β is monotone. Since FSeq(N) is left-cancellative, the monotonicity witness function is uniquely determined: µ(x, ∅) = ε and µ(x, y) = ⟨|x| + 1, . . . , |x| + |y|⟩ when y ≠ ∅.
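A hedged sketch of Example 9, with the input bag flattened to a list whose order the functions deliberately ignore; beta maps a history to the cumulative output and mu gives the unique output increment.

```haskell
-- beta: the cumulative output after consuming the history x.
beta :: [a] -> [Int]
beta x = [1 .. length x]

-- mu: the output increment emitted when the history x grows by the
-- fragment y; note that beta x ++ mu x y == beta (x ++ y).
mu :: [a] -> [a] -> [Int]
mu x y = [length x + 1 .. length x + length y]
```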
Example 10 (Per-Key Aggregation). Let K be a set of keys, and V be a set of values. The elements of K × V are typically called key-value pairs. Suppose that op : V × V → V is an associative and commutative operation. So, op can be generalized to an aggregation operation that takes non-empty finite multisets over V as input. We will describe a streaming computation whose input type is the monoid FBag(K × V) and whose output type is the monoid FMap(K, V). Informally, every time an item (k, v) is processed, the output map is updated so that the k-indexed entry contains the aggregate (using op) of all values seen so far for the key k. The formal description of this computation is given by the stream transduction β : FBag(K × V) → FMap(K, V), defined by β(x) = {k → op(x|ₖ) | k appears in x} for every multiset x, where x|ₖ denotes the multiset that results from x by keeping only the pairs whose key is equal to k. That is, the domain of β(x) is equal to dom(β(x)) = {k ∈ K | k appears in x} and β(x)(k) = op(x|ₖ) for every k that appears in x. The monotonicity witness function µ is defined as follows: µ(x, y) is equal to the restriction of the map β(x ∪ y) to the set of all keys that appear in y.
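A sketch of Example 10 for the special case op = (+), with bags of key-value pairs represented as lists of pairs. Note that Data.Map's built-in union is left-biased, whereas the map monoid of Example 4 lets later entries overwrite earlier ones, so the witness is placed on the left when recombining.

```haskell
import qualified Data.Map.Strict as Map

-- beta: aggregate all values seen so far, per key (here op = (+)).
beta :: Ord k => [(k, Int)] -> Map.Map k Int
beta = Map.fromListWith (+)

-- mu: the entries of beta (x ++ y) restricted to the keys touched by y.
mu :: Ord k => [(k, Int)] -> [(k, Int)] -> Map.Map k Int
mu x y = Map.restrictKeys (beta (x ++ y)) (Map.keysSet (beta y))

-- Defining property, with the overwriting union of Example 4:
-- beta (x ++ y) == Map.union (mu x y) (beta x)   (later entries win).
```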
We saw in Sect. 2 that we can form products of monoids: if A and B are monoids, then so is A × B. Intuitively, we can think of A × B as the data stream type that involves two parallel and independent channels: one channel for streams of type A and another channel for streams of type B.
Example 11 (Merging of Multiple Input Channels). Given a set A, we want to describe a transformation with two input channels of type FBag(A) and one output channel of type FBag(A). The monotone function β : FBag(A) × FBag(A) → FBag(A), given by β(x, y) = x ∪ y for multisets x and y, describes the merging of the two input substreams. Operationally, whenever a new data item arrives (regardless of channel) it is propagated to the output channel. Since FBag(A) is left-cancellative, the monotonicity witness function is uniquely determined: µ((x₁, y₁), (x₂, y₂)) = (x₂ ∪ y₂) \ (x₁ ∪ y₁) for all x₁, y₁, x₂, y₂ ∈ FBag(A) with x₁ ⊆ x₂ and y₁ ⊆ y₂.
Example 13 (Split in Batches). Let Σ = {a, b} be an alphabet of symbols. Suppose that we want to describe the decomposition of an element of Σ* into batches of size exactly 3. We describe this using two functions r₁ : Σ* → FSeq(Σ*) and r₂ : Σ* → Σ*. Informally, r₁ gives the sequence of full batches of size 3, and r₂ gives the remaining incomplete batch. For example, r₁(abbaabba) = ⟨abb, aab⟩ and r₂(abbaabba) = ba.
This idea of splitting in batches can be generalized from the monoid Σ* to an arbitrary monoid A. We say that a splitter for A is a pair r = (r₁, r₂) of functions r₁ : A → FSeq(A) and r₂ : A → A satisfying the following properties: (1) the equality x = π(r₁(x)) · r₂(x) says that r₁ and r₂ decompose x ∈ A, (2) r₁(1_A) = ε says that the unit cannot be decomposed, and (3) r₁(x · y) = r₁(x) · r₁(r₂(x) · y) and (4) r₂(x · y) = r₂(r₂(x) · y) describe how to decompose the concatenation of two monoid elements. The first two properties imply that r₂(1_A) = 1_A. The third property implies that r₁ is monotone. Define µ(x, y) = r₁(r₂(x) · y) for x, y ∈ A and observe that r₁(x) · µ(x, y) = r₁(xy). It follows that split(r) = ⟨r₁, µ⟩ is a stream transduction of type STrans(A, FSeq(A)).
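A sketch of a splitter on FSeq(Σ) that generalizes Example 13 from batches of size 3 to size n; the first component of the result plays the role of r₁ and the second that of r₂, and the splitter laws (1)-(4) can be checked by hand on small inputs.

```haskell
-- splitBatches n xs returns (full batches of size n, incomplete remainder).
splitBatches :: Int -> [a] -> ([[a]], [a])
splitBatches n xs
  | length xs < n = ([], xs)                  -- no full batch yet
  | otherwise =
      let (batch, rest)  = splitAt n xs
          (bs, leftover) = splitBatches n rest
      in (batch : bs, leftover)

-- e.g. splitBatches 3 "abbaabba" == (["abb","aab"], "ba")
```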
Our denotational model of a stream transformation uses a monotone function whose domain is the monoid of (finite) input histories. We emphasize that such a denotation can also describe the transformation of an infinite stream. To illustrate this point in simple terms, consider a monotone function β : A* → B*, where A (resp., B) is the type of input (resp., output) items. This function extends uniquely to an ω-continuous function β∞ : A∞ → B∞, where A∞ is the set of finite and infinite sequences over A, as follows: β∞(a₀ a₁ a₂ . . .) is equal to the supremum of the chain β(ε) ⊑ β(a₀) ⊑ β(a₀ a₁) ⊑ . . .

Model of Computation
We will present an abstract model of computation for stream processing, where the input and output data streams are elements of monoids A and B respectively. A streaming algorithm is described by a transducer, a kind of automaton that produces output values. We consider transducers that can have a potentially infinite state space, which we denote by St. The computation starts at a distinguished initial state init ∈ St, and the initialization triggers some initial output o ∈ B. The computation then proceeds by consuming the input stream incrementally, i.e. fragment by fragment. One step of the computation from a state s ∈ St involves consuming an input fragment x ∈ A, producing an output increment out(s, x) ∈ B and transitioning to the next state next(s, x) ∈ St.
Definition 14 (Stream Transducer). Let A, B be monoids. A stream transducer with inputs from A and outputs from B is a tuple G = (St, init, o, next, out), where St is a nonempty set of states, init ∈ St is the initial state, o ∈ B is the initial output, next : St × A → St is the transition function, and out : St × A → B is the output function. We write G (A, B) to denote the set of all stream transducers with inputs from A and outputs from B.
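Definition 14 can be transcribed almost verbatim as a record; this sketch is reused in the later code fragments, and the field names are ours.

```haskell
-- A stream transducer with states s, input fragments from the monoid a,
-- and output fragments from the monoid b (Definition 14).
data Transducer s a b = Transducer
  { initState :: s           -- init: the initial state
  , initOut   :: b           -- o: the initial output
  , next      :: s -> a -> s -- transition function
  , out       :: s -> a -> b -- output function
  }
```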
We define the generalized transition function gnext : St × A* → St by induction: gnext(s, ε) = s and gnext(s, ⟨x⟩ · ȳ) = gnext(next(s, x), ȳ) for all s ∈ St, x ∈ A and ȳ ∈ A*. A state s ∈ St is said to be reachable in G if there exists a sequence x̄ ∈ A* such that gnext(init, x̄) = s.
We define the generalized output function gout : St × A* → B by induction on the second argument: gout(s, ε) = 1 and gout(s, ⟨x⟩ · ȳ) = out(s, x) · gout(next(s, x), ȳ) for all s ∈ St, x ∈ A and ȳ ∈ A*. The extended output function eout : St × A* → B* is defined similarly: eout(s, ε) = ε and eout(s, ⟨x⟩ · ȳ) = ⟨out(s, x)⟩ · eout(next(s, x), ȳ) for all s ∈ St, x ∈ A and ȳ ∈ A*.

Example 15 (Transducer for Counting). Recall the counting streaming computation that was described in Example 9. We will describe a stream transducer that implements the counting computation. The input monoid is FBag(A) and the output monoid is FSeq(N). The state space is St = N, because the transducer has to maintain a counter that remembers the number of data items seen so far. The initial state is init = 0 and the initial output is o = ε. The transition function increments the counter, i.e. next(s, x) = s + |x| for every s ∈ St and x ∈ FBag(A). The output function is defined by out(s, ∅) = ε and out(s, x) = ⟨s + 1, . . . , s + |x|⟩ for a nonempty multiset x. The type of this transducer is G(FBag(A), FSeq(N)).

Example 17 (Flatten). For a monoid A, the transducer Flatten : G(FSeq(A), A) flattens a stream of sequences over A into the stream of their products in A. The state space is a singleton St = {s}. The initial output is o = 1_A, the transition function is uniquely determined by next(s, x) = s, and the output function is given by out(s, ⟨a₁, . . . , aₙ⟩) = a₁ · · · aₙ.
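Example 15 instantiates the Transducer record from the sketch above, again with bag fragments represented as lists.

```haskell
-- Counting: the state is the number of items seen so far.
counting :: Transducer Int [a] [Int]
counting = Transducer
  { initState = 0
  , initOut   = []
  , next      = \s x -> s + length x
  , out       = \s x -> [s + 1 .. s + length x]
  }
```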

Example 18 (Split in Batches). For a monoid A and a splitter r = (r₁, r₂) for A (Example 13), we describe a transducer Split(r) = (St, init, o, next, out) that implements the transduction split(r) : STrans(A, FSeq(A)). We define St = A, because the transducer needs to remember the remainder of the cumulative input that does not yet form a complete batch, and init = 1_A. The initial output o = ε is the empty sequence. The transition and output functions are defined by next(s, x) = r₂(s · x) and out(s, x) = r₁(s · x).

Definition 14 does not capture a key requirement for streaming computations over monoids, namely that the cumulative output of a transducer G should be independent of the particular way in which the input history is split into the fragments that are fed to it. More precisely, suppose that w is an input history that can be fragmented (factorized) in two different ways: w = u₁ · u₂ · · · uₘ and w = v₁ · v₂ · · · vₙ. Then, the cumulative output of the transducer G when consuming the sequence of fragments (factorization) u₁, u₂, . . . , uₘ should be equal to the cumulative output when consuming v₁, v₂, . . . , vₙ. In Definition 20 below, we formulate a set of coherence conditions that a transducer must adhere to in order to satisfy this "factorization independence" requirement.

Definition 19 (Bisimulation & Bisimilarity). Let G = (St, init, o, next, out) be a transducer with inputs from A and outputs from B. A relation R ⊆ St × St is a bisimulation for G if for every s, t ∈ St and x ∈ A we have that (s, t) ∈ R implies out(s, x) = out(t, x) and (next(s, x), next(t, x)) ∈ R. We will also use the notation sRt to mean (s, t) ∈ R. We say that the states s, t ∈ St are bisimilar, denoted s ∼ t, if there exists a bisimulation R for G such that sRt. The relation ∼ is called the bisimilarity relation for G.
It is well-known that the bisimilarity relation for G is an equivalence relation (reflexive, symmetric, and transitive), and for all s, t ∈ St and x ∈ A it satisfies the following extension property : s ∼ t implies that next(s, x) ∼ next(t, x). It can then be easily seen that the bisimilarity relation is a bisimulation. In fact, it is the largest bisimulation for the transducer G.
Definition 20 (Coherence). Suppose G = (St, init, o, next, out) : G(A, B) is a stream transducer. We say that G is coherent if it satisfies the following conditions for every reachable state s ∈ St and all x, y ∈ A:
(N1) next(s, 1) ∼ s;
(N2) next(next(s, x), y) ∼ next(s, xy);
(O1) out(s, 1) = 1;
(O2) out(s, x) · out(next(s, x), y) = out(s, xy).
The coherence conditions of Definition 20 capture the idea that the transducer behaves in "essentially the same way" regardless of how the input is split into fragments. For example, the condition (N2) says that the two-step transition init → s₁ → s₂ (consuming the fragments x and then y) and the single-step transition init → t₁ (consuming the fragment xy) end up in states (s₂ and t₁) that will have exactly the same behavior in the subsequent computation. In other words, it does not matter whether the input xy was fed to the transducer as a single fragment xy or as a sequence ⟨x, y⟩ of two fragments.
Let (A, ·, 1) be a monoid. A factorization of an element x ∈ A is a sequence x₁, . . . , xₙ of elements of A such that x = x₁ · · · xₙ. In particular, the empty sequence ε ∈ A* is a factorization of 1. In other words, x̄ ∈ A* is a factorization of x ∈ A if π(x̄) = x.
Theorem 21 (Factorization Independence). Let G = (St, init, o, next, out) be a stream transducer of type G(A, B). If G is coherent, then for every x ∈ A and every factorization x̄ ∈ A* of x we have that o · gout(init, x̄) = o · out(init, x).
Theorem 21 says that the condition of coherence guarantees a basic correctness property for stream transducers: the output that they produce does not depend on the specific way in which the input was partitioned into fragments.
For a transducer G = (St, init, o, next, out) we define the function ⟦G⟧ : A* → B* as follows: ⟦G⟧(x̄) = ⟨o⟩ · eout(init, x̄) for every x̄ ∈ A*. We call ⟦G⟧ the interpretation or denotation of G. The definition of ⟦G⟧ implies that ⟦G⟧(ε) = ⟨o⟩ and the following holds for every x̄ ∈ A* and y ∈ A: ⟦G⟧(x̄ · ⟨y⟩) = ⟦G⟧(x̄) · ⟨out(gnext(init, x̄), y)⟩. When G is coherent, Theorem 21 says that the denotation gives the same cumulative output for any two factorizations of the input. We say that the transducers G₁ and G₂ are equivalent if their denotations are equal, i.e. ⟦G₁⟧ = ⟦G₂⟧.

Theorem 23 (Coherence and Denotation). Let G = (St, init, o, next, out) : G(A, B) be a stream transducer. Then, G is coherent if and only if it implements some stream transduction ⟨β, µ⟩ : STrans(A, B).

Proof (sketch of one direction). Suppose that G is coherent. Define β(x) = o · out(init, x) and µ(x, y) = out(next(init, x), y) for all x, y ∈ A. For any x, y ∈ A, we have to establish that β(x) · µ(x, y) = β(xy). This follows immediately from Part (O2) of the coherence property for G. So, ⟨β, µ⟩ is a stream transduction. It remains to prove that G implements ⟨β, µ⟩, that is, ⟦G⟧(x̄) = F(β, µ)(x̄) for every x̄ ∈ A*.
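Continuing the running Haskell sketch, eout and the denotation can be written directly from their inductive definitions; the cumulative output of Theorem 21 is then the product of the emitted increments.

```haskell
-- eout: the sequence of output increments along a factorization.
eout :: Transducer s a b -> s -> [a] -> [b]
eout _ _ []       = []
eout g s (x : xs) = out g s x : eout g (next g s x) xs

-- denote: the initial output followed by the increments.
denote :: Transducer s a b -> [a] -> [b]
denote g xs = initOut g : eout g (initState g) xs

-- The cumulative output o · gout(init, xs) of Theorem 21:
cumulative :: Monoid b => Transducer s a b -> [a] -> b
cumulative g = mconcat . denote g
```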
Theorem 23 provides justification for our definition of the coherence property for stream transducers (recall Definition 20). It says that the definition is exactly appropriate, because it is a necessary and sufficient condition for a stream transducer to have a stream transduction as its denotation. In other words, the coherence property characterizes the transducers that have a well-defined denotational semantics in terms of transductions. It offers this guarantee of correctness without limiting their expressive power as implementations of transductions.
Theorem 24 (Expressive Completeness). Let A and B be monoids, and ⟨β, µ⟩ be a stream transduction in STrans(A, B). There exists a coherent stream transducer that implements ⟨β, µ⟩.
Proof. Recall from Definition 8 that the monotonicity witness function µ satisfies the following property: β(x) · µ(x, y) = β(xy) for every x, y ∈ A. Now, we define the transducer G = (St, init, o, next, out) as follows: St = A, init = 1, o = β(1), next(s, x) = s · x and out(s, x) = µ(s, x) for every state s ∈ St and input x ∈ A. The following properties hold for every s ∈ St and ⟨x₁, . . . , xₙ⟩ ∈ A*: (1) gnext(s, ⟨x₁, . . . , xₙ⟩) = s · x₁ · · · xₙ, and (2) eout(s, ⟨x₁, . . . , xₙ⟩) = ⟨µ(s, x₁), µ(s · x₁, x₂), . . . , µ(s · x₁ · · · xₙ₋₁, xₙ)⟩. Both these properties are shown by induction on the sequence ⟨x₁, . . . , xₙ⟩. It follows that ⟦G⟧(x̄) = ⟨o⟩ · eout(init, x̄) = F(β, µ)(x̄) for every x̄ ∈ A*. So, G implements the transduction ⟨β, µ⟩. Finally, G is coherent by Theorem 23.
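The construction in the proof of Theorem 24 is short enough to transcribe: the canonical transducer stores the input history consumed so far (a sketch over the earlier Transducer record).

```haskell
-- canonical beta mu: states are input histories; the machine replays
-- the monotonicity witness as its output function (proof of Theorem 24).
canonical :: Monoid a => (a -> b) -> (a -> a -> b) -> Transducer a a b
canonical beta mu = Transducer
  { initState = mempty        -- init = 1
  , initOut   = beta mempty   -- o = beta(1)
  , next      = (<>)          -- next(s, x) = s · x
  , out       = mu            -- out(s, x) = mu(s, x)
  }
```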
Theorem 24 assures us that the abstract computational model of coherent stream transducers is expressive enough to implement any stream transduction. For this reason, we will be using stream transducers as the basic programming model for describing streaming computations.

Combinators for Deterministic Dataflow
We consider four dataflow combinators: (1) the lifting of pure morphisms to streaming computations, (2) serial composition for exposing pipeline parallelism, (3) parallel composition for exposing task-based parallelism, and (4) feedback composition for describing computations whose current output depends on previously produced output. The combinators are defined both for stream transductions (semantic objects) and for stream transducers (programs). Table 1 shows the definitions. The lifting of pure morphisms is implemented with a stateless transducer (i.e., the state space is a singleton set). Both parallel and serial composition are implemented using a product construction on transducers. In the case of parallel composition, each component computes independently. In the case of serial composition, the output of the first component is passed as input to the second component. In the case of feedback composition, the computation proceeds in well-defined rounds in order to prevent divergence.
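The product constructions for Par and Serial admit direct sketches over the Transducer record. Table 1 itself is not reproduced in this text, so the following are our renderings of the constructions described above, not the paper's exact definitions; in the serial case, the initial output of the first machine is fed to the second machine during initialization.

```haskell
-- Par: both components compute independently on paired fragments.
par :: Transducer s a b -> Transducer t c d
    -> Transducer (s, t) (a, c) (b, d)
par g1 g2 = Transducer
  { initState = (initState g1, initState g2)
  , initOut   = (initOut g1, initOut g2)
  , next      = \(s, t) (x, y) -> (next g1 s x, next g2 t y)
  , out       = \(s, t) (x, y) -> (out g1 s x, out g2 t y)
  }

-- Serial: the output increments of g1 are streamed into g2.
serial :: Semigroup c
       => Transducer s a b -> Transducer t b c -> Transducer (s, t) a c
serial g1 g2 = Transducer
  { initState = (initState g1, next g2 (initState g2) (initOut g1))
  , initOut   = initOut g2 <> out g2 (initState g2) (initOut g1)
  , next      = \(s, t) x -> (next g1 s x, next g2 t (out g1 s x))
  , out       = \(s, t) x -> out g2 t (out g1 s x)
  }
```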
We prove a precise correspondence between the semantics-level and program-level combinators for all cases: lifting (Proposition 27), parallel composition (Proposition 28), serial composition (Proposition 29), and feedback composition (Proposition 30). These are essentially correctness properties for the implementations of the combinators Lift, Par, Serial, Loop. They establish that our typed framework is appropriate for the modular specification of complex streaming computations, as it can support composition constructs that are essential for parallelization and distribution. In what follows, let A₁, A₂, B₁, B₂ be monoids, ⟨β₁, µ₁⟩ : STrans(A₁, B₁) and ⟨β₂, µ₂⟩ : STrans(A₂, B₂) be transductions, and G₁ : G(A₁, B₁) and G₂ : G(A₂, B₂) be transducers.
Proof. Part (2) follows easily from Part (1) and Theorem 23. In order to prove Part (1) we have to first establish a number of preliminary facts. We define the function M₂ : A* → A* as follows: M₂(ε) = ε, M₂(⟨x⟩) = ⟨x⟩ for x ∈ A, and M₂(⟨x, y⟩ · z̄) = ⟨x · y⟩ · z̄ for x, y ∈ A and z̄ ∈ A*. We write G to denote the composition of G₁ and G₂.
Let us give an example of how to construct complex computations from simpler ones using the dataflow combinators. Let A, B be sets and op : A → B be a function. We want to describe a streaming computation with two input channels, both of type FBag(A), and one output channel of type FBag(B). The computation transforms both input channels in the same way, namely by applying the function op to each element. This gives two output substreams, both of type FBag(B), that are merged into the output stream. The function op : A → B lifts to a monoid homomorphism op : FBag(A) → FBag(B), given by op(x) = {op(a) | a ∈ x} for every multiset x. The streaming computation described previously corresponds to the dataflow graph whose two source nodes apply the lifted op to the respective input channels and feed their results into a final merge node, where merge is described in Example 11.

We will now consider the feedback combinator, which introduces cycles in the dataflow graph. One consequence of cyclic graphs in the style of Kahn-MacQueen [60] is that divergence can be introduced, that is, a finite amount of input can cause an operator to enter an infinite loop. Suppose that the output of Merge is fed back into its second input channel, and that the singleton input {a} is fed to the first input channel of Merge. This will cause Merge to emit {a}, which will be sent again to the second input channel of Merge.
Intuitively, this will cause the computation to enter an infinite loop (divergence) of consuming and emitting {a}. This behavior is undesirable in systems that process data streams, because divergence can make the system unresponsive. For this reason, we will consider here a form of feedback that eliminates this problem by ensuring that the computation of a feedback loop proceeds in a sequence of rounds. This avoids divergence, because the computation always makes progress by moving from one round to the next, as dictated by the input data. We describe this organization in rounds by requiring that the programmer specifies a splitter (recall Example 18). The splitter decomposes the input stream into batches, and one round of computation for the feedback loop corresponds to consuming one batch of data, generating the corresponding output batch, and sending the output batch along the feedback loop to be available for the next round of processing. This form of feedback allows flexibility in specifying what constitutes a single batch (and thus a single round), and therefore generalizes the feedback combinator of Synchronous Languages such as Lustre [31].
Proposition 30 (Feedback Composition). Let A and B be monoids, ⟨β, µ⟩ : STrans(A, B) be a transduction, G : G(A, B) be a transducer, and r = (r₁, r₂) be a splitter for A (see Example 13).
(1) Implementation: If G implements ⟨β, µ⟩, then Loop(G, r) implements loop(β, µ, r).
(2) Coherence: If G is coherent, then so is Loop(G, r).
Proof. We leave to the reader the proofs that Split (Example 18) implements split and that Flatten (Example 17) implements flatten. Given Proposition 29, it suffices to show that G′ = LoopB(G) implements ⟨γ, ν⟩ = loopB(β, µ). Since G′ is of type G(FSeq(A), FSeq(B)) it suffices to define the transition and output functions on singleton sequences (as done in Table 1), because there is a unique way to extend them so that G′ is coherent. It remains to show that ⟦G′⟧(x̄) = F(γ, ν)(x̄) for every x̄ ∈ FSeq(A)*. The base case is easy, and for the step case it suffices to show that out′(gnext′(init′, x̄), y) = ν(π(x̄), y) for every x̄ ∈ FSeq(A)* and y ∈ FSeq(A). As we discussed before, gnext′ and out′ can be viewed as being defined on elements of A rather than sequences of FSeq(A), so we can equivalently prove that out′(gnext′(init′, ⟨a₁, . . . , aₙ⟩), aₙ₊₁) = ν(⟨a₁, . . . , aₙ⟩, aₙ₊₁) with each aᵢ an element of A. Given that G implements ⟨β, µ⟩, the key observation to finish the proof is gnext′(init′, ⟨a₁, . . . , aₙ⟩) = (gnext(init, ⟨(a₁, b₀), . . . , (aₙ, bₙ₋₁)⟩), bₙ), where γ(⟨a₁, . . . , aₙ⟩) = ⟨b₀, b₁, . . . , bₙ⟩.
The dataflow combinators of this section could form the basis of query language design. The StreamQRE language [10,84] and related formalisms [9,11,12,14] are based on a set of combinators for efficiently processing linearly-ordered streams (e.g., time series [3,4]). Extending a language like StreamQRE to the typed setting of stream transductions is an interesting research direction.

Algebraic Reasoning for Optimizing Transformations
Our typed denotational framework can be used to validate optimizing transformations using algebraic reasoning. This amounts to establishing that the original transducer is equivalent to the optimized one. A fundamental approach for showing equivalence of composite transducers is to establish algebraic laws between basic building blocks, and then use algebraic rewriting.
As a concrete example, consider the per-key streaming aggregation of Example 10, which is described by the transduction reduce(K, op) : STrans(FBag(K × V), FMap(K, V)), where K is the set of keys, V is the set of values, and op : V × V → V is an associative and commutative aggregation operation. Let h : K → {1, . . . , n} be a hash function for the keys, and define Kᵢʰ = h⁻¹(i) = {k ∈ K | h(k) = i} for every i. Consider two variants of the merging operation of Example 11: (1) kmerge(h) merges n input streams of types FBag(K₁ʰ × V), . . . , FBag(Kₙʰ × V) respectively into an output stream of type FBag(K × V), and (2) mmerge(h) merges n input streams of types FMap(K₁ʰ, V), . . . , FMap(Kₙʰ, V) respectively into an output stream of type FMap(K, V). We also consider the transduction ksplit(h) that partitions an input stream of type FBag(K × V) into n output substreams of types FBag(K₁ʰ × V), . . . , FBag(Kₙʰ × V). These transductions satisfy simple equations, e.g. the serial composition of ksplit(h) followed by kmerge(h) is the identity transduction on FBag(K × V). Using these equations, we can establish the following optimizing transformation for data parallelization, which is useful when processing high-rate data streams: the aggregation reduce(K, op) is equal to the serial composition of ksplit(h), followed by the parallel composition of reduce(K₁ʰ, op), . . . , reduce(Kₙʰ, op), followed by mmerge(h).
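The parallelization equation can be spot-checked concretely; the following hedged sketch takes n = 2, op = (+), bags as lists of pairs, and parity of the key as the hash function h.

```haskell
import qualified Data.Map.Strict as Map

-- reduce(K, (+)) on the whole stream.
reduceAll :: Ord k => [(k, Int)] -> Map.Map k Int
reduceAll = Map.fromListWith (+)

-- ksplit (h = parity), two parallel reduces, then mmerge; the two
-- partial maps have disjoint domains, so a plain union merges them.
parallelReduce :: [(Int, Int)] -> Map.Map Int Int
parallelReduce xs = Map.union (reduceAll evens) (reduceAll odds)
  where (evens, odds) = (filter (even . fst) xs, filter (odd . fst) xs)

-- Claim (checkable on samples): parallelReduce xs == reduceAll xs.
```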
The above equation illustrates our proposed style of reasoning for establishing the soundness of optimizing streaming transformations: (1) prove equalities between transductions using elementary set-theoretic arguments, (2) prove that the transducers (programs) implement the transductions (denotations) using induction, (3) translate the equalities between transductions into equivalences between transducers using the results of Sect. 5, and finally (4) use algebraic reasoning to establish more complex equivalences. The example of this section is simple but illustrates two key points: (1) our data types for streams (monoids) capture important invariants about the streams that enable transformations, and (2) useful program transformations can be established with denotational arguments that require an appropriate notion of transduction. This approach opens up the possibility of formally verifying the wealth of optimizing transformations that are used in stream processing systems.
The papers [54,101] describe several of them, but use informal arguments that rely on the operational intuition about streaming computations. Our approach here, on the other hand, relies on rigorous denotational arguments.
The equational axiomatizations of arrows [56] and traced monoidal categories [58] are relevant to our setting, but would require adaptation. An interesting question is whether a complete axiomatization can be provided for the basic dataflow combinators of Sect. 5, similarly to how Kleene Algebra (KA) [62,63] and its extensions [49,64,79,83] (as well as other program logics [65,66,78,80,81,82]) capture properties of imperative programs at the propositional level. We also leave for future work the development of the coalgebraic approach [96,97,98] for reasoning about the equivalence of stream transducers. We have already defined a notion of bisimulation in Sect. 4, which could give an alternative approach for proving equivalence using coinduction on the transducers.

Related Work
Sect. 1 contains several pointers to related literature for stream processing. In this section, we will focus on prior work that specifically addresses aspects of formal semantics for streaming computation.
The seminal work of Gilles Kahn [59] is exemplary in its rigorous treatment of denotational semantics for a language of deterministic dataflow graphs of independent processes, which access their input channels using blocking read statements and the output channels using nonblocking write statements. The language Lustre [31] is a synchronous restriction of Kahn's model, which introduces the semantic idea of a clock for specifying the rate of a stream. Other notable synchronous formalisms are the language Signal [21,72] and Esterel [22,28], and the synchronous dataflow graphs of [73] and [24]. These formalisms are all deterministic, in the sense that the output is determined purely by the input data. Nondeterminism creates unavoidable semantic complications [30].
The CQL language [16] is a streaming extension of a relational database language with additional constructs for time-based windowing. The denotational semantics of CQL [17] can be reconstructed and greatly simplified within our framework using the notion of stream described in Example 7 (finite time-varying multisets). There are several works that deal with the semantics of specific language constructs (e.g., windows), notions of time, punctuations and disordered streams, but do not give a mathematical description of the overall streaming computation [5,7,25,44,67,75,76,109].
The literature on Functional Reactive Programming (FRP) [34,46,47,55,68,69,93,103] is closely related to the deterministic dataflow formalisms mentioned earlier. The main abstractions in FRP are signals and event sequences, which are linearly ordered data. Processing unordered data (e.g., multisets and maps) and extracting data parallelism (e.g., the per-key aggregation of Sect. 6) require a data model that goes beyond linear orders. In particular, the axioms of arrows [56] (often used in FRP) cannot prove the soundness of the optimizing transformation of Sect. 6, which requires reasoning about multisets.
The idea of using types to classify streams has been recently explored in [85] (see also [13]), but only for a restricted class of types that correspond to partial orders. No general abstract model of computation is presented in [85], and many of the examples in this paper cannot be adequately accommodated.
The mathematical framework of coalgebras [97] has been used to describe streams [98]. One advantage of this approach is that proofs of equivalence can be given using the proof principle of coinduction [96], which in many cases offers a useful alternative to proofs by induction. This line of work mostly focuses on infinite sequences of elements, whereas here we focus on the transformation of streams of data that can be of various different forms (not just sequences).

The idea to model the input/output of automata using monoids has appeared in the algebraic theory of automata and transducers. Monoids (non-free, e.g. A* × B*) have been used to generalize automata from recognizers of languages to recognizers of relations [45], which are sometimes called rational transducers [100]. Our focus here is on (deterministic) functions, as models that recognize relations can give rise to the Brock-Ackerman anomaly [30]. The automata models (with inputs from a free monoid A*) most closely related to our stream transducers are deterministic: Mealy machines [87], Moore machines [90], sequential transducers [48,95], and sub-sequential transducers [102]. The concept of coherence that we introduce here (Definition 20) does not arise in these models, because they do not operate on input batches. An algebraic generalization of a deterministic acceptor is provided by a right monoid action δ : St × A → St (see page 231 of [100]), which satisfies the following properties for all s ∈ St and x, y ∈ A: (1) δ(s, 1) = s, and (2) δ(δ(s, x), y) = δ(s, xy). These properties look similar to (N1) and (N2) of Definition 20. They are, however, too restrictive for our stream transducers, as they would falsify Theorem 23.

Conclusion
We have presented a typed semantic framework for stream processing, based on the idea of abstracting data streams as elements of algebraic structures called monoids. Data streams are thus classified using monoids as types. Stream transformations are modeled as monotone functions, which are organized by input/output type. We have adapted the classical model of string transducers to our setting, and we have developed a general theory of streaming computation with a formal denotational semantics. The entire technical development in this paper is constructive, and therefore lends itself well to formalization in a proof assistant such as Coq [23,35,106]. Our framework can be used for the formalization of streaming models, and the validation of subtle optimizations of streaming programs (e.g., Sect. 6), such as the ones described in [54,101]. We have restricted our attention in this paper to deterministic streaming computation, in the sense that the behaviors that we model have predictable and reproducible results. Nondeterminism causes fundamental semantic difficulties [30], and it is undesirable in applications where repeatability is important.