1 Introduction

A sorting network consists of wires and comparators. A wire propagates a value, and a comparator connects two wires and rearranges the propagated values in order. Figure 1 shows an example.

There are many sorting networks. It is not easy to verify whether a sorting network correctly reorders the input. One may consider testing every permutation of distinct values. This naive method is inefficient: it requires n! test cases to verify an n-wire sorting network.

Knuth’s 0–1 principle [1] states that it is sufficient to consider sequences of 0s and 1s for verifying sorting networks. It significantly reduces the number of necessary test cases from n! to \(2^n\). This test method is referred to as two-values testing.

Fig. 1 Sorting network with four wires and five comparators
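The difference between the naive test and two-values testing can be sketched as follows. The comparator list `NETWORK` is an assumption: it is a standard five-comparator network on four wires chosen for illustration, not necessarily the one drawn in Fig. 1, and all function names are hypothetical.

```python
from itertools import permutations, product

# Assumed network: each pair (i, j) with i < j is a comparator that leaves
# the smaller value on wire i and the larger value on wire j.
NETWORK = [(0, 1), (2, 3), (0, 2), (1, 3), (1, 2)]

def run_network(network, values):
    """Apply the comparators in order and return the final wire values."""
    wires = list(values)
    for i, j in network:
        if wires[i] > wires[j]:
            wires[i], wires[j] = wires[j], wires[i]
    return wires

def sorts_all_permutations(network, n):
    """Naive check: n! test cases."""
    return all(run_network(network, p) == sorted(p)
               for p in permutations(range(n)))

def sorts_all_bit_vectors(network, n):
    """Two-values check justified by the 0-1 principle: 2**n test cases."""
    return all(run_network(network, bits) == sorted(bits)
               for bits in product((0, 1), repeat=n))
```

Dropping the last comparator breaks the network, and both checks report it; the second one does so with 16 inputs instead of 24.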

This paper investigates the prefix sum (also known as scan, specifically in the functional programming community). Given an associative binary operator \(\oplus \) and a series of values, \([x_0, x_1, \ldots , x_{n-1}]\), the prefix sum calculates the sum of every prefix: \([x_0, x_0 \oplus x_1, x_0 \oplus x_1 \oplus x_2, \ldots , x_0 \oplus x_1 \oplus \cdots \oplus x_{n-1}]\). It is especially important in the context of parallel computing [2], and its hardware-level implementations are called prefix sum networks (Fig. 2).

Similar to sorting, there are many variations of the prefix sum. The minimum-work algorithm requires only \(n-1\) calculations of \(\oplus \) for an input of size n but is entirely sequential (Fig. 2, left). Parallel algorithms (e.g., Fig. 2, right) must handle the trade-off between the amount of work and parallel scalability. It is nontrivial to develop efficient and correct prefix-sum algorithms, and several studies [3,4,5,6,7,8] have proposed methods for supporting their design and verification.

Fig. 2 Prefix sum networks with eight wires, where a diagonal line denotes the accumulation of the value on the lower wire into that on the upper one. The left network is the minimum-work sequential one, and the right network has more parallelism
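The work/parallelism trade-off can be illustrated with a small sketch. The recursive-doubling scheme below is a common textbook parallel scan and is used here as an assumption; it is not claimed to be the right-hand network of Fig. 2. The helper `counting` is hypothetical and only counts applications of the operator.

```python
import operator

def scan_sequential(op, xs):
    """Minimum-work prefix sum: n - 1 applications of op, fully sequential."""
    out = list(xs[:1])
    for x in xs[1:]:
        out.append(op(out[-1], x))
    return out

def scan_doubling(op, xs):
    """Recursive-doubling prefix sum: about log2(n) rounds. Within a round
    the applications of op are independent and could run in parallel."""
    out = list(xs)
    step = 1
    while step < len(out):
        prev = list(out)
        for i in range(step, len(out)):
            # prev[i - step] summarizes the elements just left of prev[i].
            out[i] = op(prev[i - step], prev[i])
        step *= 2
    return out

def counting(op):
    """Wrap op so that the number of applications can be inspected."""
    def wrapped(a, b):
        wrapped.calls += 1
        return op(a, b)
    wrapped.calls = 0
    return wrapped
```

With string concatenation (associative but not commutative) on eight elements, both functions return the same prefixes; the sequential version applies the operator 7 times and the doubling version 17 times, in exchange for only three rounds.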

Voigtländer [4] focused on the similarity between sorting networks and prefix-sum networks and proved that prefix-sum algorithms obey a 0–1–2 principle. We can verify their correctness by three-values testing, namely by testing arbitrary combinations of 0s, 1s, and 2s, and associative binary operators over them. This enables us to avoid testing arbitrary sequences.

He raised the question of whether two-values testing is sufficient if the binary operator is idempotent in addition to associative. This paper answers his question. In fact, idempotency with associativity is insufficient: there is an incorrect implementation whose flaw cannot be detected by two-values testing (Sect. 3). Two-values testing is sufficient if the binary operator is commutative in addition to associative and idempotent (Sect. 4).

2 Voigtländer’s 0–1–2 Principle

2.1 Prefix Sum

Following Voigtländer [4], this paper uses notations inspired by the Haskell [9] functional programming language. The parentheses for function applications are omitted; hence, f x denotes an application of function f to its argument x. Multi-parameter functions are written in the curried form: f x y is an application of function f to two arguments, x and y. Binary operators are variants of two-parameter functions: writing \(x \oplus y\) is equivalent to \((\oplus )~x~y\). Function applications precede operators; thus, \(x \oplus f~y\) means \(x \oplus (f~y)\).

A value x of type A is denoted as \(x {:}{:} A\). The type of function f that takes a value of type A and returns one of type B is denoted as \(f {:}{:} A \rightarrow B\). If f takes two values of type \(A_1\) and \(A_2\) in this order, its type is \(f {:}{:} A_1 \rightarrow A_2 \rightarrow B\).

The type of a list with elements having type A is denoted as [A]. Given a list \(x = [x_0,x_1,\ldots , x_{n-1}]\), the mth element of x (\(0 \le m \le n-1\)) is \(x \mathbin {!!}m = x_m\). Note that the first element is \(x \mathbin {!!}0\). Function \( map \) applies the given function to every element in the given list: \( map ~g~[x_0,x_1,\ldots , x_{n-1}] = [g~x_0, g~x_1, \ldots , g~x_{n-1}]\).

Throughout this paper, only programs that terminate without raising any errors are considered.

Following the Haskell standard library, the prefix-sum function is called \( scanl \). Its type is given below.

$$\begin{aligned} scanl {:}{:} \forall \alpha .~(\alpha \rightarrow \alpha \rightarrow \alpha ) \rightarrow [\alpha ] \rightarrow [\alpha ]\end{aligned}$$

It takes two parameters, a binary operator and a list. The type \(\alpha \) is universally quantified by \(\forall \alpha \) because \( scanl \) only provides a computation pattern (in other words, the shape of a network) and relies on the binary operator for the actual summarization.

The operator passed to a prefix-sum function is assumed to be associative. Associativity is necessary for its efficient parallel implementation. Idempotency and commutativity have also been exploited for better parallel implementations. Idempotency permits somewhat redundant computations and thus makes it simpler to deal with exceptional cases. Lynch and Swartzlander [10] developed a redundant parallel adder by taking idempotency into account. Commutativity is particularly useful for summarizing more than two inputs because it enables us to disregard the order of input elements and brings more flexibility to the scheduling of computations. Beaumont-Smith and Lim [11] studied parallel prefix sum networks with such many-input operators.

Associative commutative idempotent operators include the disjunction and conjunction (\(\vee \) and \(\wedge \)) on Boolean values or bit-vectors, binary maximum and minimum operators on numerals, and union and intersection on sets. Associative commutative (but not idempotent) operators include the exclusive-or on Boolean values or bit-vectors, addition and multiplication on numerals, and union on multisets. Associative idempotent (but not commutative) operators include the “left” operator (i.e., \(\triangleleft \) such that \(x \mathbin {\triangleleft } y = x\)) and “right” operator. The prefix-sum computation with these operators appears in several application domains. For example, the visibility testing problem (called “line-of-sight”) and quicksort contain prefix-sum computations with the binary maximum operator and left operator, respectively [2].
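The three algebraic properties can be checked by brute force on a small finite domain. The sketch below is only a sanity check under that assumption (passing on a finite sample is evidence, not a proof for the whole carrier set); the helper names are hypothetical.

```python
from itertools import product

def is_associative(op, domain):
    return all(op(op(a, b), c) == op(a, op(b, c))
               for a, b, c in product(domain, repeat=3))

def is_commutative(op, domain):
    return all(op(a, b) == op(b, a) for a, b in product(domain, repeat=2))

def is_idempotent(op, domain):
    return all(op(a, a) == a for a in domain)

def left(x, y):
    """The "left" operator: always returns its first argument."""
    return x

def xor(x, y):
    """Exclusive-or on integers viewed as bit-vectors."""
    return x ^ y

def classify(op, domain):
    """Return (associative, commutative, idempotent) as three booleans."""
    return (is_associative(op, domain),
            is_commutative(op, domain),
            is_idempotent(op, domain))
```

On the domain `range(4)`, `classify(max, ...)` reports all three properties, `classify(xor, ...)` reports associativity and commutativity only, and `classify(left, ...)` reports associativity and idempotency only, matching the three groups listed above.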

There are many possible implementations of \( scanl \). This paper focuses on the following observational characterization.

Definition 1

Function \(f {:}{:} \forall \alpha .~(\alpha \rightarrow \alpha \rightarrow \alpha ) \rightarrow [\alpha ] \rightarrow [\alpha ]\) implements \( scanl \) for associative (associative idempotent and associative commutative idempotent) operators if the following equation holds for every associative (associative idempotent and associative commutative idempotent, respectively) binary operator \(\oplus \), every list \(x = [x_0,x_1,\ldots , x_{n-1}]\) of length n, and every natural number \(0\le m \le n-1\).

$$\begin{aligned} f~(\oplus )~x \mathbin {!!}m = x_0 \oplus x_1 \oplus \cdots \oplus x_{m} \end{aligned}$$

In the following, \( scanl \) is used as a name of a reference implementation of \( scanl \).

2.2 0–1–2 Principle

Consider the types that correspond to \(\{0,1\}\) and \(\{0,1,2\}\). Two-values testing checks whether \(f~(\oplus )~x = scanl ~(\oplus )~x\) holds for every list x over \(\{0,1\}\) and every binary operator \(\oplus \) on \(\{0,1\}\) satisfying the required algebraic properties. Similarly, three-values testing checks the equality for every list x over \(\{0,1,2\}\) and every binary operator \(\oplus \) on \(\{0,1,2\}\) satisfying the required algebraic properties. It is trivial that f passes both if f implements \( scanl \). We are interested in whether the converse holds.

Voigtländer [4] showed that two-values testing is insufficient. Consider a function f such that \(f~(\oplus )~[a_0,a_1] = [a_0, a_0 \oplus a_0 \oplus a_0 \oplus a_1]\). Clearly, f is not equivalent to \( scanl \). Nevertheless, two-values testing cannot detect this flaw: every associative operator \(\oplus \) on \(\{0,1\}\) satisfies the following equation for any \(a_0, a_1 \in \{0,1\}\):

$$\begin{aligned} a_0 \oplus a_0 \oplus a_0 \oplus a_1 = a_0 \oplus a_1. \end{aligned} \tag{1}$$

He then showed that three-values testing is sufficient: if f passes three-values testing, it implements \( scanl \) for associative operators. The proof is omitted because its details are not relevant to this paper.
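The gap between two-values and three-values testing for this counterexample can be reproduced by exhaustive enumeration. The sketch below is an illustration with hypothetical helper names: it builds every binary operator on a finite domain as a lookup table, keeps the associative ones, and compares the flawed function with a reference scan.

```python
from itertools import product

def scanl(op, xs):
    """Reference prefix sum."""
    out = list(xs[:1])
    for x in xs[1:]:
        out.append(op(out[-1], x))
    return out

def flawed(op, xs):
    """The two-element counterexample: [a0, a0 + a0 + a0 + a1]."""
    a0, a1 = xs
    return [a0, op(op(op(a0, a0), a0), a1)]

def all_operators(domain):
    """Yield every binary operator on a finite domain as a table lookup."""
    pairs = list(product(domain, repeat=2))
    for results in product(domain, repeat=len(pairs)):
        table = dict(zip(pairs, results))
        yield lambda a, b, table=table: table[(a, b)]

def is_associative(op, domain):
    return all(op(op(a, b), c) == op(a, op(b, c))
               for a, b, c in product(domain, repeat=3))

def passes(f, domain, length):
    """Compare f with scanl for every associative operator and every input."""
    return all(f(op, list(xs)) == scanl(op, list(xs))
               for op in all_operators(domain)
               if is_associative(op, domain)
               for xs in product(domain, repeat=length))
```

Here `passes(flawed, (0, 1), 2)` holds while `passes(flawed, (0, 1, 2), 2)` does not; addition modulo 3 is one operator that exposes the flaw, since \(a_0 + a_0 + a_0 + a_1\) and \(a_0 + a_1\) differ modulo 3 whenever \(a_0 \ne 0\).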

3 Prefix Sum with Associative Idempotent Operator

This paper shows that two-values testing cannot distinguish an incorrect implementation of \( scanl \) for associative idempotent operators.

Consider function f such that \(f~(\oplus )~[a_0,a_1,a_2,a_3] \mathbin {!!}3 = a_0 \oplus a_2 \oplus a_1 \oplus a_3\). This implementation is incorrect because the order of \(a_1\) and \(a_2\) is reversed. However, two-values testing cannot detect this flaw: for any associative operator \(\oplus \) on \(\{0,1\}\) and any \(a_i \in \{0,1\}\) (\(0 \le i \le 3\)), the following equation holds.

$$\begin{aligned} a_0 \oplus a_1 \oplus a_2 \oplus a_3 = a_0 \oplus a_2 \oplus a_1 \oplus a_3. \end{aligned} \tag{2}$$

Since each side of Eq. (2) uses every value exactly once, idempotency is irrelevant.

Table 1 Properties of binary operators on \(\{0,1\}\)

Table 1 lists all binary operators on \(\{0,1\}\). There are 16 operators in total, of which eight are associative and four are both associative and idempotent. From an exhaustive case analysis, we can verify that every associative idempotent operator satisfies Eq. (2).
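The counts and the exhaustive case analysis can be recomputed mechanically. The following sketch uses hypothetical helper names and encodes each operator by its result tuple for the argument pairs (0,0), (0,1), (1,0), (1,1).

```python
from itertools import product

DOMAIN = (0, 1)
PAIRS = list(product(DOMAIN, repeat=2))  # (0,0), (0,1), (1,0), (1,1)

def operators():
    """Yield all 16 binary operators on {0, 1} as (result tuple, function)."""
    for results in product(DOMAIN, repeat=4):
        table = dict(zip(PAIRS, results))
        yield results, (lambda a, b, table=table: table[(a, b)])

def is_associative(op):
    return all(op(op(a, b), c) == op(a, op(b, c))
               for a, b, c in product(DOMAIN, repeat=3))

def is_idempotent(op):
    return all(op(a, a) == a for a in DOMAIN)

def satisfies_eq2(op):
    """a0 + a1 + a2 + a3 == a0 + a2 + a1 + a3 for all inputs in {0, 1}."""
    def fold(*vals):
        acc = vals[0]
        for v in vals[1:]:
            acc = op(acc, v)
        return acc
    return all(fold(a0, a1, a2, a3) == fold(a0, a2, a1, a3)
               for a0, a1, a2, a3 in product(DOMAIN, repeat=4))

associative = [(t, op) for t, op in operators() if is_associative(op)]
assoc_idem = [(t, op) for t, op in associative if is_idempotent(op)]
```

The four associative idempotent operators found this way are conjunction, the left operator, the right operator, and disjunction, and each of them satisfies Eq. (2).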

It is also possible to give a more axiomatic proof of Eq. (2). Because it trivially holds when \(a_1 \oplus a_2 = a_2 \oplus a_1\), we assume that \(\oplus \) is not commutative. Since \(\oplus \) is idempotent, there are only two possibilities.

Case 1

\(0 \oplus 0 = 0\), \(0 \oplus 1 = 0\), \(1 \oplus 0 = 1\), and \(1 \oplus 1 = 1\).

Case 2

\(0 \oplus 0 = 0\), \(0 \oplus 1 = 1\), \(1 \oplus 0 = 0\), and \(1 \oplus 1 = 1\).

For Case 1, \(x \oplus y = x\) for any \(x, y \in \{0,1\}\); therefore, \(a_0 \oplus a_1 \oplus a_2 \oplus a_3 = a_0 = a_0 \oplus a_2 \oplus a_1 \oplus a_3\). For Case 2, \(x \oplus y = y\) for any \(x, y \in \{0,1\}\); hence, \(a_0 \oplus a_1 \oplus a_2 \oplus a_3 = a_3 = a_0 \oplus a_2 \oplus a_1 \oplus a_3\).
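For contrast, a third value is enough to expose the swapped implementation. The operator `band` below is an assumption introduced for illustration: it is the left operator on \(\{0,1\}\) with 2 adjoined as an identity element, which is associative and idempotent but not commutative. All names are hypothetical.

```python
from itertools import product

def scanl(op, xs):
    """Reference prefix sum."""
    out = list(xs[:1])
    for x in xs[1:]:
        out.append(op(out[-1], x))
    return out

def swapped(op, xs):
    """Flawed four-element scan: the last output is a0 + a2 + a1 + a3."""
    a0, a1, a2, a3 = xs
    out = scanl(op, xs)
    out[3] = op(op(op(a0, a2), a1), a3)
    return out

def left(x, y):
    """The "left" operator on any domain."""
    return x

def band(x, y):
    """Left operator on {0, 1} with 2 acting as an identity element."""
    if x == 2:
        return y
    return x  # covers both y == 2 (identity) and the left-operator case

def first_difference(f, op, domain):
    """Return the first input on which f and scanl disagree, or None."""
    for xs in product(domain, repeat=4):
        if f(op, list(xs)) != scanl(op, list(xs)):
            return list(xs)
    return None
```

With `left` on `(0, 1)` no difference is found, whereas `band` on `(0, 1, 2)` yields the input `[2, 0, 1, 0]`: the correct result at position 3 is 0 and the swapped one is 1.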

4 Prefix Sum with Associative Commutative Idempotent Operator

If the binary operator satisfies commutativity in addition to associativity and idempotency, two-values testing is sufficient. The following proof takes the same approach as those of Knuth’s 0–1 principle by Day et al. [12] and 0–1–2 principle by Voigtländer [4].

The proof is based on the intensive use of relational parametricity [13] (also known as Wadler’s free theorem [14]) for polymorphic types. In particular, the following lemma is the key.

Lemma 1

For any function \(f {:}{:} \forall \alpha .~(\alpha \rightarrow \alpha \rightarrow \alpha ) \rightarrow [\alpha ] \rightarrow [\alpha ]\), binary operators \((\oplus ) {:}{:} A \rightarrow A \rightarrow A\) and \((\oplus ') {:}{:} B \rightarrow B \rightarrow B\), list \(x = [x_0,x_1,\ldots , x_{n-1}] {:}{:} [A]\), and function \(g {:}{:} A \rightarrow B\), the following equation holds,

$$\begin{aligned} map ~g~(f~(\oplus )~x) = f~(\oplus ')~( map ~g~x) \end{aligned}$$

provided that \(g~(a_1 \oplus a_2) = g~a_1 \oplus ' g~a_2\) for every \(a_1\) and \(a_2\) of type A.

Proof

It is an immediate consequence of the relational parametricity applied to the polymorphic type of f. \(\square \)
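A small sanity check of the commutation property that parametricity yields for this type, \( map ~g~(f~(\oplus )~x) = f~(\oplus ')~( map ~g~x)\), is sketched below. Python does not enforce parametricity, so this only illustrates the statement on functions that use their operator argument as a black box; `wrap_around` and the other names are hypothetical.

```python
def scanl(op, xs):
    """Reference prefix sum."""
    out = list(xs[:1])
    for x in xs[1:]:
        out.append(op(out[-1], x))
    return out

def wrap_around(op, xs):
    """An arbitrary non-scan function of the same polymorphic shape."""
    return [op(xs[-1], xs[0])] + list(xs)

def commutes(f, g, op_a, op_b, xs):
    """Check map g (f op_a xs) == f op_b (map g xs) on one input."""
    return [g(v) for v in f(op_a, xs)] == f(op_b, [g(v) for v in xs])

union = lambda s, t: s | t
disj = lambda p, q: p or q
plus = lambda m, n: m + n
has_two = lambda s: 2 in s   # satisfies g(s | t) == (g(s) or g(t))
size = len                   # does NOT satisfy len(s | t) == len(s) + len(t)

xs = [frozenset({0, 1}), frozenset({1, 2}), frozenset({3})]
```

The membership test `has_two` satisfies the side condition, so the equation holds for both `scanl` and `wrap_around`. The function `size` violates the side condition on overlapping sets, and the equation indeed fails for it on `xs`.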

The sufficiency of two-values testing can be formulated as the following theorem. Note that this paper shows a stronger argument: it is sufficient to test only the binary disjunction \((\vee )\) defined below.

Theorem 2

Function \(f {:}{:} \forall \alpha .~(\alpha \rightarrow \alpha \rightarrow \alpha ) \rightarrow [\alpha ] \rightarrow [\alpha ]\) implements \( scanl \) for associative commutative idempotent operators if and only if \(f~(\vee )~x = scanl ~(\vee )~x\) for every list x over \(\{0,1\}\).

Proof

The “only if” direction is trivial. The proof of the “if” direction consists of the following two propositions.

Proposition 1

f implements \( scanl \) for associative commutative idempotent operators if \(f~(\cup )~x = scanl ~(\cup )~x\) for every x, where x is a list of sets and \((\cup )\) is the set union operator.

Proposition 2

If \(f~(\cup )~x \ne scanl ~(\cup )~x\) for some list x of sets, there exists a list y over \(\{0,1\}\) such that \(f~(\vee )~y \ne scanl ~(\vee )~y\).

Proposition 1 is an instance of a more general theory for testing polymorphic functions [15, 16]. It is proven here to make the presentation self-contained. The strategy is to show that the premise of Proposition 1 implies \(f~(\oplus )~x = scanl ~(\oplus )~x\) for any associative, commutative, and idempotent \(\oplus \). In the following reasoning, \((\circ )\) denotes the function composition operator: \((h_1 \circ h_2)~z = h_1~(h_2~z)\).

figure a

The definition of the function that plays the role of g in Lemma 1 assumes the associativity and commutativity of \(\oplus \). Moreover, the use of Lemma 1 exploits idempotency: if \(\oplus \) is not idempotent, the side condition of Lemma 1 does not hold when the sets \(x_1\) and \(x_2\) share some elements.

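Next, Proposition 2 is proven. Assume that \(f~(\cup )~x\) and \( scanl ~(\cup )~x\) differ at some position m, and let \(r_1\) and \(r_2\) denote the two distinct sets found there. Then, there exists \(b^* \in (r_1 \cup r_2)\) such that \(b^* \not \in (r_1 \cap r_2)\). Define the list y and an auxiliary function on sets as follows.

figure b

Since only one of \(r_1\) and \(r_2\) contains \(b^*\), the auxiliary function maps \(r_1\) and \(r_2\) to different values. The following routine reasoning proves \(f~(\vee )~y \ne scanl ~(\vee )~y\).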

figure c

\(\square \)
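Theorem 2 suggests a simple test harness: compare a candidate with a reference scan under disjunction on all \(2^n\) lists over \(\{0,1\}\). The sketch below is a minimal illustration for one fixed length n; `reversed_scan` and `skipping` are hypothetical candidates, not implementations from the literature.

```python
from functools import reduce
from itertools import product

def scanl(op, xs):
    """Reference prefix sum."""
    out = list(xs[:1])
    for x in xs[1:]:
        out.append(op(out[-1], x))
    return out

def reversed_scan(op, xs):
    """Combines each prefix right to left; agrees with scanl only when
    op is commutative (in addition to associative)."""
    return [reduce(op, reversed(xs[:m + 1])) for m in range(len(xs))]

def skipping(op, xs):
    """Flawed candidate: output 2 forgets x1."""
    out = scanl(op, xs)
    if len(out) > 2:
        out[2] = op(xs[0], xs[2])
    return out

def passes_two_values_testing(f, n):
    """Compare f with scanl under disjunction on every 0-1 list of length n."""
    disj = lambda a, b: a | b
    return all(f(disj, list(y)) == scanl(disj, list(y))
               for y in product((0, 1), repeat=n))
```

For n = 4, `reversed_scan` passes and, as the theorem predicts, also agrees with `scanl` for `max`; `skipping` is rejected, for example by the input `[0, 1, 0, 0]`. With string concatenation, which is not commutative, `reversed_scan` is of course wrong.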

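One might wonder whether two-values testing suffices if the binary operator is associative and commutative but not idempotent. It does not. The function f of Sect. 2.2 remains a counterexample: Eq. (1) holds for every associative operator on \(\{0,1\}\), whether commutative or not, whereas it fails in general for associative commutative operators such as addition on integers.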

5 Related Work

Voigtländer [4] showed that test cases are sufficient for verifying a prefix sum for associative operators with n input elements. In fact, n test cases are sufficient when the operator is associative, commutative, and idempotent. The proof of Theorem 2 essentially checks whether a particular element, \(b^*\), is in a certain position, m, by setting 1 on the element corresponding to \(b^*\) and 0 on others.

There are a few other studies on testing-based verification of prefix-sum algorithms. Sheeran [5] proposed a test method similar to Proposition 1 in the proof of Theorem 2: preparing labels that stand for the elements and then checking whether the output consists of appropriate sets of labels. It requires only a single test, the input and output of which, respectively, consume \(\mathrm {\Theta }(n \log n)\) and \(\mathrm {\Theta }(n^2 \log n)\) space, where n is the number of input elements. It is applicable even when the operator is commutative and/or idempotent.

Chong et al. [6] improved Sheeran’s method by using intervals instead of sets of labels. For example, given a list \([x_0,x_1,x_2,x_3]\), their approach represents \(x_0 \oplus x_1 \oplus x_2 \oplus x_3\) by an interval, namely, a pair of the first and last indices, (0, 3). The output of their method consumes only \(\mathrm {\Theta }(n \log n)\) space. However, we cannot use it when the operator \(\oplus \) is commutative in addition to associative. For example, if we calculate \(x_0 \oplus x_1 \oplus x_2 \oplus x_3\) by \((x_0 \oplus x_2) \oplus (x_1 \oplus x_3)\), neither \(x_0 \oplus x_2\) nor \(x_1 \oplus x_3\) corresponds to any interval.
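The interval idea and its limitation can be sketched as follows. This is a simplified illustration based only on the description above, not the actual method or tool of Chong et al.; `join`, `check_with_intervals`, and `interleaved` are hypothetical names.

```python
def scanl(op, xs):
    """Reference prefix sum."""
    out = list(xs[:1])
    for x in xs[1:]:
        out.append(op(out[-1], x))
    return out

def join(a, b):
    """Combine two adjacent index intervals (lo, hi); reject anything else."""
    if a[1] + 1 != b[0]:
        raise ValueError(f"{a} and {b} are not adjacent intervals")
    return (a[0], b[1])

def check_with_intervals(f, n):
    """Single symbolic test: output m must be the interval (0, m)."""
    try:
        out = f(join, [(i, i) for i in range(n)])
    except ValueError:
        return False
    return out == [(0, m) for m in range(n)]

def interleaved(op, xs):
    """Four-element scan whose last output is (x0 + x2) + (x1 + x3);
    valid only when op is commutative."""
    out = scanl(op, xs)
    out[3] = op(op(xs[0], xs[2]), op(xs[1], xs[3]))
    return out
```

The reference scan passes the interval check, while `interleaved` is rejected because \(x_0 \oplus x_2\) is not an interval, even though `interleaved` computes the right result for a commutative operator such as `max`.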

This paper has developed a verification with n test cases for a prefix sum for associative commutative idempotent operators. By merging them into one with n-bit vectors, one can conduct an equivalent verification by a single test, the input and output of which require \(\mathrm {\Theta }(n^2)\) space. The efficiency of this approach is incomparable with Sheeran’s.
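The merged single test can be sketched with Python integers standing in for n-bit vectors and bitwise OR standing in for the element-wise disjunction. The encoding below (input element i has only bit i set) is one natural reading of the merging described above and is stated here as an assumption; the function names are hypothetical.

```python
def scanl(op, xs):
    """Reference prefix sum."""
    out = list(xs[:1])
    for x in xs[1:]:
        out.append(op(out[-1], x))
    return out

def check_with_bit_vectors(f, n):
    """Single test with bitwise OR on n-bit integers. Input element i has
    only bit i set; output element m must have exactly bits 0..m set."""
    bit_or = lambda a, b: a | b
    out = f(bit_or, [1 << i for i in range(n)])
    return out == [(1 << (m + 1)) - 1 for m in range(n)]

def interleaved(op, xs):
    """Four-element scan that reorders operands; fine for commutative op."""
    out = scanl(op, xs)
    out[3] = op(op(xs[0], xs[2]), op(xs[1], xs[3]))
    return out

def skipping(op, xs):
    """Flawed candidate: output 2 forgets x1."""
    out = scanl(op, xs)
    if len(out) > 2:
        out[2] = op(xs[0], xs[2])
    return out
```

Bit k of every vector replays the test case in which only element k is 1, so the n one-hot tests run simultaneously. The check accepts `scanl` and `interleaved` and rejects `skipping`, whose output at position 2 lacks bit 1.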

6 Conclusion

This paper proved that two-values testing is sufficient for prefix sum algorithms with associative, commutative, and idempotent operators but not for those with operators that are only associative and idempotent. It thereby answered the question raised by Voigtländer [4]. These results deepen the understanding of Voigtländer’s 0–1–2 principle and may lead to similar principles for other computation patterns.