Abstract
Knuth’s 0–1 principle argues that the correctness of any swap-based sorting network can be verified by testing arbitrary sequences over Boolean values (i.e., 0 and 1). Voigtländer (Proceedings of the 35th ACM SIGPLAN-SIGACT symposium on principles of programming languages, POPL 2008, San Francisco, California, USA, January 7–12, 2008. ACM, New York, NY, pp 29–35, 2008. https://doi.org/10.1145/1328438.1328445) proved a similar result for prefix-sum networks that consist of associative binary operators: the correctness can be verified by testing arbitrary sequences and associative binary operators over three values, namely 0, 1, and 2. He raised the question of whether testing over Boolean values is sufficient if the binary operator is idempotent in addition to associative. This paper answers his question. First, there is an incorrect prefix-sum network for associative idempotent operators, the flaw of which cannot be detected by testing over Boolean values. Second, testing over Boolean values is sufficient if the binary operators are restricted to commutative in addition to associative and idempotent.
1 Introduction
A sorting network consists of wires and comparators. A wire propagates a value, and a comparator connects two wires and rearranges the propagated values in order. Figure 1 shows an example.
There are many sorting networks. It is not easy to verify whether a sorting network correctly reorders the input. One may consider testing every permutation of different values. This naive method is inefficient and requires n! test cases for verifying an n-wire sorting network.
Knuth’s 0–1 principle [1] argues that it is sufficient to consider sequences of 0s and 1s for verifying sorting networks. It significantly reduces the number of necessary test cases from n! to \(2^n\). This test method is referred to as two-values testing.
This paper investigates the prefix sum (also known as scan, specifically in the functional programming community). Given an associative binary operator \(\oplus \) and a series of values, \([x_0, x_1, \ldots , x_{n-1}]\), the prefix sum calculates the sum of every prefix: \([x_0, x_0 \oplus x_1, x_0 \oplus x_1 \oplus x_2, \ldots , x_0 \oplus x_1 \oplus \cdots \oplus x_{n-1}]\). It is especially important in the context of parallel computing [2], and its hardware-level implementations are called prefix-sum networks (Fig. 2).
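The definition above can be sketched in a few lines. The following is a minimal Python illustration rather than the paper's Haskell-style notation; the name `scanl` mirrors the paper's reference function, and `left` is a hypothetical helper for the "left" operator.

```python
from itertools import accumulate
from operator import add


def scanl(op, xs):
    """Reference prefix sum: output m is x_0 op x_1 op ... op x_m."""
    # accumulate folds from the left, which matches the definition directly.
    return list(accumulate(xs, op))


def left(x, y):
    """The "left" operator: always returns its first argument."""
    return x


sums = scanl(add, [1, 2, 3, 4])          # running sums
maxima = scanl(max, [3, 1, 4, 1, 5])     # running maxima
firsts = scanl(left, ["a", "b", "c"])    # every prefix collapses to x_0
```

This sequential version performs \(n-1\) applications of the operator, which corresponds to the minimum-work network.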
Similar to sorting, there are many variations for the prefix sum. The minimum-work algorithm requires only \(n-1\) calculations of \(\oplus \) for an input of size n but is perfectly sequential (Fig. 2, left). Parallel algorithms (e.g., Fig. 2, right) should handle the trade-off between the amount of work and parallel scalability. It is nontrivial to develop efficient and correct prefix-sum algorithms, and several studies [3,4,5,6,7,8] proposed methods for supporting their design and verification.
Voigtländer [4] focused on the similarity between sorting networks and prefix-sum networks and proved that prefix-sum algorithms satisfy a 0–1–2 principle. We can verify their correctness by three-values testing, namely by testing arbitrary combinations of 0s, 1s, and 2s, and associative binary operators over them. This enables us to avoid testing arbitrary sequences.
He raised the question of whether two-values testing is sufficient if the binary operator is idempotent in addition to associative. This paper answers his question. In fact, idempotency with associativity is insufficient: there is an incorrect implementation for which we cannot detect its flaw by two-values testing (Sect. 3). Two-values testing is sufficient if the binary operator is commutative in addition to associative and idempotent (Sect. 4).
2 Voigtländer’s 0–1–2 Principle
2.1 Prefix Sum
Following Voigtländer [4], this paper uses notations inspired by the Haskell [9] functional programming language. The parentheses for function applications are omitted; hence, f x denotes an application of function f to its argument x. Multi-parameter functions are written in the curried form: f x y is an application of function f to two arguments, x and y. Binary operators are variants of two-parameter functions: writing \(x \oplus y\) is equivalent to \((\oplus )~x~y\). Function applications precede operators; thus, \(x \oplus f~y\) means \(x \oplus (f~y)\).
A value x of type A is denoted as \(x {:}{:} A\). The type of function f that takes a value of type A and returns one of type B is denoted as \(f {:}{:} A \rightarrow B\). If f takes two values of type \(A_1\) and \(A_2\) in this order, its type is \(f {:}{:} A_1 \rightarrow A_2 \rightarrow B\).
The type of a list with elements having type A is denoted as [A]. Given a list \(x = [x_0,x_1,\ldots , x_{n-1}]\), the mth element of x (\(0 \le m \le n-1\)) is \(x \mathbin {!!}m = x_m\). Note that the first element is \(x \mathbin {!!}0\). Function \( map \) applies the given function to every element in the given list: \( map ~f~[x_0,x_1,\ldots , x_{n-1}] = [f~x_0, f~x_1, \ldots , f~x_{n-1}]\).
Throughout this paper, only programs that terminate without raising any errors are considered.
Following the Haskell standard library, the prefix-sum function is called \( scanl \). Its type is given below (Footnote 1).
\[ scanl {:}{:} \forall \alpha .~(\alpha \rightarrow \alpha \rightarrow \alpha ) \rightarrow [\alpha ] \rightarrow [\alpha ]\]
It takes two parameters, a binary operator and a list. The type \(\alpha \) is universally quantified by \(\forall \alpha \) because \( scanl \) only provides a computation pattern (in other words, the shape of a network) and relies on the binary operator for the actual summarization.
The operator passed to a prefix-sum function is assumed to be associative. Associativity is necessary for its efficient parallel implementation. Idempotency and commutativity have also been exploited for better parallel implementations. Idempotency permits somewhat redundant computations and thus makes it simpler to deal with exceptional cases. Lynch and Swartzlander [10] developed a redundant parallel adder by taking idempotency into account. Commutativity is particularly useful for summarizing more than two inputs because it enables us to disregard the order of input elements and brings more flexibility to the scheduling of computations. Beaumont-Smith and Lim [11] studied parallel prefix-sum networks with such many-input operators.
Associative commutative idempotent operators include the disjunction and conjunction (\(\vee \) and \(\wedge \)) on Boolean values or bit-vectors, binary maximum and minimum operators on numerals, and union and intersection on sets. Associative commutative (but not idempotent) operators include the exclusive-or on Boolean values or bit-vectors, addition and multiplication on numerals, and union on multisets. Associative idempotent (but not commutative) operators include the “left” operator (i.e., \(\triangleleft \) such that \(x \mathbin {\triangleleft } y = x\)) and “right” operator. The prefix-sum computation with these operators appears in several application domains. For example, the visibility testing problem (called “line-of-sight”) and quicksort contain prefix-sum computations with the binary maximum operator and left operator, respectively [2].
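The three classes of operators can be sanity-checked by brute force on a small domain. The sketch below is only an illustration under assumptions: the helper names and the four-element domain are choices made here, and a check on a finite sample is not a proof of the general property.

```python
from itertools import product


def is_associative(op, dom):
    return all(op(op(x, y), z) == op(x, op(y, z))
               for x, y, z in product(dom, repeat=3))


def is_commutative(op, dom):
    return all(op(x, y) == op(y, x) for x, y in product(dom, repeat=2))


def is_idempotent(op, dom):
    return all(op(x, x) == x for x in dom)


def classify(op, dom):
    """Return (associative, commutative, idempotent) for op restricted to dom."""
    return (is_associative(op, dom),
            is_commutative(op, dom),
            is_idempotent(op, dom))


DOM = range(4)  # assumed sample domain; closed under all three operators below

maximum_class = classify(max, DOM)                  # binary maximum
xor_class = classify(lambda x, y: x ^ y, DOM)       # exclusive-or on 2-bit vectors
left_class = classify(lambda x, y: x, DOM)          # the "left" operator
```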
There are many possible implementations of \( scanl \). This paper focuses on the following observational characterization.
Definition 1
Function \(f {:}{:} \forall \alpha .~(\alpha \rightarrow \alpha \rightarrow \alpha ) \rightarrow [\alpha ] \rightarrow [\alpha ]\) implements \( scanl \) for associative (associative idempotent or associative commutative idempotent) operators if the following equation holds for every associative (associative idempotent or associative commutative idempotent, respectively) binary operator \(\oplus \), every list \(x = [x_0,x_1,\ldots , x_{n-1}]\) of length n, and every natural number \(0\le m \le n-1\).
\[f~(\oplus )~x \mathbin {!!}m = x_0 \oplus x_1 \oplus \cdots \oplus x_m\]
In the following, \( scanl \) is used as a name of a reference implementation of \( scanl \).
2.2 0–1–2 Principle
Consider the two types that correspond to \(\{0,1\}\) and \(\{0,1,2\}\). Two-values testing checks whether \(f~(\oplus )~x = scanl ~(\oplus )~x\) holds for arbitrary (Footnote 2) lists x over \(\{0,1\}\) and binary operators \(\oplus \) on \(\{0,1\}\) satisfying the required algebraic properties. Similarly, three-values testing checks the equality for arbitrary lists x over \(\{0,1,2\}\) and binary operators \(\oplus \) on \(\{0,1,2\}\) satisfying the required algebraic properties. It is trivial that f will pass both if f implements \( scanl \). We are interested in whether the converse holds.
Voigtländer [4] showed that two-values testing is insufficient. Consider a function f such that \(f~(\oplus )~[a_0,a_1] = [a_0, a_0 \oplus a_0 \oplus a_0 \oplus a_1]\). Clearly, f is not equivalent to \( scanl \). Nevertheless, two-values testing cannot detect this flaw: every associative operator \(\oplus \) on \(\{0,1\}\) satisfies the following equation for any \(a_0, a_1 \in \{0,1\}\):
\[a_0 \oplus a_1 = a_0 \oplus a_0 \oplus a_0 \oplus a_1 \qquad (1)\]
He then showed that three-values testing is sufficient: if f passes three-values testing, it implements \( scanl \) for associative operators. The proof is omitted because its details are not relevant to this paper.
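The gap between two-values and three-values testing can be observed by brute force. The sketch below is a hedged illustration, not the paper's method: it enumerates every associative operator on a k-element domain and compares the flawed function from the text against the reference on all lists of a given length. The names `flawed`, `associative_ops`, and `passes` are hypothetical, and `flawed` is defined only for two-element lists.

```python
from itertools import accumulate, product


def scanl(op, xs):
    """Reference prefix sum."""
    return list(accumulate(xs, op))


def flawed(op, xs):
    """The incorrect function from the text: [a0, a0 + a0 + a0 + a1]."""
    a0, a1 = xs
    return [a0, op(op(op(a0, a0), a0), a1)]


def associative_ops(k):
    """Yield every associative binary operator on {0, ..., k-1}."""
    dom = range(k)
    for table in product(dom, repeat=k * k):
        # Bind the table as a default argument so each lambda keeps its own.
        op = lambda x, y, t=table: t[x * k + y]
        if all(op(op(x, y), z) == op(x, op(y, z))
               for x, y, z in product(dom, repeat=3)):
            yield op


def passes(f, k, n):
    """k-values testing of f on all lists of length n."""
    return all(f(op, list(xs)) == scanl(op, list(xs))
               for op in associative_ops(k)
               for xs in product(range(k), repeat=n))


two_values_ok = passes(flawed, 2, 2)     # the flaw goes unnoticed
three_values_ok = passes(flawed, 3, 2)   # the flaw is detected
```

One operator that exposes the flaw over three values is addition modulo 3, for which \(a_0 \oplus a_0 \oplus a_0\) is always 0.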
3 Prefix Sum with Associative Idempotent Operator
This paper shows that two-values testing cannot distinguish an incorrect implementation of \( scanl \) for associative idempotent operators.
Consider function f such that \(f~(\oplus )~[a_0,a_1,a_2,a_3] \mathbin {!!}3 = a_0 \oplus a_2 \oplus a_1 \oplus a_3\). This implementation is incorrect because the order of \(a_1\) and \(a_2\) is reversed (Footnote 3). However, two-values testing cannot detect this flaw: for any associative operator \(\oplus \) on \(\{0,1\}\) and \(a_i \in \{0,1\}\) (\(0 \le i \le 3\)), the following equation holds.
\[a_0 \oplus a_1 \oplus a_2 \oplus a_3 = a_0 \oplus a_2 \oplus a_1 \oplus a_3 \qquad (2)\]
Since the expression on each side uses every value at most once, idempotency is irrelevant.
Table 1 lists all binary operators on \(\{0,1\}\). There are 16 operators in total, of which eight are associative and four are both associative and idempotent. From an exhaustive case analysis, we can verify that every associative idempotent operator satisfies Eq. (2).
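The exhaustive case analysis is small enough to run mechanically. The following sketch, with names assumed here for illustration, enumerates the 16 truth tables on \(\{0,1\}\), filters them by associativity and idempotency, and checks that swapping \(a_1\) and \(a_2\) in a four-element sum makes no difference.

```python
from itertools import product

DOM = (0, 1)


def all_ops():
    """Yield all 16 binary operators on {0, 1}, one per truth table."""
    for table in product(DOM, repeat=4):
        yield lambda x, y, t=table: t[2 * x + y]


def is_associative(op):
    return all(op(op(x, y), z) == op(x, op(y, z))
               for x, y, z in product(DOM, repeat=3))


def is_idempotent(op):
    return all(op(x, x) == x for x in DOM)


def swap_is_invisible(op):
    """Check a0+a1+a2+a3 == a0+a2+a1+a3 for all inputs over {0, 1}."""
    return all(op(op(op(a0, a1), a2), a3) == op(op(op(a0, a2), a1), a3)
               for a0, a1, a2, a3 in product(DOM, repeat=4))


ops = list(all_ops())
assoc = [op for op in ops if is_associative(op)]
assoc_idem = [op for op in assoc if is_idempotent(op)]
swap_undetected = all(swap_is_invisible(op) for op in assoc_idem)
```

The four associative idempotent operators found this way are conjunction, disjunction, "left", and "right".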
It is possible to provide another, more axiomatic proof of Eq. (2). Because it trivially holds when \(a_1 \oplus a_2 = a_2 \oplus a_1\), we assume that \(\oplus \) is not commutative. Since \(\oplus \) is idempotent, there are only two possibilities.
Case 1
\(0 \oplus 0 = 0\), \(0 \oplus 1 = 0\), \(1 \oplus 0 = 1\), and \(1 \oplus 1 = 1\).
Case 2
\(0 \oplus 0 = 0\), \(0 \oplus 1 = 1\), \(1 \oplus 0 = 0\), and \(1 \oplus 1 = 1\).
For Case 1, \(x \oplus y = x\) for any \(x, y \in \{0,1\}\); therefore, \(a_0 \oplus a_1 \oplus a_2 \oplus a_3 = a_0 = a_0 \oplus a_2 \oplus a_1 \oplus a_3\). For Case 2, \(x \oplus y = y\) for any \(x, y \in \{0,1\}\); hence, \(a_0 \oplus a_1 \oplus a_2 \oplus a_3 = a_3 = a_0 \oplus a_2 \oplus a_1 \oplus a_3\).
4 Prefix Sum with Associative Commutative Idempotent Operator
If the binary operator satisfies commutativity in addition to associativity and idempotency, two-values testing is sufficient. The following proof takes the same approach as the proofs of Knuth’s 0–1 principle by Day et al. [12] and of the 0–1–2 principle by Voigtländer [4].
The proof is based on the intensive use of relational parametricity [13] (also known as Wadler’s free theorem [14]) for polymorphic types. In particular, the following lemma is the key.
Lemma 1
For any function \(f {:}{:} \forall \alpha .~(\alpha \rightarrow \alpha \rightarrow \alpha ) \rightarrow [\alpha ] \rightarrow [\alpha ]\), binary operators \((\oplus ) {:}{:} A \rightarrow A \rightarrow A\) and \((\oplus ') {:}{:} B \rightarrow B \rightarrow B\), list \(x = [x_0,x_1,\ldots , x_{n-1}] {:}{:} [A]\), and function \(g {:}{:} A \rightarrow B\), the following equation holds,
\[ map ~g~(f~(\oplus )~x) = f~(\oplus ')~( map ~g~x)\]
provided that \(g~(a_1 \oplus a_2) = g~a_1 \oplus ' g~a_2\) for every \(a_1, a_2 {:}{:} A\).
Proof
It is an immediate consequence of the relational parametricity applied to the polymorphic type of f. \(\square \)
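Lemma 1 can be observed on a concrete instance. The sketch below is an illustration under assumptions: `dc_scan` is a hypothetical divide-and-conquer prefix sum standing in for f, the operator on the source type is set union, the operator on the target type is disjunction on \(\{0,1\}\), and g tests membership of a fixed element, which maps unions to disjunctions.

```python
def dc_scan(op, xs):
    """A divide-and-conquer prefix sum that uses only op on its elements."""
    n = len(xs)
    if n <= 1:
        return list(xs)
    mid = n // 2
    left = dc_scan(op, xs[:mid])
    right = dc_scan(op, xs[mid:])
    # Every result in the right half is completed with the total of the left half.
    return left + [op(left[-1], r) for r in right]


def union(s, t):
    return s | t


def disj(p, q):
    return p | q


def g(s):
    """Membership test for 'c'; g(s | t) == disj(g(s), g(t)) for all sets s, t."""
    return 1 if "c" in s else 0


x = [frozenset("a"), frozenset("bc"), frozenset("c"), frozenset()]

mapped_after = [g(s) for s in dc_scan(union, x)]   # map g (f (union) x)
mapped_before = dc_scan(disj, [g(s) for s in x])   # f (disj) (map g x)
```

Both sides agree, as the lemma predicts for any function that touches the elements only through the supplied operator.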
The sufficiency of two-values testing can be formulated as the following theorem. Note that this paper shows a stronger argument: it is sufficient to test only the binary disjunction \((\vee )\) defined below.
Theorem 2
Function \(f {:}{:} \forall \alpha .~(\alpha \rightarrow \alpha \rightarrow \alpha ) \rightarrow [\alpha ] \rightarrow [\alpha ]\) implements \( scanl \) for associative commutative idempotent operators if and only if \(f~(\vee )~x = scanl ~(\vee )~x\) for every list x over \(\{0,1\}\). Here, \(0 \vee 0 = 0\) and \(x \vee y = 1\) otherwise.
Proof
The “only if” direction is trivial. The proof of the “if” direction consists of the following two propositions.
Proposition 1
f implements \( scanl \) for associative commutative idempotent operators if \(f~(\cup )~x = scanl ~(\cup )~x\) for every x, where x is a list of sets and \((\cup )\) is the set union operator.
Proposition 2
If \(f~(\cup )~x \ne scanl ~(\cup )~x\) for some list x of sets, there exists a list y over \(\{0,1\}\) such that \(f~(\vee )~y \ne scanl ~(\vee )~y\).
Proposition 1 is an instance of a more general theory for testing polymorphic functions [15, 16] (Footnote 4). It is proven here to make the presentation self-contained. The strategy is to show that the equality for \((\cup )\) implies \(f~(\oplus )~x = scanl ~(\oplus )~x\) for any associative, commutative, and idempotent \(\oplus \). In the following reasoning, \((\circ )\) denotes the function composition operator: \((h_1 \circ h_2)~z = h_1~(h_2~z)\).
The reasoning relies on a function that combines all elements of a set by \(\oplus \); its definition assumes the associativity and commutativity of \(\oplus \). Moreover, the use of Lemma 1 exploits idempotency: if \(\oplus \) is not idempotent, the premise of Lemma 1 for this function and \((\cup )\) does not hold when the sets \(x_1\) and \(x_2\) share some elements in common.
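The reasoning behind Proposition 1 can be sketched as the chain of equalities below. This is a reconstruction under assumptions, not necessarily the paper's own derivation: \( singleton \) (mapping a to \(\{a\}\)) and \( reduce _{\oplus }\) (combining all elements of a non-empty finite set by \(\oplus \)) are names introduced here for illustration.

```latex
\begin{aligned}
f~(\oplus)~x
  &= f~(\oplus)~(map~(reduce_{\oplus} \circ singleton)~x)
     && \text{since } reduce_{\oplus}~\{a\} = a \\
  &= f~(\oplus)~(map~reduce_{\oplus}~(map~singleton~x)) \\
  &= map~reduce_{\oplus}~(f~(\cup)~(map~singleton~x))
     && \text{Lemma 1} \\
  &= map~reduce_{\oplus}~(scanl~(\cup)~(map~singleton~x))
     && \text{premise of Proposition 1} \\
  &= scanl~(\oplus)~(map~reduce_{\oplus}~(map~singleton~x))
     && \text{Lemma 1} \\
  &= scanl~(\oplus)~x
\end{aligned}
```

Both applications of Lemma 1 require \( reduce _{\oplus }~(s_1 \cup s_2) = reduce _{\oplus }~s_1 \oplus reduce _{\oplus }~s_2\), which is where idempotency is needed when \(s_1\) and \(s_2\) overlap.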
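Next, Proposition 2 is proven. Assume \(f~(\cup )~x \ne scanl ~(\cup )~x\); that is, there is a position m at which the two results, \(r_1\) and \(r_2\), differ. Then, there should exist \(b^* \in (r_1 \cup r_2)\) such that \(b^* \not \in (r_1 \cap r_2)\). Define the membership test for \(b^*\), which maps a set to 1 if it contains \(b^*\) and to 0 otherwise, and let y be the list obtained by applying this test to every element of x.
Since only one of \(r_1\) and \(r_2\) contains \(b^*\), the test yields different values for them. Because the test maps \((\cup )\) to \((\vee )\), Lemma 1 and routine reasoning prove \(f~(\vee )~y \ne scanl ~(\vee )~y\).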
\(\square \)
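Theorem 2 suggests a simple test harness: compare a candidate against the reference using only disjunction over lists of 0s and 1s. The sketch below is illustrative and restricted to four-element lists; `reordered` is the function from Sect. 3, and `forgets_one` is a hypothetical faulty candidate introduced here.

```python
from itertools import accumulate, product


def scanl(op, xs):
    """Reference prefix sum."""
    return list(accumulate(xs, op))


def disj(a, b):
    return a | b


def passes_disjunction_test(f, n):
    """Two-values testing with disjunction only, on all 2**n input lists."""
    return all(f(disj, list(y)) == scanl(disj, list(y))
               for y in product((0, 1), repeat=n))


def reordered(op, xs):
    """Sect. 3 example: position 3 combines a0, a2, a1, a3 in that order."""
    a0, a1, a2, a3 = xs
    out = scanl(op, xs)
    out[3] = op(op(op(a0, a2), a1), a3)
    return out


def forgets_one(op, xs):
    """Hypothetical faulty candidate: position 3 omits a2."""
    a0, a1, a2, a3 = xs
    out = scanl(op, xs)
    out[3] = op(op(a0, a1), a3)
    return out


reference_ok = passes_disjunction_test(scanl, 4)
reordered_ok = passes_disjunction_test(reordered, 4)
forgetful_ok = passes_disjunction_test(forgets_one, 4)
```

The reordered function passes, which is the right verdict here because reordering is harmless once the operator is commutative; the candidate that drops an element fails on the input \([0,0,1,0]\).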
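One might wonder whether two-values testing suffices if the binary operator is associative and commutative but not idempotent. It does not: the function f of Sect. 2.2 remains a counterexample, because the two sides of Eq. (1) can differ outside \(\{0,1\}\) even if \(\oplus \) is commutative (e.g., for integer addition).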
5 Related Work
Voigtländer [4] showed that test cases are sufficient for verifying a prefix sum for associative operators with n input elements. In fact, n test cases are sufficient when the operator is associative, commutative, and idempotent. The proof of Theorem 2 essentially checks whether a particular element, \(b^*\), is in a certain position, m, by setting 1 on the element corresponding to \(b^*\) and 0 on others.
There are a few other studies on testing-based verification of prefix-sum algorithms. Sheeran [5] proposed a test method similar to Proposition 1 in the proof of Theorem 2: preparing labels that stand for the elements, then checking whether the output consists of appropriate sets of labels. It requires only a single test, the input and output of which consume \(\mathrm {\Theta }(n \log n)\) and \(\mathrm {\Theta }(n^2 \log n)\) space, respectively, where n is the number of input elements. It is applicable even when the operator is commutative and/or associative.
Chong et al. [6] improved Sheeran’s method by using intervals instead of sets of labels. For example, given a list \([x_0,x_1,x_2,x_3]\), their approach represents \(x_0 \oplus x_1 \oplus x_2 \oplus x_3\) by an interval, namely, a pair of the first and last indices, (0, 3). The output of their method consumes only \(\mathrm {\Theta }(n \log n)\) space. However, we cannot use it when the operator \(\oplus \) is commutative in addition to associative. For example, if we calculate \(x_0 \oplus x_1 \oplus x_2 \oplus x_3\) by \((x_0 \oplus x_2) \oplus (x_1 \oplus x_3)\), neither \(x_0 \oplus x_2\) nor \(x_1 \oplus x_3\) corresponds to any interval.
This paper has developed a verification with n test cases for a prefix sum for associative commutative idempotent operators. By merging them into one with n-bit vectors, one can conduct an equivalent verification by a single test, the input and output of which require \(\mathrm {\Theta }(n^2)\) space. The efficiency of this approach is incomparable with Sheeran’s.
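The n test cases and their merged single-test form can be sketched as follows. This is a hedged illustration, assuming Python integers as n-bit vectors with bitwise OR as the operator; the function names and the faulty candidate `forgets_one` are hypothetical.

```python
from itertools import accumulate


def scanl(op, xs):
    """Reference prefix sum."""
    return list(accumulate(xs, op))


def bit_or(a, b):
    return a | b


def unit_vector_tests(f, n):
    """Run n tests; test i has 1 at position i and 0 everywhere else."""
    for i in range(n):
        y = [1 if j == i else 0 for j in range(n)]
        if f(bit_or, y) != scanl(bit_or, y):
            return False
    return True


def merged_test(f, n):
    """Single test: element j carries bit j, so output m must have bits 0..m set."""
    xs = [1 << j for j in range(n)]
    expected = [(1 << (m + 1)) - 1 for m in range(n)]
    return f(bit_or, xs) == expected


def forgets_one(op, xs):
    """Hypothetical faulty candidate: the last output omits the element before last."""
    out = scanl(op, xs)
    if len(xs) >= 3:
        out[-1] = op(out[-3], xs[-1])
    return out
```

Bit i of the merged run replays unit-vector test i, so the two procedures accept and reject the same candidates when the candidate only applies the supplied operator.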
6 Conclusion
This paper proved that two-values testing is sufficient for prefix-sum algorithms with associative, commutative, and idempotent operators but not for those with associative and idempotent ones. It thereby answered the question raised by Voigtländer [4]. These results provide a better understanding of Voigtländer’s 0–1–2 principle and may lead to similar principles for other computation patterns.
Notes
1. The \( scanl \) function in the Haskell standard Prelude library is defined differently and takes an additional parameter, which is the initial value. The difference does not affect the discussion in this paper.
2. Testing arbitrary lists is impossible. We may in reality consider testing lists of a certain length.
3. Given two characters, \(c_1\) and \(c_2\), let \((c_1/c_2)\) be the substitution that replaces every \(c_2\) in the given string with \(c_1\). For example, \((\texttt{a}/\texttt{b})~\texttt{abcab} = \texttt{aacaa}\). The function composition operator on such substitutions is associative and idempotent (i.e., \( (c_1/c_2) \circ (c_1/c_2) = (c_1/c_2)\)); moreover, it can distinguish \(f_1 = (\texttt{a}/\texttt{b}) \circ (\texttt{b}/\texttt{c}) \circ (\texttt{c}/\texttt{a}) \circ (\texttt{a}/\texttt{b})\) from \(f_2 = (\texttt{a}/\texttt{b}) \circ (\texttt{c}/\texttt{a}) \circ (\texttt{b}/\texttt{c}) \circ (\texttt{a}/\texttt{b})\) because \(f_1~\texttt{ab} = \texttt{aa} \ne \texttt{cc} = f_2~\texttt{ab}\).
4. These methods [15, 16] instantiate polymorphic functions by using an initial algebra, which is commonly understood as trees in functional programming. Proposition 1 uses the fact that sets form an initial algebra when we take associativity, commutativity, and idempotency into account (cf. Section 4.2 of [15]).
References
1. Knuth, D.: The Art of Computer Programming, vol. 3, 2nd edn. Addison Wesley Longman, Boston, MA, USA (1998)
2. Blelloch, G.E.: Prefix sums and their applications. In: Reif, J.H. (ed.) Synthesis of Parallel Algorithms, Chap. 1. Morgan Kaufmann Publishers, Burlington, MA, USA (1993)
3. Hinze, R.: An algebra of scans. In: Kozen, D., Shankland, C. (eds.) Mathematics of Program Construction, 7th International Conference, MPC 2004, Stirling, Scotland, UK, July 12–14, 2004, Proceedings. Lecture Notes in Computer Science, vol. 3125, pp. 186–210. Springer, Berlin, Germany (2004). https://doi.org/10.1007/978-3-540-27764-4_11
4. Voigtländer, J.: Much ado about two (pearl): a pearl on parallel prefix computation. In: Necula, G.C., Wadler, P. (eds.) Proceedings of the 35th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL 2008, San Francisco, California, USA, January 7–12, 2008, pp. 29–35. ACM, New York, NY, USA (2008). https://doi.org/10.1145/1328438.1328445
5. Sheeran, M.: Functional and dynamic programming in the design of parallel prefix networks. J. Funct. Program. 21(1), 59–114 (2011). https://doi.org/10.1017/S0956796810000304
6. Chong, N., Donaldson, A.F., Ketema, J.: A sound and complete abstraction for reasoning about parallel prefix sums. In: Jagannathan, S., Sewell, P. (eds.) The 41st Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL ’14, San Diego, CA, USA, January 20–21, 2014, pp. 397–410. ACM, New York, NY, USA (2014). https://doi.org/10.1145/2535838.2535882
7. Matsuzaki, K.: Functional models of Hadoop MapReduce with application to scan. Int. J. Parallel Program. 45(2), 362–381 (2017). https://doi.org/10.1007/s10766-016-0414-9
8. Safari, M., Oortwijn, W., Joosten, S.J.C., Huisman, M.: Formal verification of parallel prefix sum. In: Lee, R., Jha, S., Mavridou, A. (eds.) NASA Formal Methods - 12th International Symposium, NFM 2020, Moffett Field, CA, USA, May 11–15, 2020, Proceedings. Lecture Notes in Computer Science, vol. 12229, pp. 170–186. Springer, Berlin, Germany (2020). https://doi.org/10.1007/978-3-030-55754-6_10
9. Peyton Jones, S. (ed.): Haskell 98 Language and Libraries: The Revised Report. Cambridge University Press, Cambridge, UK (2003)
10. Lynch, T.W. Jr., Swartzlander, E.: The redundant cell adder. In: 10th IEEE Symposium on Computer Arithmetic, ARITH 1991, Grenoble, France, June 26–28, 1991, pp. 165–170. IEEE (1991). https://doi.org/10.1109/ARITH.1991.145553
11. Beaumont-Smith, A., Lim, C.: Parallel prefix adder design. In: 15th IEEE Symposium on Computer Arithmetic (Arith-15 2001), 11–17 June 2001, Vail, CO, USA, p. 218. IEEE (2001). https://doi.org/10.1109/ARITH.2001.930122
12. Day, N.A., Launchbury, J., Lewis, J.: Logical abstractions in Haskell. In: Proceedings of the 1999 Haskell Workshop. Utrecht University Department of Computer Science, Technical Report UU-CS-1999-28, Utrecht, Netherlands (1999)
13. Reynolds, J.C.: Types, abstraction and parametric polymorphism. Information Processing 83, 513–523 (1983)
14. Wadler, P.: Theorems for free! In: FPCA’89 Conference on Functional Programming Languages and Computer Architecture. Imperial College, London, England, 11–13 September 1989, pp. 347–359. ACM, New York (1989). https://doi.org/10.1145/99370.99404
15. Bernardy, J., Jansson, P., Claessen, K.: Testing polymorphic properties. In: Gordon, A.D. (ed.) Programming Languages and Systems, 19th European Symposium on Programming, ESOP 2010, Held as Part of the Joint European Conferences on Theory and Practice of Software, ETAPS 2010, Paphos, Cyprus, March 20–28, 2010. Proceedings. Lecture Notes in Computer Science, vol. 6012, pp. 125–144. Springer, Berlin, Germany (2010). https://doi.org/10.1007/978-3-642-11957-6_8
16. Hou, K., Wang, Z.: Logarithm and program testing. Proc. ACM Program. Lang. 6(POPL), 1–26 (2022). https://doi.org/10.1145/3498726
Acknowledgements
The author is grateful to the anonymous reviewers and area editor, Ichiro Hasuo, for their instructive comments for improving the paper. This work was supported by JSPS Grant-in-Aid for Scientific Research (C), Grant Number 19K11896.
Funding
Open access funding provided by The University of Tokyo.
About this article
Cite this article
Morihata, A. When does 0–1 Principle Hold for Prefix Sums?. New Gener. Comput. 41, 523–531 (2023). https://doi.org/10.1007/s00354-023-00219-0
DOI: https://doi.org/10.1007/s00354-023-00219-0