Theory and Practice of Formal Methods, pp 407–426
Quicksort Revisited
Abstract
We verify the correctness of a recursive version of Tony Hoare’s \(\texttt {quicksort}\) algorithm using the Hoare-logic-based verification tool Dafny. We then develop a non-standard, iterative version which is based on a stack of pivot locations rather than the standard stack of ranges. We outline an incomplete Dafny proof for the latter.
Keywords
Automated verification, Algorithms, Quicksort, Program transformation
1 Introduction
In 1959, while working on a project for automated translation from Russian to English, Tony Hoare found a recurring need to be able to sort word sequences into alphabetical order. To tackle this problem he invented an algorithm that was significantly faster than existing alternatives. The publication of this algorithm in 1961 as “Quicksort” [7] revolutionised the way we sort, and more generally, the way we think about and develop algorithms.
Since then, quicksort has inspired practitioners and researchers alike, including the recipient of this Festschrift. The algorithm has been modified and implemented millions of times by experienced programmers and students in several programming languages, and has even been choreographed as a Hungarian dance [16]. Beyond the fascination of its elegant and succinct presentation, it is also interesting because it involves two inner recursive calls, and thus reasoning and program transformations applied to the algorithm are non-trivial.
In 1971, Foley and Hoare presented a hand proof of the correctness of quicksort [5], and several proofs have been developed since. Proofs for the recursive as well as the iterative setting have also been proposed by de Boer and his co-authors in [1]. Recently, in his Turing Award lecture, Lamport showed an abstract derivation of iterative quicksort [9]. More recently, and rather surprisingly, de Gouw et al. [6] discovered a subtle bug in \(\texttt {Timsort}\), a sorting algorithm proposed in 2002, which is the implementation of \(\texttt {java.util.Arrays.sort}\) [13] for non-primitive types and part of the Android platform. They discovered the bug while trying to prove the correctness of \(\texttt {Timsort}\) using the Hoare-logic-based tool KeY [2].
In this paper, we reason about the correctness of two versions of quicksort: a recursive version and an iterative version. We too use a Hoare-logic-based tool, namely Dafny [10].
Our recursive quicksort method deviates slightly from the standard version presented in the literature, in that we split the array into three subarrays, the middle one of length one, and then call the function recursively on the first and third subarrays.
Our iterative quicksort method is, to our knowledge, novel, in that rather than storing ranges (i.e. pairs of values) in a stack, we only store the locations of the pivots (i.e. one value), thus saving both space and time.
We have used the tool Dafny to check our implementations. To facilitate the proofs, we have defined and used lemmas in the proof of the code. We have proven some, but not all of these lemmas in Dafny.
1.1 Contributions

A proof of correctness for our variant of recursive \(\texttt {quicksort}\) in Dafny.

A new, iterative version of \(\texttt {quicksort}\) based on the pivot locations.

A proof outline for the correctness of our iterative \(\texttt {quicksort}\) in Dafny.
The complete Dafny code for our work can be found at [3]. To the best of our knowledge, there was no proof of imperative recursive quicksort in Dafny before our work. However, Leino has recently developed a proof in Dafny of the standard functional recursive algorithm, as well as an alternative version of the iterative algorithm based on ranges. Both can be found in the Dafny test suite [11]. Also, to the best of our knowledge, there is no existing version of iterative quicksort based on pivots. A comparison of its efficiency with other algorithms is future work.

Section 2 presents the notation and lemmas we will be using to specify and prove \(\texttt {quicksort}\).
Section 3 shows three recursive versions of quicksort:
 (1) Recursive \(\texttt {quicksort}\) as proposed in Hoare’s original paper.
 (2) Recursive \(\texttt {quicksort}\) as commonly seen in the literature.
 (3) Recursive \(\texttt {quicksort}\) with the variation that the two subranges are off by one, and an outline of its proof of correctness.
Section 4 shows two iterative versions of \(\texttt {quicksort}\):
 (1) Iterative \(\texttt {quicksort}\) with a stack simulation of recursion.
 (2) Novel iterative \(\texttt {quicksort}\) based on a stack of pivot locations, and an outline of its proof of correctness.

Section 5 concludes the paper with an evaluation of our work and an identification of future directions of research.
2 Specifying Quicksort
We now turn to one of the most important parts of automated program verification: specifying the program we wish to implement.
2.1 Sorting – The Task
Let’s start by defining the task of sorting the contents of an array.
Given an array \(\texttt {a}\) of integers^{1} we want to rearrange the array so that its elements are in ascending order. Additionally, we must ensure that no elements are added to or removed from the array.
2.2 Notation, Predicates and Lemmas
Throughout this paper we adopt the Dafny convention of treating arrays as pointers to sequences of values. That is, we think of the array a as a pointer to the sequence \(a\texttt {[}0\texttt {]},a\texttt {[}1\texttt {]},a\texttt {[}2\texttt {]},...,a\texttt {[}|a|-1\texttt {]}\), where \(|a|\) is the length of array a.
Note that Dafny represents sequences with the syntax \(\texttt {a[m..n]}\), which is equivalent to the meaning of a[m..n) from our notation. Therefore, whenever the terms \(\texttt {a[m..n]}\) or \(\texttt {a[..]}\) appear in our Dafny code, their meaning should be interpreted as a[m..n), or a[..), respectively.

\(Count(\,a[i..j),\,v\,)\) tracks the number of times that v occurs in the slice a[i..j).

\(\,a[i..j) \sim \,b[m..n)\) states that slice a[i..j) is a permutation of slice b[m..n).

\(Swapped(a[..),\,b[..),\,i,\,j)\) states that the sequences a[..) and b[..) are exactly the same except that the elements at positions i and j have been swapped.
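The three notions above can be read as executable checks. The following is a minimal sketch in Python rather than Dafny, so that it can be run directly; the function names `count`, `is_permutation` and `swapped` are our own illustrative choices, and defining permutation through equal occurrence counts is an assumption about how \(\sim\) is encoded.

```python
from collections import Counter


def count(seq, v):
    """Count(seq, v): number of times v occurs in the slice seq."""
    return sum(1 for x in seq if x == v)


def is_permutation(s, t):
    """s ~ t: every value occurs equally often in both slices (assumed encoding)."""
    return Counter(s) == Counter(t)


def swapped(a, b, i, j):
    """Swapped(a, b, i, j): a and b agree everywhere except that the
    elements at positions i and j have been exchanged."""
    if len(a) != len(b) or not (0 <= i < len(a) and 0 <= j < len(a)):
        return False
    return (a[i] == b[j] and a[j] == b[i]
            and all(a[k] == b[k] for k in range(len(a)) if k not in (i, j)))
```

With half-open Python slices, `a[i:j]` plays the role of a[i..j), so `is_permutation(a[i:j], b[m:n])` corresponds to \(a[i..j) \sim b[m..n)\).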
All the operators and predicates above are available, or can be easily encoded, in Dafny. However, they cannot always be written in infix or symbolic notation.
 Deep Equality:$$\begin{aligned} \begin{array}{ll} a \approx b \longrightarrow b \approx a & \qquad a \approx b \, \wedge \, b \approx c \longrightarrow a \approx c \\ a \approx b \longrightarrow |a| = |b| & \qquad a \approx b \longrightarrow a \sim b \end{array} \end{aligned}$$
 Ranges:$$\begin{aligned} \begin{array}{l} a \approx b[0..i) \mathbin {\texttt {++}} a[i..j) \mathbin {\texttt {++}} b[j..|b|) \, \wedge \, m \le i \le j \le n \longrightarrow a \approx b[0..m) \mathbin {\texttt {++}} a[m..n) \mathbin {\texttt {++}} b[n..|b|) \\ a \approx b[0..i) \mathbin {\texttt {++}} a[i..j) \mathbin {\texttt {++}} b[j..|b|) \, \wedge \, a[i..j) \sim b[i..j) \longrightarrow a \sim b \\ a \approx a[0..i) \mathbin {\texttt {++}} b[i..j) \mathbin {\texttt {++}} a[j..|a|) \, \wedge \, b \approx c \longrightarrow a \approx a[0..i) \mathbin {\texttt {++}} c[i..j) \mathbin {\texttt {++}} a[j..|a|) \end{array} \end{aligned}$$
 Permutation:$$\begin{aligned} \begin{array}{l} a \sim b \longrightarrow b \sim a \\ a \sim b \, \wedge \, b \sim c \longrightarrow a \sim c \\ a \sim b \longrightarrow |a| = |b| \end{array} \end{aligned}$$
 Swapping:$$\begin{aligned} \begin{array}{l} Swapped(a,\,b,\,i,\,i) \longrightarrow a \approx b \\ Swapped(a,\,b,\,i,\,j) \longrightarrow a \sim b \end{array} \end{aligned}$$
 Sorting:$$\begin{aligned} \begin{array}{l} Sorted(\,a[i..j)\,) \, \wedge \, i \le m \, \wedge \, n \le j \longrightarrow Sorted(\,a[m..n)\,) \end{array} \end{aligned}$$
2.3 Specifying Methods
Method specifications consist of a Precondition, expected to hold before the method is executed, and a Postcondition, that the code must ensure holds after the method terminates. We use the Dafny keywords \(\texttt {requires}\) and \(\texttt {ensures}\) to refer to the precondition and postcondition of a method respectively. We use the Dafny keyword \(\texttt {assert}\) within our code to introduce assertions, or mid-conditions. We also use the Dafny keywords \(\texttt {decreases}\) and \(\texttt {invariant}\) to introduce variants and invariants for loops and recursive methods.
Given some code with precondition P and postcondition Q, we adopt the total correctness interpretation of such a specification [12], whereby
For all program states that satisfy the precondition P, the code will run without faulting and will terminate in a program state that satisfies the postcondition Q.
Sometimes, in our specifications, we need to refer to both the current and initial values of some variables. For example, in the code snippet \(\texttt {x := x+3}\), the new value of \(\texttt {x}\) depends on its previous value. By default, all of our specifications refer to the current values of variables. As in Dafny, we use the keyword \(old(\,.\,)\) to indicate the value before a method call. For example, \(old(\,\texttt {x}\,)\) represents the value of the program variable \(\texttt {x}\) before the call to the current method. Notice that arrays are pointers to sequences. So, if we have an array a, the term \(old(\,\texttt {a}\,)\) is the value of the pointer before the call, \(old(\,\texttt {a}\,)[..)\) represents the current contents of the array that this pointer refers to, while \(old(\,\texttt {a}[..)\,)\) represents the contents of the array before the call.
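The distinction between the old pointer and the old contents can be illustrated outside Dafny. The sketch below is in Python, where a list variable is likewise a reference to mutable contents; the function name `snapshot_demo` and its variable names are hypothetical and chosen only for this illustration.

```python
def snapshot_demo():
    a = [3, 1, 2]
    old_pointer = a          # plays the role of old(a): the same reference
    old_contents = list(a)   # plays the role of old(a[..)): a copy taken before the call
    a.sort()                 # stands in for a method that modifies a in place
    # old_pointer still refers to the (now modified) array, so reading it
    # afterwards yields the current contents, as with old(a)[..).
    return old_pointer is a, old_pointer, old_contents
```

After the in-place modification, only the copied snapshot still holds the original sequence, which is why a postcondition about rearrangement has to mention the old contents rather than the old pointer.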
When writing specifications we use both Dafny syntax and normal mathematical notation as well as our sequence notation as developed in Sect. 2.2. For example, we write \(\forall \) and \(\wedge \) rather than \(\texttt {forall}\) and \(\texttt {\&\&}\).
2.4 The Specification
This specification requires that the input be a non-null, non-empty array (to rule out pathological input) and ensures that the resulting array is sorted. Additionally, the specification states that no elements are added to or deleted from the array.
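As a rough executable reading of this specification, the following Python sketch checks the precondition and postcondition at run time instead of proving them statically. The names `is_sorted`, `satisfies_precondition` and `satisfies_postcondition` are hypothetical, and the permutation check via occurrence counts is an assumed encoding of "no elements are added or deleted".

```python
from collections import Counter


def is_sorted(seq):
    """Sorted(seq): every element is <= its successor."""
    return all(seq[k] <= seq[k + 1] for k in range(len(seq) - 1))


def satisfies_precondition(a):
    """The input must be a non-null, non-empty array."""
    return a is not None and len(a) > 0


def satisfies_postcondition(old_contents, new_contents):
    """The result is sorted and is a permutation of the old contents."""
    return (is_sorted(new_contents)
            and Counter(old_contents) == Counter(new_contents))
```

A test harness could take a copy of the array before sorting and then call `satisfies_postcondition(copy, a)`; unlike the Dafny proof, this only checks the inputs that are actually run.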
3 Recursive Quicksort
Having identified the task that we need to solve, we now provide several different implementations of quicksort, ranging from classic to more inventive solutions.
 1. Choose an element in the list; this element serves as the pivot. Set it aside (e.g. move it to the beginning or end).
 2. Partition the array of elements into two sets: those less than the pivot and those greater than or equal to the pivot.
 3. Repeat steps 1 and 2 on each of the two resulting partitions until each set has one or fewer elements.
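The three steps can be sketched as follows. This is a deliberately simple, non-in-place outline in Python, not one of the versions studied in this paper; the function name `quicksort_outline` and the choice of the first element as pivot are assumptions made for the illustration.

```python
def quicksort_outline(items):
    """Return a sorted copy of items, following the three steps literally."""
    if len(items) <= 1:
        # Step 3 stops here: one or fewer elements are already sorted.
        return list(items)
    # Step 1: choose a pivot (here: the first element) and set it aside.
    pivot, rest = items[0], items[1:]
    # Step 2: partition the remaining elements around the pivot.
    smaller = [x for x in rest if x < pivot]
    greater_or_equal = [x for x in rest if x >= pivot]
    # Step 3: repeat on both partitions and put the pieces back together.
    return quicksort_outline(smaller) + [pivot] + quicksort_outline(greater_or_equal)
```

The in-place versions discussed below avoid the extra lists by rearranging elements within a range of the array.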
3.1 The Original Quicksort
3.2 The Standard Quicksort
3.3 Quicksort – Our Version
Below we show our version of recursive \(\texttt {quicksort}\). In fact, this version was shown to us by Krysia Broda. It is very similar to the standard version, but with a little twist added: our version splits the array into three, rather than two, parts: one part whose elements are smaller than the pivot, a middle part of length one that holds the pivot itself, and one part whose elements are greater than or equal to the pivot. The recursive calls then need only be made on the first and the third part; the pivot remains where it was placed in the current call.
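A minimal Python sketch of this three-way split is given below. It is our reconstruction from the description in the text (pivot taken from position \(\texttt {from}\), then swapped to position \(\texttt {mid}-1\)), not the Dafny listing itself, so line numbers mentioned in the text do not refer to it; the helper name `partition` and the parameter names `frm` and `to` are assumptions.

```python
def partition(a, lo, hi, pivot):
    """Rearrange a[lo..hi) so that elements < pivot come first.
    Returns mid with a[lo..mid) < pivot <= a[mid..hi)."""
    mid = lo
    for k in range(lo, hi):
        if a[k] < pivot:
            a[k], a[mid] = a[mid], a[k]
            mid += 1
    return mid


def quicksort(a, frm=0, to=None):
    """Sort a[frm..to) in place."""
    if to is None:
        to = len(a)
    if frm + 1 >= to:
        return  # zero or one element: nothing to do
    pivot = a[frm]
    mid = partition(a, frm + 1, to, pivot)
    # Now a[frm+1..mid) < pivot <= a[mid..to); move the pivot between them.
    a[frm], a[mid - 1] = a[mid - 1], a[frm]
    # Now a[frm..mid-1) < a[mid-1] <= a[mid..to); a[mid-1] is final.
    quicksort(a, frm, mid - 1)
    quicksort(a, mid, to)
```

The two recursive ranges, [frm..mid-1) and [mid..to), skip the pivot position, which is the "off by one" property exploited later by the pivot-based iterative version.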

PRE: before the method call (i.e. the precondition)

MID_2: after the partitioning call (i.e. at line 14)

MID_3: after the call that swaps the pivot into place (i.e. at line 20)

MID_4: after the first recursive call of \(\texttt {quicksort}\) (i.e. at line 26)

MID_5: after the second recursive call of \(\texttt {quicksort}\) (i.e. at line 32)

POST: as an implication of the previous assertion (i.e. again at line 32)
From the eighteen assertions mentioned in the code, Dafny only needed help with the proofs of four, and needed no help at all for the case where \(\texttt {from} + 1 \ge \texttt {to}\). We now list the lemmas used above, using the convention that a, b, c stand for sequences of type T, while \(elem\in T\) is a possible value, and i, j, k, l, m and n are natural numbers.
 L_swap_impl_sameUpTo( a , b , i , j ): This lemma says that swapping creates a permutation of the original array, leaving the [..i) and the \([j+1..)\) ranges unmodified. The proof follows by unfolding the definitions.$$\begin{aligned} |a| = |b|&\, \wedge \, i \le j < |a| \, \wedge \, a[..) \sim b[..) \, \wedge \, Swapped(a,\,b,\,i,\,j)\\&\longrightarrow \ a \approx b[0..i) \mathbin {\texttt {++}} a[i..j+1) \mathbin {\texttt {++}} b[j+1..) \, \wedge \, a[..) \sim b[..) \end{aligned}$$
 L_sameUpTo_prsrv_less( a , b , elem , m , n ): This lemma says that if an array a is a permutation of an array b, and is identical with b in the ranges [..m) and [n..), and b is smaller than elem in the range [m..n), then a is also smaller than elem in the range [m..n). The proof follows by establishing that \(a[m..n) \sim b[m..n)\).$$\begin{aligned}&|a| = |b| \, \wedge \, a \approx b[..m) \mathbin {\texttt {++}} a[m..n) \mathbin {\texttt {++}} b[n..) \, \wedge \, a[..) \sim b[..)\\&\, \wedge \, b[m..n) < elem\\&\qquad \quad \longrightarrow \ a[m..n) < elem \end{aligned}$$
 L_sameUpTo_prsrv_grEq( a , b , elem , m , n ): This lemma says that if an array a is a permutation of an array b, and is identical with b in the ranges [..m) and [n..), and b is greater than or equal to elem in the range [m..n), then a is also greater than or equal to elem in the range [m..n). The proof follows by establishing that \(a[m..n) \sim b[m..n)\).$$\begin{aligned}&|a| = |b| \, \wedge \, a \approx b[..m) \mathbin {\texttt {++}} a[m..n) \mathbin {\texttt {++}} b[n..) \, \wedge \, a[..) \sim b[..)\\&\, \wedge \, elem \le b[m..n) \\&\qquad \quad \longrightarrow \ elem \le a[m..n) \end{aligned}$$
 L_conc_impl_Sorted( a , i , j , k ): This lemma says that concatenation of two sorted subranges \([i..j-1)\) and [j..k), where the left subrange contains smaller elements than the element at \(a[j-1]\), and where \(a[j-1]\) is smaller than or equal to the elements at [j..k), produces a sorted range [i..k). The proof follows by unfolding the definitions.$$\begin{aligned}&i < j \le k \le |a| \, \wedge \, i < |a| \, \wedge \, Sorted(\,a[i..j-1)\,) \, \wedge \, Sorted(\,a[j..k)\,)\\&\, \wedge \, a[i..j-1) < a[j-1] \le a[j..) \\&\qquad \quad \longrightarrow \ Sorted(\,a[i..k)\,) \end{aligned}$$
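As a sanity check that is independent of the Dafny proof, a lemma of this shape can be tested exhaustively on small inputs. The Python sketch below does this for L_conc_impl_Sorted; it is bounded testing, not a proof, and the function names and the bounds (`max_len`, `values`) are arbitrary choices made for the illustration.

```python
from itertools import product


def is_sorted(s):
    return all(s[k] <= s[k + 1] for k in range(len(s) - 1))


def conc_hypotheses(a, i, j, k):
    """Sorted(a[i..j-1)), Sorted(a[j..k)) and a[i..j-1) < a[j-1] <= a[j..)."""
    pivot = a[j - 1]
    return (is_sorted(a[i:j - 1]) and is_sorted(a[j:k])
            and all(x < pivot for x in a[i:j - 1])
            and all(pivot <= x for x in a[j:]))


def check_conc_impl_sorted(max_len=4, values=(0, 1, 2)):
    """Check the lemma for all small arrays and all i < j <= k <= |a|.
    Returns the number of instances whose hypotheses held."""
    checked = 0
    for n in range(1, max_len + 1):
        for a in product(values, repeat=n):
            for i in range(n):
                for j in range(i + 1, n + 1):
                    for k in range(j, n + 1):
                        if conc_hypotheses(a, i, j, k):
                            assert is_sorted(a[i:k]), (a, i, j, k)
                            checked += 1
    return checked
```

Calling `check_conc_impl_sorted()` raises an `AssertionError` with a counterexample if the statement fails on any of the enumerated instances.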
4 Iterative Quicksort
An iterative version of quicksort can be obtained from the recursive one directly by applying the standard transformation of recursion. This is shown in Sect. 4.1. A more interesting (and more efficient) iterative version can be obtained if we observe some properties of the first version. This is shown in Sect. 4.2.
4.1 Iterative Quicksort Version 1 – Simulating Method Arguments
4.2 Iterative Quicksort Version 2 – Pivot Storage
Preliminaries: We now discuss the second version of iterative \(\texttt {quicksort}\), which, to the best of our knowledge, is novel. Rather than just translating the recursion into iteration, as we did in Sect. 4.1, we instead draw inspiration from observing the following two facts about the code from Sect. 4.1: Firstly, neighbouring \(\texttt {to}\) and \(\texttt {from}\) values are off by 1; this can be seen in lines 22 and 23. Secondly, after swapping the array elements at \(\texttt {from}\) and \((\texttt {mid}-1)\) (line 19), the contents of the array at \((\texttt {mid}-1)\) never change.
This led us to the idea that, rather than pushing and popping the ranges on which we operate (i.e. the values \(\texttt {from}\) and \(\texttt {to}\)) we can instead work with the final location of the pivot \((\texttt {mid}-1)\). We know that the contents of the array at this location will not change, and we also know that the next range to operate on will start at the location succeeding the location of the current pivot. Therefore, we use an array of pivot locations, called \(\texttt {pivs}\).
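The idea can be sketched in Python as follows. This is our own reconstruction from the two observations above, not the Dafny code of this section: a Python list stands in for the array of pivot locations, the length of the array is pushed as a sentinel "pivot" past the end, and the names `partition` and `quicksort_pivots` are assumptions.

```python
def partition(a, lo, hi, pivot):
    """Rearrange a[lo..hi) so that a[lo..mid) < pivot <= a[mid..hi); return mid."""
    mid = lo
    for k in range(lo, hi):
        if a[k] < pivot:
            a[k], a[mid] = a[mid], a[k]
            mid += 1
    return mid


def quicksort_pivots(a):
    """Sort a in place using a stack of pivot locations instead of ranges."""
    pivs = [len(a)]  # sentinel: behaves like a pivot just past the end
    frm = 0          # the current range is always a[frm..pivs[-1])
    while pivs:
        to = pivs[-1]
        if frm + 1 < to:
            # At least two elements: partition and record where the pivot ends up.
            pivot = a[frm]
            mid = partition(a, frm + 1, to, pivot)
            a[frm], a[mid - 1] = a[mid - 1], a[frm]
            pivs.append(mid - 1)  # a[mid-1] never changes from now on
        else:
            # Zero or one element: the range is sorted, and so is the pivot
            # at `to`; the next range starts right after that pivot.
            frm = to + 1
            pivs.pop()
```

Each stack entry is a single index, whereas a range-based stack stores a pair; the upper end of the current range is read off the top of the stack and the lower end is carried in `frm`.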
Verification: In our Dafny proof we wrote twenty-four \(\texttt {assert}\) statements to guide the prover, and called five lemmas at the code locations listed above. The lemmas are given below and proven in the next subsection. In the following, a, b and c stand for sequences, while i, j, k, m and n are natural numbers.
 L_sorted_combine( a , m , n ): The lemma above increases the range for which we know that an array a is sorted.$$\begin{aligned}&m \le n \le m+1 \, \wedge \, Sorted(\,a[..m+1)\,) \, \wedge \, a[..n) < a[n] \le a[n+1..)\\&\qquad \quad \longrightarrow \ Sorted(\,a[..m+2)\,) \end{aligned}$$
 L_prsrv_pivot( a , m ): The lemma above increases the range for which we know that elements are smaller than the elements in the remaining array.$$\begin{aligned} \begin{array}{l} m < |a| \, \wedge \, a[..m) < a[m] \le a[m+1..) \, \longrightarrow \ a[..m+1) \le a[m+1..) \end{array} \end{aligned}$$
 L_swap_prsrv_less( a , b , m , n ): The lemma above asserts that, after swapping, a pivot correctly partitions the array. The left subsequence is smaller than the right subsequence and the middle subsequence is smaller than the element \(a[n-1]\).$$\begin{aligned}&m < n \le |b| \, \wedge \, b[..m) < b[m..) \, \wedge \, b[m+1..n) < b[m] \, \wedge \, |a| = |b| \\&\wedge \, Swapped(a,\,b,\,m,\,n-1) \\&\qquad \quad \longrightarrow \ a[..m) < a[m..) \, \wedge \, a[m..n-1) < a[n-1] \end{aligned}$$
 L_sameUpTo_trans( a , b , c , m , n ): The lemma above asserts that permutation, and array composition from subarrays, are transitive relations.$$\begin{aligned}&|a| = |b| = |c| \, \wedge \, m < n \le |a| \, \wedge \, a \approx b[..m) \mathbin {\texttt {++}} a[m..n) \mathbin {\texttt {++}} b[n..) \\&\, \wedge \, a[..) \sim b[..) \, \wedge \, b \approx c[..m+1) \mathbin {\texttt {++}} b[m+1..n) \mathbin {\texttt {++}} c[n..) \\&\, \wedge \, b[..) \sim c[..) \\&\qquad \qquad \longrightarrow \ a \approx c[..m) \mathbin {\texttt {++}} a[m..n) \mathbin {\texttt {++}} c[n..) \, \wedge \, a[..) \sim c[..) \end{aligned}$$
 L_sameUpTo_prsv_sorted( a , b , i , j ): This lemma ensures that swapping preserves sortedness of subranges of the array.$$\begin{aligned}&|a| = |b| \, \wedge \, i < j \le |b| \, \wedge \, Sorted(\,b[..i+1)\,) \, \wedge \, a[..i) \le a[i..) \\&\, \wedge \, a \approx b[..i) \mathbin {\texttt {++}} a[i..j) \mathbin {\texttt {++}} b[j..) \\&\qquad \quad \longrightarrow \ Sorted(\,a[..i+1)\,) \end{aligned}$$
4.3 Proofs
We now show the proofs of these lemmas.

Proof of L_sorted_combine( a , m , n ):
Given
\(\begin{array}{lll} \ \ \ &{} (1)\ &{} m \le n \le m+1 \\ &{} (2) &{} Sorted(\,a[..m+1)\,) \\ &{} (3) &{} a[..n) < a[n] \le a[n+1..) \end{array} \)
To show
\(\begin{array}{lll} \ \ \ (A)&Sorted(\,a[..m+2)\,) \end{array} \)
From (1), we obtain that either \(m=n\) or \(m+1=n\). We proceed by case analysis.

1st Case :
(4) \(m=n\)
Then we have
\( \begin{array}{lll} (5) & a[..m) < a[m] \le a[m+1..) \ \ \ \ & \text{from (3) and (4)}\\ (6) & a[m] \le a[m+1] & \text{from (5)} \\ (A) & Sorted(\,a[..m+2)\,) & \text{from (2) and (6)} \end{array} \)

2nd Case :
(4) \(m+1=n\)
Then we have
\( \begin{array}{lll} (5) &{} a[..m+1) < a[m+1] \ \ \ \ \ \ \ \ \ &{} \text{ from } \text{(3) } \text{ and } \text{(4) }\\ (A) &{} Sorted(\,a[..m+2)\,) &{} \text{ from } \text{(2) } \text{ and } \text{(5) } \end{array} \)


Proof of L_prsrv_pivot( a , m ): by unfolding the definitions.

Proof of L_swap_prsrv_less( a , b , m , n ):
Given
\(\begin{array}{lll} \ \ \ & (1) & m < n \le |b| \\ & (2) & b[..m) < b[m..) \\ & (3) & b[m+1..n) < b[m] \le b[n+1..) \\ & (4) & |a| = |b| \\ & (5) & Swapped(a,\,b,\,m,\,n-1) \\ \end{array} \)
To Show
\(\begin{array}{lll} \ \ \ & (A) & a[..m) < a[m..) \\ & (B) & a[m..n-1) < a[n-1] \le a[n..) \end{array} \)
We obtain
\(\begin{array}{llll} \ \ \ & (6) & a[..m) \approx b[..m) & \text{from (5)}\\ & (7) & a[m] = b[n-1] & \text{from (5)}\\ & (8) & a[m+1..n-1) \approx b[m+1..n-1) & \text{from (5)}\\ & (9) & a[n-1] = b[m] & \text{from (5)}\\ & (10) & a[n..) \approx b[n..) & \text{from (5)}\\ & (A) & a[..m) < a[m..) & \text{from (2), (7)–(10)}\\ & (11) & a[m..n-1) \approx b[n-1] \mathbin {\texttt {++}} b[m+1..n-1) & \text{from (7), (8)} \\ & (12) & a[m..n-1) < b[m] & \text{from (11), (2) and (3)} \\ & (13) & a[m..n-1) < a[n-1] & \text{from (12), (9)} \\ & (14) & a[n-1] \le a[n..) & \text{from (3), (9) and (10)} \\ & (B) & a[m..n-1) < a[n-1] \le a[n..) & \text{from (13) and (14)} \end{array} \)

Proof of L_sameUpTo_trans( a , b , c , m , n ):
Given
\(\begin{array}{lll} \ \ \ & (1) & |a| = |b| = |c| \\ & (2) & m < n \le |a| \\ & (3) & a \approx b[..m) \mathbin {\texttt {++}} a[m..n) \mathbin {\texttt {++}} b[n..) \\ & (4) & a[..) \sim b[..) \\ & (5) & b \approx c[..m+1) \mathbin {\texttt {++}} b[m+1..n) \mathbin {\texttt {++}} c[n..) \\ & (6) & b[..) \sim c[..) \end{array}\)
To Show
\(\begin{array}{lll} \ \ \ &{} (A) &{} a \approx c[..m) \mathbin {\texttt {++}}a[m..n) \mathbin {\texttt {++}}c[n..) \\ &{} (B) &{} \,a[..) \sim \,c[..) \end{array}\)
We obtain
\(\begin{array}{llll} \ \ \ &{} (B) &{} \,a[..) \sim \,c[..) &{} \text{ from } \text{(4) } \text{ and } \text{(6) }\\ &{} (7) &{} b[..m) \approx c[..m) &{} \text{ from } \text{(5), } \text{ and } \text{ by } m < m+1\\ &{} (8) &{} b[n..) \approx c[n..) &{} \text{ from } \text{(5) }\\ &{} (A) &{} a \approx c[..m) \mathbin {\texttt {++}}a[m..n) \mathbin {\texttt {++}}c[n..)\ \ \ \ &{} \text{ from } \text{(3), } \text{(7) } \text{ and } \text{(8) } \end{array}\)

Proof of L_sameUpTo_prsv_sorted( a , b , i , j ):
Given
\(\begin{array}{lll} \ \ \ & (1) & |a| = |b| \\ & (2) & i < j \le |b| \\ & (3) & Sorted(\,b[..i+1)\,) \\ & (4) & a[..i) \le a[i..) \\ & (5) & a \approx b[..i) \mathbin {\texttt {++}} a[i..j) \mathbin {\texttt {++}} b[j..) \\ \end{array}\)
To Show
\(\begin{array}{lll} \ \ \ (A)&Sorted(\,a[..i+1)\,) \end{array}\)
We obtain
\(\begin{array}{llll} \ \ \ & (6) & Sorted(\,b[..i)\,) & \text{from (3) and because } i < i+1\\ & (7) & Sorted(\,a[..i)\,) & \text{from (5) and (6)}\\ & (8) & i \ge 1 \ \rightarrow \ a[i-1] \le a[i] & \text{from (4)}\\ & (A) & Sorted(\,a[..i+1)\,) \ \ \ & \text{from (7) and (8)} \end{array}\)
5 Experiences, Conclusions and Future Work
Despite extensive testing and handwritten proofs, it was reassuring when Dafny confirmed the correctness of our \(\texttt {quicksort}\). We found array-sequence infix operators to be useful in the development of both the algorithm and reasoning.
Dafny was extremely effective in helping us iron out many little, fiddly bugs in the early stages of our work. As we progressed, the process became both slow and addictive. Those of us new to Dafny were often surprised to see that Dafny/Z3 could automatically discharge proof obligations which were, in our opinion, non-trivial, while it was often unable to discharge what we considered trivial ones. This was due to our limited previous understanding of Z3.
We therefore proceeded in a somewhat experimental fashion. We inserted \(\texttt {assume}\) statements for all the proof obligations, and gradually replaced them by \(\texttt {assert}\) statements. When the verifier was unable to discharge an obligation, we wrote a lemma, whose validity we checked through handwritten proofs. As a result, the lemmas we have developed do not seem to be the most interesting or intuitive ones, and their choice might have been affected by the particular order in which we happened to require them.
The computational power needed for the proofs to go through was considerable. Therefore, we adopted little tricks to focus the tool on particular aspects of the proof. For example, we would replace part of the code with \(\texttt {assume false}\), so that the tool would not need to check validity past this point. We also split the proof of the pivot-based iterative \(\texttt {quicksort}\) into two: First we replaced the code in the \(\texttt {else}\) branch by \(\texttt {assume false}\). This let us prove that the initialization establishes the loop invariant and that the \(\texttt {then}\) branch of the loop preserves it. Then we wrote a function whose body consists of \(\texttt {assume}\) statements for all the loop invariants, followed by the code from the \(\texttt {else}\) branch of the loop and ending in \(\texttt {assert}\) statements for all the loop invariants. This let us prove that the \(\texttt {else}\) branch of the loop also preserves the loop invariant.
The experimental fashion for discovering useful lemmas, and the tricks to focus the tool on certain aspects, are often seen in the Verification Corner videos [14]. We believe that Visual Studio should provide more automatic support for steering the proof effort and more help with interactive program and proof development.
As future work, we would like to complete the proofs of the lemmas we have used, complete the proofs of the other two versions of \(\texttt {quicksort}\), and try to unify the arguments used in the various proofs. We would also like to run benchmarks to compare the efficiency of our pivot-based algorithm with that of other algorithms in the literature. Finally, we want to port the Dafny proofs to our tool Apollo [4], which maps Java and Haskell code and proof idioms onto Dafny.
Footnotes
 1.
The sorting task can actually be defined for an array of any type that has a less-than-or-equal relation \(\le \).
 2.
The careful reader will notice that the array lookup is not always defined. Nevertheless, the assertion is well-formed, because it stands for \(\forall i \in [\texttt {top}..|\texttt {a}|).\forall j \in [0.. \texttt {pivs[}i\texttt {]}).\forall k \in [\texttt {pivs[}i\texttt {]}+1..|\texttt {a}|).\, \texttt {a[}j\texttt {]} < \texttt {a[pivs[}i\texttt {]]} \le \texttt {a[}k\texttt {]}\).
Notes
Acknowledgments
We thank Krysia Broda for showing us the recursive, non-standard version of quicksort, and the anonymous reviewers of this volume for valuable suggestions and pointers.
Razvan Certezeanu, Benjamin Egelund-Müller and Sinduran Sivarajan thank the Department of Computing at Imperial College for funding their Undergraduate Research Opportunities Programme (UROP) Placements, undertaken under Mark Wheelhouse’s supervision, which they spent working on Apollo, and this paper.
Sophia Drossopoulou thanks Microsoft Research and Judith Bishop for a research gift and her very warm hospitality at Microsoft Research, and the EU project Upscale, FP7-612985, for supporting part of this work, and for the opportunity to collaborate with Frank S. de Boer, the recipient of this Festschrift.
References
 1. Apt, K., Boer, F., Olderog, E.: Verification of Sequential and Concurrent Programs. Springer, Dordrecht (2009)
 2. Beckert, B., Hähnle, R., Schmitt, P.H. (eds.): Verification of Object-Oriented Software. The KeY Approach. LNCS (LNAI), vol. 4334. Springer, Heidelberg (2007)
 3. Certezeanu, R., Drossopoulou, S., Egelund-Müller, B., Sivarajan, S., Wheelhouse, M., Leino, K.: Dafny Code for Variations on Quicksort. http://www.doc.ic.ac.uk/~mjw03/research/quicksort.html
 4. Certezeanu, R., Drossopoulou, S., Egelund-Müller, B., Sivarajan, S., Wheelhouse, M., Leino, K.: Apollo: An interactive Program and Proof development tool for Java and Haskell, based on Dafny (to appear)
 5. Foley, M., Hoare, C.: Proof of a recursive program: quicksort. Comput. J. 14, 391–395 (1971)
 6. de Gouw, S., Rot, J., de Boer, F.S., Bubel, R., Hähnle, R.: OpenJDK’s Java.utils.Collection.sort() is broken: the good, the bad and the worst case. In: Kroening, D., Păsăreanu, C.S. (eds.) CAV 2015. LNCS, vol. 9206, pp. 273–289. Springer, Heidelberg (2015)
 7. Hoare, C.: Algorithm 64: quicksort. Commun. ACM 4, 321 (1961)
 8. Hoare, C.: An axiomatic basis for computer programming. Commun. ACM 12, 576–580 (1969)
 9. Lamport, L.: Thinking Above the Code. https://www.youtube.com/watch?v=4Yp3j_jk8Q
 10. Leino, K.R.M.: Dafny: an automatic program verifier for functional correctness. In: Clarke, E.M., Voronkov, A. (eds.) LPAR-16 2010. LNCS, vol. 6355, pp. 348–370. Springer, Heidelberg (2010)
 11. Leino, K.: Dafny: An Automatic Program Verifier for Functional Correctness. http://dafny.codeplex.com
 12. Manna, Z.: Mathematical Theory of Computation. McGraw-Hill, New York (1974)
 13. Oracle Documentation: Arrays (Java Platform SE 7). http://docs.oracle.com/javase/7/docs/api/java/util/Arrays.html
 14. The Verification Corner, Microsoft Research. http://research.microsoft.com/enus/projects/verificationcorner
 15. Wikipedia: Quicksort. https://en.wikipedia.org/wiki/Quicksort
 16. YouTube: Quicksort with Hungarian (Küküllőmenti legényes) folk dance. https://www.youtube.com/watch?v=ywWBy6J5gz8