
QuickXsort: A Fast Sorting Scheme in Theory and Practice

Abstract

QuickXsort is a highly efficient in-place sequential sorting scheme that mixes Hoare’s Quicksort algorithm with X, where X can be chosen from a wide range of other known sorting algorithms, like Heapsort, Insertionsort and Mergesort. Its major advantage is that QuickXsort can be in-place even if X is not. In this work we provide general transfer theorems expressing the number of comparisons of QuickXsort in terms of the number of comparisons of X. More specifically, if pivots are chosen as medians of (not too fast) growing size samples, the average number of comparisons of QuickXsort and X differ only by o(n)-terms. For median-of-k pivot selection for some constant k, the difference is a linear term whose coefficient we compute precisely. For instance, median-of-three QuickMergesort uses at most \(n \lg n - 0.8358n + {\mathcal {O}}(\log n)\) comparisons. Furthermore, we examine the possibility of sorting base cases with some other algorithm using even fewer comparisons. By doing so, the average-case number of comparisons can be reduced down to \(n \lg n - 1.4112n + o(n)\), leaving a gap of only 0.0315n comparisons to the known lower bound (while using only \({\mathcal {O}}(\log n)\) additional space and \({\mathcal {O}}(n\log n)\) time overall). Implementations of these sorting strategies show that the algorithms challenge well-established library implementations like Musser’s Introsort.
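To make the scheme concrete, the following is a minimal, self-contained C++ sketch of QuickMergesort (the instance X = Mergesort) distilled from this description; it is our illustration, not the authors’ tuned implementation. In particular, it uses a naive middle-element pivot instead of the median-of-k sampling analyzed in the paper, a three-way partition to cope with duplicate keys, and it always sorts the smaller segment with Mergesort, whereas the paper prefers the largest segment for which the other segment offers enough buffer space.

#include <algorithm>

// Merge the sorted runs [l, m) and [m, e), using at least m - l free slots
// starting at buf as swap buffer. Only swaps are used, so the buffer's own
// elements are permuted but never lost.
template <typename It>
void mergeWithBuffer(It l, It m, It e, It buf) {
    std::swap_ranges(l, m, buf);                 // park the left run in the buffer
    It i = buf, iEnd = buf + (m - l), j = m, o = l;
    while (i != iEnd && j != e)
        std::iter_swap(o++, (*i <= *j) ? i++ : j++);
    while (i != iEnd) std::iter_swap(o++, i++);  // rest of right run is in place
}

// Mergesort [l, e) using floor((e-l)/2) free slots at buf (outside [l, e)).
template <typename It>
void mergesortWithBuffer(It l, It e, It buf) {
    if (e - l <= 1) return;
    It m = l + (e - l) / 2;
    mergesortWithBuffer(l, m, buf);
    mergesortWithBuffer(m, e, buf);
    mergeWithBuffer(l, m, e, buf);
}

// QuickMergesort: partition, Mergesort one segment inside the array using
// the other segment as buffer, iterate on the rest; O(log n) extra words.
template <typename It>
void quickMergesort(It l, It e) {
    while (e - l > 1) {
        auto pv = *(l + (e - l) / 2);  // naive pivot; the paper samples medians
        It p1 = std::partition(l, e, [&](const auto& x) { return x < pv; });
        It p2 = std::partition(p1, e, [&](const auto& x) { return !(pv < x); });
        // [l, p1) < pv, [p1, p2) == pv (already final), [p2, e) > pv
        if (p1 - l <= e - p2) {
            mergesortWithBuffer(l, p1, p2);  // buffer = larger right segment
            l = p2;
        } else {
            mergesortWithBuffer(p2, e, l);   // buffer = larger left segment
            e = p1;
        }
    }
}

Calling quickMergesort(v.begin(), v.end()) on a random-access container sorts it; every merge swaps elements into the unsorted segment instead of an external buffer, which is what makes the combination internal even though plain Mergesort is not.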

Notes

  1. We write \(\lg \) for \(\log _2\) and \(\log \) for the logarithm with an unspecified constant base in the \({\mathcal {O}}\) notation. A term \(\pm {\mathcal {O}}(f(n))\) indicates an error term with unspecified sign; formal details are given in Sect. 3.

  2. Throughout the text, we avoid the (in our context somewhat ambiguous) terms in-place or in-situ. We instead call an algorithm internal if it needs at most \({\mathcal {O}}(\log n)\) words of space (in addition to the array to be sorted). In particular, Quicksort is an internal algorithm whereas standard Mergesort is not (hence called external) since it uses a linear amount of buffer space for merges.

  3. The first two authors elaborate on how to make this approach worst-case efficient with little additional overhead in a recent article [13].

  4. Merging can be done in place using more advanced tricks (see, e.g., [19, 34]), but those tend not to be competitive in terms of running time with other sorting methods. By changing the global structure, a “pure” internal Mergesort variant [29, 42] can be achieved using part of the input as a buffer (as in QuickMergesort) at the expense of occasionally having to merge runs of very different lengths.

  5. We assume here an unmodified standard Mergesort variant that executes all merges in any case. In particular, we assume the following folklore trick is not used: one can check (with one comparison) whether the two runs are already sorted prior to calling the merge routine and skip merging entirely if they are (a small code sketch of this guard follows after these notes). This optimization leads to a linear best case but increases the variance.

  6. We remark that this is no longer true for multiway partitioning methods where the number of comparisons per element is not necessarily the same for all possible outcomes. Similarly, the number of swaps in the standard partitioning method depends not only on the rank of the pivot, but also on how “displaced” the elements in the input are.

  7. It is, indeed, a reasonable option to enforce this assumption in an implementation by an explicit random shuffle of the input before we start sorting. Sedgewick and Wayne, for example, do this for the implementation of Quicksort in their textbook [47]. In the context of QuickXsort, a full random shuffle is overkill, though; see Remark 5.2 for more discussion.

  8. Meanwhile, an implementation has been created and made available at https://github.com/rbroesamle/Implementierung-von-in-place-Mergesort-Algorithmen. Although it is surprisingly fast, the experiments by its creators suggest that it does not beat our approach.

  9. We could also sort base cases of some slower growing size with Z, e.g., \(\varTheta (\log \log n)\). This avoids a constant factor overhead, but still gives a non-negligible additional term in \(\omega (n) \cap o(n \log n)\).

  10. For these experiments we use a different experimental setup: depending on the size of the arrays the displayed numbers are averages over 10–10,000 runs.

  11. Although the statement of our theorem is the same as for [52, Theorem 5.1], our proof here is significantly shorter than the one given there. By first taking the difference \(c(n) - x(n)\), we turn the much more complicated terms \({\mathbb {E}}[A_r x(J_{3-r})]\) from t(n) into the simpler \({\mathbb {E}}[x(J_r)]\) in \(t'(n)\), which allows us to entirely omit [52, Lemma E.1].
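Referring back to Note 5, here is a minimal sketch of that folklore guard (our illustration; mergeRuns stands for an arbitrary merge routine and is hypothetical):

#include <cstddef>
#include <vector>

// Folklore skip-merge guard from Note 5 (illustrative sketch). Both runs
// A[lo..mid] and A[mid+1..hi] are sorted, so a single comparison detects
// whether the whole range is already in order; if so, the merge is skipped.
// This yields a linear best case but increases the variance.
template <typename T, typename Merge>
void mergeUnlessSorted(std::vector<T>& A, std::size_t lo, std::size_t mid,
                       std::size_t hi, Merge mergeRuns) {
    if (A[mid] <= A[mid + 1]) return;  // runs already in order: skip
    mergeRuns(A, lo, mid, hi);         // otherwise merge as usual
}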

References

  1. Blum, M., Floyd, R.W., Pratt, V.R., Rivest, R.L., Tarjan, R.E.: Time bounds for selection. J. Comput. Syst. Sci. 7(4), 448–461 (1973)

  2. Boehm, H.-J., Atkinson, R.R., Plass, M.F.: Ropes: an alternative to strings. Softw. Pract. Exp. 25(12), 1315–1330 (1995). https://doi.org/10.1002/spe.4380251203

  3. Cantone, D., Cincotti, G.: Quickheapsort, an efficient mix of classical sorting algorithms. Theor. Comput. Sci. 285(1), 25–42 (2002). https://doi.org/10.1016/S0304-3975(01)00288-2

  4. Diekert, V., Weiß, A.: QuickHeapsort: modifications and improved analysis. Theory Comput. Syst. 59(2), 209–230 (2016). https://doi.org/10.1007/s00224-015-9656-y

  5. Doberkat, E.E.: An average case analysis of Floyd’s algorithm to construct heaps. Inf. Control 61(2), 114–131 (1984). https://doi.org/10.1016/S0019-9958(84)80053-4

  6. Dutton, R.D.: Weak-heap sort. BIT 33(3), 372–381 (1993)

  7. Edelkamp, S., Stiegeler, P.: Implementing HEAPSORT with \(n \log n - 0.9n\) and QUICKSORT with \(n \log n + 0.2 n\) comparisons. ACM J. Exp. Algorithm. 10(5) (2002). https://doi.org/10.1145/944618.944623

  8. Edelkamp, S., Wegener, I.: On the performance of Weak-Heapsort. In: Symposium on Theoretical Aspects of Computer Science (STACS) 2000, vol. 1770, pp. 254–266. Springer (2000). https://doi.org/10.1007/3-540-46541-3_21

  9. Edelkamp, S., Weiß, A.: QuickXsort: efficient sorting with \(n \log n - 1.399n + o(n)\) comparisons on average (2013). arXiv:1307.3033

  10. Edelkamp, S., Weiß, A.: QuickXsort: efficient sorting with \(n \log n - 1.399 n + o(n)\) comparisons on average. In: International Computer Science Symposium in Russia (CSR) 2014, pp. 139–152. Springer, Berlin (2014). https://doi.org/10.1007/978-3-319-06686-8_11

  11. Edelkamp, S., Weiß, A.: BlockQuicksort: avoiding branch mispredictions in Quicksort. In: Sankowski, P., Zaroliagis, C.D. (eds.) European Symposium on Algorithms (ESA) 2016, LIPIcs, vol. 57, pp. 38:1–38:16. Schloss Dagstuhl–Leibniz-Zentrum für Informatik (2016). https://doi.org/10.4230/LIPIcs.ESA.2016.38

  12. Edelkamp, S., Weiß, A.: QuickMergesort: practically efficient constant-factor optimal sorting (2018). arXiv:1804.10062

  13. Edelkamp, S., Weiß, A.: Worst-case efficient sorting with QuickMergesort. In: Proceedings of the Twenty-First Workshop on Algorithm Engineering and Experiments (ALENEX) 2019, San Diego, CA, USA, January 7–8, pp. 1–14 (2019). https://doi.org/10.1137/1.9781611975499.1

  14. Edelkamp, S., Weiß, A., Wild, S.: Quickxsort—a fast sorting scheme in theory and practice (2018). arXiv:1811.01259

  15. Elmasry, A., Katajainen, J., Stenmark, M.: Branch mispredictions don’t affect mergesort. In: International Symposium on Experimental Algorithms (SEA) 2012, pp. 160–171 (2012). https://doi.org/10.1007/978-3-642-30850-5_15

  16. Flajolet, P., Golin, M.: Mellin transforms and asymptotics. Acta Inform. 31(7), 673–696 (1994). https://doi.org/10.1007/BF01177551

  17. Ford, L.R., Jr., Johnson, S.M.: A tournament problem. Am. Math. Mon. 66(5), 387–389 (1959). http://www.jstor.org/stable/2308750

  18. Ford, L.R., Johnson, S.M.: A tournament problem. Am. Math. Mon. 66(5), 387–389 (1959). https://doi.org/10.2307/2308750

  19. Geffert, V., Katajainen, J., Pasanen, T.: Asymptotically efficient in-place merging. Theor. Comput. Sci. 237(1–2), 159–181 (2000). https://doi.org/10.1016/S0304-3975(98)00162-5

  20. Golin, M.J., Sedgewick, R.: Queue-mergesort. Inf. Process. Lett. 48(5), 253–259 (1993). https://doi.org/10.1016/0020-0190(93)90088-q

  21. Gonnet, G.H., Munro, J.I.: Heaps on heaps. SIAM J. Comput. 15(4), 964–971 (1986). https://doi.org/10.1137/0215068

  22. Graham, R.L., Knuth, D.E., Patashnik, O.: Concrete Mathematics: A Foundation for Computer Science, 2nd edn. Addison-Wesley (1994)

  23. Hennequin, P.: Combinatorial analysis of quicksort algorithm. RAIRO—Theoretical Informatics and Applications—Informatique Théorique et Applications 23(3), 317–333 (1989). http://eudml.org/doc/92337

  24. Hoare, C.A.R.: Algorithm 65: find. Commun. ACM 4(7), 321–322 (1961). https://doi.org/10.1145/366622.366647

  25. Hwang, H.-K.: Limit theorems for mergesort. Random Struct. Algorithms 8(4), 319–336 (1996). https://doi.org/10.1002/(sici)1098-2418(199607)8:4<319::aid-rsa3>3.0.co;2-0

  26. Hwang, H.-K.: Asymptotic expansions of the mergesort recurrences. Acta Inf. 35(11), 911–919 (1998). https://doi.org/10.1007/s002360050147

  27. Iwama, K., Teruyama, J.: Improved average complexity for comparison-based sorting. In: Ellen, F., Kolokolova, A., Sack, J. (eds.) Workshop on Algorithms and Data Structures (WADS) 2017, Lecture Notes in Computer Science, vol. 10389, pp. 485–496. Springer (2017). https://doi.org/10.1007/978-3-319-62127-2_41

  28. Katajainen, J.: The ultimate heapsort. In: Computing: The 4th Australasian Theory Symposium (CATS), Australian Computer Science Communications, pp. 87–96. Springer-Verlag Singapore (1998). http://www.diku.dk/~jyrki/Myris/Kat1998C.html

  29. Katajainen, J., Pasanen, T., Teuhola, J.: Practical in-place mergesort. Nordic J. Comput. 3(1), 27–40 (1996). http://www.diku.dk/~jyrki/Myris/KPT1996J.html

  30. Kim, P.-S., Kutzner, A.: Ratio based stable in-place merging. In: Agrawal, M., Du, D.-Z., Duan, Z., Li, A. (eds.) Theory and Applications of Models of Computation (TAMC) 2008, Lecture Notes in Computer Science, vol. 4978, pp. 246–257. Springer (2008). https://doi.org/10.1007/978-3-540-79228-4_22

  31. Knuth, D.E.: The Art Of Computer Programming: Searching and Sorting, 2nd edn. Addison Wesley, Boston (1998)

  32. Knuth, D.E.: Selected Papers on Analysis of Algorithms, volume 102 of CSLI Lecture Notes. Center for the Study of Language and Information Publications (2000)

  33. Mahmoud, H.M.: Sorting: A Distribution Theory. Wiley, New York (2000)

  34. Mannila, H., Ukkonen, E.: A simple linear-time algorithm for in situ merging. Inf. Process. Lett. 18(4), 203–208 (1984). https://doi.org/10.1016/0020-0190(84)90112-1

  35. Martínez, C., Roura, S.: Optimal sampling strategies in Quicksort and Quickselect. SIAM J. Comput. 31(3), 683–705 (2001). https://doi.org/10.1137/S0097539700382108

  36. McDiarmid, C.J.H.: Concentration. In: Habib, M., McDiarmid, C., Ramirez-Alfonsin, J., Reed, B. (eds.) Probabilistic Methods for Algorithmic Discrete Mathematics, pp. 195–248. Springer, Berlin (1998)

  37. McDiarmid, C.J.H., Reed, B.A.: Building heaps fast. J. Algorithms 10, 352–365 (1989)

  38. McFadden, M.: WikiSort. GitHub repository. https://github.com/BonzaiThePenguin/WikiSort

  39. Musser, D.R.: Introspective sorting and selection algorithms. Softw. Pract. Exp. 27(8), 983–993 (1997)

  40. NIST Digital Library of Mathematical Functions. Release 1.0.10 of 2015-08-07. http://dlmf.nist.gov

  41. Panny, W., Prodinger, H.: Bottom-up mergesort—a detailed analysis. Algorithmica 14(4), 340–354 (1995). https://doi.org/10.1007/BF01294131

  42. Reinhardt, K.: Sorting in-place with a worst case complexity of \(n \log n - 1.3n+O(\log n)\) comparisons and \(\epsilon n \log n+O(1)\) transports. In: International Symposium on Algorithms and Computation (ISAAC), pp. 489–498 (1992). https://doi.org/10.1007/3-540-56279-6_101

  43. Roura, S.: Divide-and-Conquer Algorithms and Data Structures. Ph.D. thesis, Universitat Politècnica de Catalunya (1997)

  44. Roura, S.: Improved master theorems for divide-and-conquer recurrences. J. ACM 48(2), 170–205 (2001). https://doi.org/10.1145/375827.375837

  45. Sedgewick, R.: The analysis of Quicksort programs. Acta Inf. 7(4), 327–355 (1977). https://doi.org/10.1007/BF00289467

  46. Sedgewick, R., Flajolet, P.: An Introduction to the Analysis of Algorithms, 2nd edn. Addison-Wesley-Longman, Boston (2013)

  47. Sedgewick, R., Wayne, K.: Algorithms, 4th edn. Addison-Wesley, Boston (2011)

  48. Sohrab, H.H.: Basic Real Analysis, 2nd edn. Springer Birkhäuser, Berlin (2014)

  49. Stober, F., Weiß, A.: On the average case of MergeInsertion. In: International Workshop on Combinatorial Algorithms (IWOCA) 2019 (2019). arXiv:1905.09656

  50. Wegener, I.: Bottom-up-Heapsort, a new variant of Heapsort beating, on an average, Quicksort (if \(n\) is not very small). Theor. Comput. Sci. 118(1), 81–98 (1993)

  51. Wild, S.: Dual-pivot quicksort and beyond: analysis of multiway partitioning and its practical potential. Ph.D. thesis, Technische Universität Kaiserslautern (2016). ISBN 978-3-00-054669-3. http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:hbz:386-kluedo-44682

  52. Wild, S.: Average cost of QuickXsort with pivot sampling. In: Fill, J.A., Ward, M.D. (eds.) International Conference on Probabilistic, Combinatorial and Asymptotic Methods for the Analysis of Algorithms (AofA 2018), LIPIcs (2018). https://doi.org/10.4230/LIPIcs.AofA.2018.36

  53. Wild, S.: Quicksort is optimal for many equal keys. In: Workshop on Analytic Algorithmics and Combinatorics (ANALCO) 2018, pp. 8–22. SIAM (2018). arXiv:1608.04906, https://doi.org/10.1137/1.9781611975062.2

  54. Wild, S.: Supplementary Mathematica notebook for variance computation (2018). https://doi.org/10.5281/zenodo.1463020

Acknowledgements

We thank our anonymous reviewers for their thoughtful comments, which significantly helped to improve the presentation.

Funding

The second author was supported by the German Research Foundation (DFG), grant DI 435/7-1. The last author was supported by the Natural Sciences and Engineering Research Council of Canada and the Canada Research Chairs Programme.

Author information

Correspondence to Sebastian Wild.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Parts of this article have been presented (in preliminary form) at the International Computer Science Symposium in Russia (CSR) 2014 [10] and at the International Conference on Probabilistic, Combinatorial and Asymptotic Methods for the Analysis of Algorithms (AofA) 2018 [52].

Appendices

Appendix

A Notation

1.1 Generic Mathematics

\({\mathbb {N}}\), \({\mathbb {N}}_0\), \({\mathbb {Z}}\), \({\mathbb {R}}\):

Natural numbers \({\mathbb {N}}= \{1,2,3,\ldots \}\), \({\mathbb {N}}_0 = {\mathbb {N}}\cup \{0\}\), integers \({\mathbb {Z}}= \{\ldots ,-2,-1,0,1,2,\ldots \}\), real numbers \({\mathbb {R}}\).

\({\mathbb {R}}_{>1}\), \({\mathbb {N}}_{\ge 3}\) etc.:

Restricted sets \(X_\text {pred} = \{x\in X : x \text { fulfills } \text {pred} \}\).

\(\ln (n)\), \(\lg (n)\), \(\log n\):

Natural and binary logarithm; \(\ln (n) = \log _e(n)\), \(\lg (n) = \log _2(n)\). We use \(\log \) for an unspecified (constant) base in \({\mathcal {O}}\)-terms.

X :

To emphasize that X is a random variable, it is capitalized.

\([a,b)\), \((a,b]\):

Real intervals; endpoints next to round parentheses are excluded, those next to square brackets are included.

[m..n], [n]:

Integer intervals, \([m..n] = \{m,m+1,\ldots ,n\}\); \([n] = [1..n]\).

\(\llbracket \text {stmt}\rrbracket \), \(\llbracket x=y\rrbracket \):

Iverson bracket, \(\llbracket \text {stmt}\rrbracket = 1\) if stmt is true, \(\llbracket \text {stmt}\rrbracket = 0\) otherwise.

\(H_{n}\) :

nth harmonic number; \(H_{n} = \sum _{i=1}^n 1/i\).

\(x \pm y\) :

x with absolute error |y|; formally the interval \(x \pm y = [x-|y|,x+|y|]\); as with \({\mathcal {O}}\)-terms, we use one-way equalities \(z=x\pm y\) instead of \(z \in x \pm y\).

\(a^{{\underline{b}}}\), \(a^{{{\overline{b}}}}\):

Factorial powers; “a to the b falling resp. rising”; e.g., \(x^{{\underline{3}}} = x(x-1)(x-2)\), \(x^{\underline{-3}} = 1/((x+1)(x+2)(x+3))\).

\(\left( {\begin{array}{c}n\\ k\end{array}}\right) \) :

Binomial coefficients; \(\left( {\begin{array}{c}n\\ k\end{array}}\right) = n^{{\underline{k}}} \big / k!\).

\(\text {B}(\lambda ,\rho )\) :

For \(\lambda ,\rho \in {\mathbb {R}}_+\); the beta function, \(\text {B}(\lambda ,\rho ) = \int _0^1 z^{\lambda -1}(1-z)^{\rho -1}\, dz\); see also Eq. (9).

\(I_{x,y}(\lambda ,\rho )\) :

The regularized incomplete beta function; \(I_{x,y}(\lambda ,\rho )= \int _x^y \frac{z^{\lambda -1}(1-z)^{\rho -1}}{\text {B}(\lambda ,\rho )} dz\) for \(\lambda ,\rho \in {\mathbb {R}}_+\), \(0\le x\le y\le 1\).

1.2 Stochastics-Related Notation

\({\mathbb {P}}[E]\), \({\mathbb {P}}[X=x]\):

Probability of an event E resp. probability for random variable X to attain value x.

\({\mathbb {E}}[X]\) :

Expected value of X; we write \({\mathbb {E}}[X \mid Y]\) for the conditional expectation of X given Y, and \({\mathbb {E}}_X[f(X)]\) to emphasize that expectation is taken w.r.t. random variable X.

\(X {\mathop {=}\limits ^{{\mathcal {D}}}}Y\) :

Equality in distribution; X and Y have the same distribution.

\({\mathcal {U}}(a,b)\) :

Random variable uniformly distributed in \((a,b)\subset {\mathbb {R}}\).

\(\text {Beta}(\lambda ,\rho )\) :

Beta distributed random variable with shape parameters \(\lambda \in {\mathbb {R}}_{>0}\) and \(\rho \in {\mathbb {R}}_{>0}\).

\(\text {Bin}(n,p)\) :

Binomial distributed random variable with \(n\in {\mathbb {N}}_0\) trials and success probability \(p\in [0,1]\).

\(\text {BetaBin}(n,\lambda ,\rho )\) :

Beta-binomial distributed random variable; \(n\in {\mathbb {N}}_0\), \(\lambda ,\rho \in {\mathbb {R}}_{>0}\); \(X {\mathop {=}\limits ^{{\mathcal {D}}}}\text {BetaBin}(n,\lambda ,\rho )\) means \(X {\mathop {=}\limits ^{{\mathcal {D}}}}\text {Bin}(n,P)\) for \(P {\mathop {=}\limits ^{{\mathcal {D}}}}\text {Beta}(\lambda ,\rho )\).

1.3 Specific Notation for Algorithms and Analysis

n :

Length of the input array, i.e., the input size.

k, t:

Sample size \(k\in {\mathbb {N}}_{\ge 1}\), odd; \(k=2t+1\), \(t\in {\mathbb {N}}_0\); we write k(n) to emphasize that k might depend on n.

w :

Threshold for recursion: for \(n\le w\), we sort inputs by X; we require \(w\ge k-1\).

\(\alpha \) :

\(\alpha \in [0,1]\); method X may use buffer space for \(\lfloor \alpha n\rfloor \) elements.

c(n):

Expected costs of QuickXsort; see Sect. 9.2.

x(n), a, b:

Expected costs of X, \(x(n) = a n \lg n + b n \pm o(n)\); see Sect. 9.2.

\(J_1\), \(J_2\):

(Random) Subproblem sizes; \(J_1+J_2 = n-1\); \(J_1 = t + I_1\).

\(I_1\), \(I_2\):

(Random) Segment sizes in partitioning; \(I_1 {\mathop {=}\limits ^{{\mathcal {D}}}}\text {BetaBin}(n-k,\,t+1,\,t+1)\); \(I_2 = n-k-I_1\); \(J_1 = t + I_1\).

R :

(One-based) Rank of the pivot; \(R=J_1+1\).

s(k):

(Expected) cost for pivot sampling, i.e., cost for choosing median of k elements.

\(A_1\), \(A_2\), A:

Indicator random variables; \(A_1 = \llbracket \text {left subproblem sorted recursively}\rrbracket \); see Sect. 9.2.

B Mathematical Preliminaries

In this appendix, we restate some known results for the reader’s convenience.

1.1 Hölder Continuity

A function \(f:I\rightarrow {\mathbb {R}}\) defined on a bounded interval I is Hölder-continuous with exponent \(\eta \in (0,1]\) if

$$\begin{aligned} \exists C\; \forall x,y\in I\;: \bigl | f(x) - f(y) \bigr | \;\le C |x-y|^\eta . \end{aligned}$$

Hölder-continuity is a notion of smoothness that is stricter than (uniform) continuity but slightly more liberal than Lipschitz-continuity (which corresponds to \(\eta =1\)). \(f:[0,1]\rightarrow {\mathbb {R}}\) with \(f(z) = z \ln (1/z)\) is a stereotypical function that is Hölder-continuous (for any \(\eta \in (0,1)\)) but not Lipschitz (see Lemma 9.1 below).
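To verify both claims for this f (a short computation added for convenience): the derivative \(f'(z) = \ln (1/z) - 1\) is unbounded as \(z\rightarrow 0^+\), so no global Lipschitz constant can exist. On the other hand, for any \(\eta \in (0,1)\) and \(x\in (0,1]\),

$$\begin{aligned} |f(x) - f(0)| \;=\; x\ln (1/x) \;=\; x^{\eta } \cdot x^{1-\eta }\ln (1/x) \;\le \; C_\eta \, x^{\eta }, \end{aligned}$$

since \(x^{1-\eta }\ln (1/x)\) is continuous on (0, 1] and tends to 0 as \(x\rightarrow 0\), hence bounded; pairs with \(y > 0\) are handled similarly.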

One useful consequence of Hölder-continuity is given by the following lemma: an error bound on the difference between an integral and the Riemann sum ([51, Proposition 2.12–(b)]).

Lemma B.1

(Hölder integral bound) Let \(f:[0,1] \rightarrow {\mathbb {R}}\) be Hölder-continuous with exponent \(\eta \). Then

$$\begin{aligned} \int _{0}^1 f(x) \, dx&\;\;= \frac{1}{n} \sum _{i = 0}^{n-1} f(i / n) \;\;\pm {\mathcal {O}}(n^{-\eta }), \qquad (n\rightarrow \infty ). \end{aligned}$$

\(\square \)

Remark B.2

(Properties of Hölder-continuity) We considered only the unit interval as the domain of functions, but this is no restriction: Hölder-continuity (on bounded domains) is preserved by addition, subtraction, multiplication and composition (see, e.g., [48, Section 4.6] for details). Since any linear function is Lipschitz, composing with the linear bijection from [0, 1] onto [a, b] shows that the result above also holds for Hölder-continuous functions \(f:[a,b] \rightarrow {\mathbb {R}}\).

If our functions are defined on a bounded domain, Lipschitz-continuity implies Hölder-continuity and Hölder-continuity with exponent \(\eta \) implies Hölder-continuity with exponent \(\eta ' < \eta \). A real-valued, differentiable function is Lipschitz if its derivative is bounded.

1.2 Chernoff Bound

We write \(X {\mathop {=}\limits ^{{\mathcal {D}}}}\text {Bin}(n,p)\) if X has a binomial distribution with \(n\in {\mathbb {N}}_0\) trials and success probability \(p\in [0,1]\). Since X is a sum of independent random variables with bounded influence on the result, Chernoff bounds imply strong concentration results for X. We will only need a very basic variant, given in the following lemma.

Lemma B.3

(Chernoff Bound, Theorem 2.1 of [36]) Let \(X {\mathop {=}\limits ^{{\mathcal {D}}}}\text {Bin}(n,p)\) and \(\delta \ge 0\). Then

$$\begin{aligned} {\mathbb {P}}\biggl [\,\biggl |\frac{X}{n} - p\biggr | \ge \delta \biggr ] \;\;\le \;\;2\exp \bigl (-2\delta ^2 n\bigr ). \end{aligned}$$
(26)

\(\square \)
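As a concrete illustration (example numbers added here): for \(n = 10\,000\) and \(\delta = 0.05\), the bound (26) gives \({\mathbb {P}}\bigl [|X/n - p| \ge 0.05\bigr ] \le 2e^{-50} < 10^{-21}\); it is this exponential concentration that makes medians of growing-size samples such reliable pivots.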

1.3 Continuous Master Theorem

For solving recurrences, we build upon Roura’s master theorems [44]. The relevant continuous master theorem is restated here for convenience:

Theorem B.4

(Roura’s Continuous Master Theorem (CMT)) Let \(F_n\) be recursively defined by

$$\begin{aligned} F_n \;\;= {\left\{ \begin{array}{ll} b_n\,, &{}\hbox { for } 0 \le n < N ; \\ t_n \,+ \smash {\sum _{j=0}^{n-1} w_{n,j} \, F_j}, &{}\hbox { for }n \ge N\,, \end{array}\right. } \end{aligned}$$
(27)

where \(t_n\), the toll function, satisfies \(t_n \sim K n^\sigma \log ^\tau (n)\) as \(n\rightarrow \infty \) for constants \(K\ne 0\), \(\sigma \ge 0\) and \(\tau > -1\). Assume there exists a function \(w:[0,1]\rightarrow {\mathbb {R}}_{\ge 0}\), the shape function, with \(\int _0^1 w(z) dz \ge 1 \) and

$$\begin{aligned} \sum _{j=0}^{n-1} \,\biggl | w_{n,j} \,- \! \int _{j/n}^{(j+1)/n} w(z) \, dz \biggr | \;\;= {\mathcal {O}}(n^{-d}), \qquad (n\rightarrow \infty ), \end{aligned}$$
(28)

for a constant \(d>0\). With \(\displaystyle H {:}{=}1 - \int _0^1 \!z^\sigma w(z) \, dz\), we have the following cases:

  1. If \(H > 0\), then \(\displaystyle F_n \sim \frac{t_n}{H}\).

  2. If \(H = 0\), then \(\displaystyle F_n \sim \frac{t_n \ln n}{{{\widetilde{H}}}}\) with \(\displaystyle {{\widetilde{H}}} = -(\tau +1)\int _0^1 \!z^\sigma \ln (z) \, w(z) \, dz\).

  3. If \(H < 0\), then \(F_n = {\mathcal {O}}(n^c)\) for the unique \(c\in {\mathbb {R}}\) with \(\displaystyle \int _0^1 \!z^c w(z) \, dz = 1\).

\(\square \)

Theorem B.4 is the “reduced form” of the CMT, which appears as Theorem 1.3.2 in Roura’s doctoral thesis [43], and as Theorem 18 of [35]. The full version (Theorem 3.3 in [44]) allows us to handle sublogarithmic factors in the toll function, as well, which we do not need here.
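To illustrate how the CMT is applied (a standard example added for convenience): plain Quicksort with a uniform random pivot satisfies recurrence (27) with weights \(w_{n,j} = 2/n\) and toll \(t_n = n - 1\), i.e., \(\sigma = 1\), \(\tau = 0\), and shape function \(w(z) = 2\). Then

$$\begin{aligned} H \;=\; 1 - \int _0^1 2z \, dz \;=\; 0, \qquad {{\widetilde{H}}} \;=\; -\int _0^1 2 z \ln (z) \, dz \;=\; \frac{1}{2}, \end{aligned}$$

so Case 2 applies and \(F_n \sim t_n \ln (n) \big / {{\widetilde{H}}} = 2 n\ln n \approx 1.386\, n\lg n\), the classic Quicksort average.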

C Pseudocode

In this appendix, we list explicit pseudocode for the most important merging procedures.

1.1 Simple Merge by Swaps

Algorithm C.1 is a merging procedure with an in-place interface: upon termination, the merge result is found in the same area previously occupied by the merged runs. The method works by moving the left run into a buffer area first. More specifically, we move \(A[\ell ..m-1]\) into the buffer area \(B[b..b+n_1-1]\) and then merge it with the second run A[m..r]—still residing in the original array—into the empty slot left by the first run. By the time this first half is filled, we either have consumed enough of the second run to have space to grow the merged result, or the merging was trivial, i.e., all elements in the first run were smaller.

[Algorithm C.1: simple merge by swaps (pseudocode figure not reproduced)]
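The following C++ sketch reconstructs the procedure from the prose above (the interface is as described for Algorithm C.1; the code itself is ours and only illustrative). B may point into the same array as A as long as the two regions are disjoint.

#include <cstddef>
#include <utility>

// Merge sorted runs A[l..m-1] and A[m..r] (inclusive bounds) into A[l..r],
// using the buffer area B[b..b+n1-1] with n1 = m - l. Only swaps are used,
// so the buffer's elements are permuted but never lost.
template <typename T>
void simpleMergeBySwaps(T* A, std::size_t l, std::size_t m, std::size_t r,
                        T* B, std::size_t b) {
    std::size_t n1 = m - l;
    for (std::size_t t = 0; t < n1; ++t)    // move the left run into the buffer
        std::swap(A[l + t], B[b + t]);
    std::size_t i = b, iEnd = b + n1;       // reads the left run (now in B)
    std::size_t j = m;                      // reads the right run (still in A)
    std::size_t o = l;                      // writes into the freed slot; as
    while (i < iEnd && j <= r) {            // argued above, o never overtakes j
        if (B[i] <= A[j]) std::swap(A[o++], B[i++]);
        else              std::swap(A[o++], A[j++]);
    }
    while (i < iEnd) std::swap(A[o++], B[i++]);
    // any remaining elements of the right run are already in final position
}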

This method mainly serves to demonstrate how few changes are required to use Mergesort within QuickXsort. It is not the best solution, though.

1.2 Ping-Pong Mergesort

In this section, we give detailed code for the “ping-pong” Mergesort variant that we use in our QuickMergesort implementation. It also uses \(\alpha =\frac{1}{2}\), i.e., buffer space to hold half of the input. By using a special procedure, MergesortOuter, for the outermost call following the strategy illustrated in Fig. 3, we reduce the task to the case of \(\alpha =1\) for two subproblems. These are solved by moving elements from the input area to the output area (while sorting them), which is easily achieved by a recursive procedure (MergesortInner).

[MergesortOuter and MergesortInner (pseudocode figure not reproduced)]
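As an illustration of the alternation, here is a textbook ping-pong skeleton (our sketch, not the authors’ MergesortOuter/MergesortInner: for simplicity it uses a full-size work array, whereas MergesortOuter gets by with \(\alpha = \frac{1}{2}\) following Fig. 3; splitMerge plays roughly the role of MergesortInner, moving elements from one area into the other at every merge).

#include <algorithm>
#include <cstddef>
#include <vector>

// Sort the range [lo, hi) of `from` and leave the result in the same range
// of `to`; on entry both arrays must hold the same elements there (hence
// the initial copy below). The roles of the two arrays alternate at each
// recursion level, so every merge moves elements from one array into the other.
template <typename T>
void splitMerge(std::vector<T>& from, std::vector<T>& to,
                std::size_t lo, std::size_t hi) {
    if (hi - lo <= 1) { if (hi > lo) to[lo] = from[lo]; return; }
    std::size_t mid = lo + (hi - lo) / 2;
    splitMerge(to, from, lo, mid);   // sorted halves land back in `from`
    splitMerge(to, from, mid, hi);
    std::merge(from.begin() + lo, from.begin() + mid,
               from.begin() + mid, from.begin() + hi,
               to.begin() + lo);     // the merge moves them over into `to`
}

template <typename T>
void pingPongMergesort(std::vector<T>& a) {
    std::vector<T> b(a);             // work array starts as a copy of the input
    splitMerge(b, a, 0, a.size());   // the sorted result ends up in `a`
}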

1.3 Reinhardt’s Merge

In this section, we give the code for Reinhardt’s merge [42], which allows a smaller buffer \(\alpha = \frac{1}{4}\). We do not present Reinhardt’s whole Mergesort algorithm based on it but merely the actual merging routine as illustrated in Fig. 4. The organization of the different merge calls is straightforward but requires many tedious index calculations, which do not add much insight for the reader.

[Reinhardt’s merging routine (pseudocode figure not reproduced)]

About this article

Cite this article

Edelkamp, S., Weiß, A. & Wild, S. QuickXsort: A Fast Sorting Scheme in Theory and Practice. Algorithmica 82, 509–588 (2020). https://doi.org/10.1007/s00453-019-00634-0
