
QuickXsort: A Fast Sorting Scheme in Theory and Practice

Abstract

QuickXsort is a highly efficient in-place sequential sorting scheme that mixes Hoare’s Quicksort algorithm with X, where X can be chosen from a wide range of other known sorting algorithms, like Heapsort, Insertionsort and Mergesort. Its major advantage is that QuickXsort can be in-place even if X is not. In this work we provide general transfer theorems expressing the number of comparisons of QuickXsort in terms of the number of comparisons of X. More specifically, if pivots are chosen as medians of (not too fast) growing size samples, the average number of comparisons of QuickXsort and X differ only by o(n)-terms. For median-of-k pivot selection for some constant k, the difference is a linear term whose coefficient we compute precisely. For instance, median-of-three QuickMergesort uses at most \(n \lg n - 0.8358n + {\mathcal {O}}(\log n)\) comparisons. Furthermore, we examine the possibility of sorting base cases with some other algorithm using even fewer comparisons. By doing so, the average-case number of comparisons can be reduced down to \(n \lg n - 1.4112n + o(n)\), leaving a gap of only 0.0315n comparisons to the known lower bound (while using only \({\mathcal {O}}(\log n)\) additional space and \({\mathcal {O}}(n\log n)\) time overall). Implementations of these sorting strategies show that the algorithms challenge well-established library implementations like Musser’s Introsort.
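To make the scheme concrete, the following is a minimal, self-contained C++ sketch of QuickMergesort (the instance X = Mergesort) distilled from this description; it is our illustration, not the authors’ tuned implementation. In particular, it uses a naive middle-element pivot instead of the median-of-k sampling analyzed in the paper, a three-way partition to cope with duplicate keys, and it always sorts the smaller segment with Mergesort, whereas the paper prefers the largest segment for which the other segment offers enough buffer space.

#include <algorithm>

// Merge the sorted runs [l, m) and [m, e), using at least m - l free slots
// starting at buf as swap buffer. Only swaps are used, so the buffer's own
// elements are permuted but never lost.
template <typename It>
void mergeWithBuffer(It l, It m, It e, It buf) {
    std::swap_ranges(l, m, buf);                 // park the left run in the buffer
    It i = buf, iEnd = buf + (m - l), j = m, o = l;
    while (i != iEnd && j != e)
        std::iter_swap(o++, (*i <= *j) ? i++ : j++);
    while (i != iEnd) std::iter_swap(o++, i++);  // rest of right run is in place
}

// Mergesort [l, e) using floor((e-l)/2) free slots at buf (outside [l, e)).
template <typename It>
void mergesortWithBuffer(It l, It e, It buf) {
    if (e - l <= 1) return;
    It m = l + (e - l) / 2;
    mergesortWithBuffer(l, m, buf);
    mergesortWithBuffer(m, e, buf);
    mergeWithBuffer(l, m, e, buf);
}

// QuickMergesort: partition, Mergesort one segment inside the array using
// the other segment as buffer, iterate on the rest; O(log n) extra words.
template <typename It>
void quickMergesort(It l, It e) {
    while (e - l > 1) {
        auto pv = *(l + (e - l) / 2);  // naive pivot; the paper samples medians
        It p1 = std::partition(l, e, [&](const auto& x) { return x < pv; });
        It p2 = std::partition(p1, e, [&](const auto& x) { return !(pv < x); });
        // [l, p1) < pv, [p1, p2) == pv (already final), [p2, e) > pv
        if (p1 - l <= e - p2) {
            mergesortWithBuffer(l, p1, p2);  // buffer = larger right segment
            l = p2;
        } else {
            mergesortWithBuffer(p2, e, l);   // buffer = larger left segment
            e = p1;
        }
    }
}

Calling quickMergesort(v.begin(), v.end()) on a random-access container sorts it; every merge swaps elements into the unsorted segment instead of an external buffer, which is what makes the combination internal even though plain Mergesort is not.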

Notes

  1. We write \(\lg \) for \(\log _2\) and \(\log \) for the logarithm with an unspecified constant base in the \({\mathcal {O}}\) notation. A term \(\pm {\mathcal {O}}(f(n))\) indicates an error term with unspecified sign; formal details are given in Sect. 3.

  2. Throughout the text, we avoid the (in our context somewhat ambiguous) terms in-place or in-situ. We instead call an algorithm internal if it needs at most \({\mathcal {O}}(\log n)\) words of space (in addition to the array to be sorted). In particular, Quicksort is an internal algorithm whereas standard Mergesort is not (hence called external) since it uses a linear amount of buffer space for merges.

  3. The first two authors elaborate on how to make this approach worst-case efficient with little additional overhead in a recent article [13].

  4. Merging can be done in place using more advanced tricks (see, e.g., [19, 34]), but those tend not to be competitive in terms of running time with other sorting methods. By changing the global structure, a “pure” internal Mergesort variant [29, 42] can be achieved using part of the input as a buffer (as in QuickMergesort) at the expense of occasionally having to merge runs of very different lengths.

  5. We assume here an unmodified standard Mergesort variant that executes all merges in any case. In particular, we assume the following folklore trick is not used: one can check (with one comparison) whether the two runs are already sorted prior to calling the merge routine and skip merging entirely if they are (a small code sketch of this guard follows after these notes). This optimization leads to a linear best case but increases the variance.

  6. We remark that this is no longer true for multiway partitioning methods where the number of comparisons per element is not necessarily the same for all possible outcomes. Similarly, the number of swaps in the standard partitioning method depends not only on the rank of the pivot, but also on how “displaced” the elements in the input are.

  7. It is, indeed, a reasonable option to enforce this assumption in an implementation by an explicit random shuffle of the input before we start sorting. Sedgewick and Wayne, for example, do this for the implementation of Quicksort in their textbook [47]. In the context of QuickXsort, a full random shuffle is overkill, though; see Remark 5.2 for more discussion.

  8. Meanwhile, an implementation has been created and made available at https://github.com/rbroesamle/Implementierung-von-in-place-Mergesort-Algorithmen. Although it is surprisingly fast, the experiments by its creators suggest that it does not beat our approach.

  9. We could also sort base cases of some slower growing size with Z, e.g., \(\varTheta (\log \log n)\). This avoids a constant factor overhead, but still gives a non-negligible additional term in \(\omega (n) \cap o(n \log n)\).

  10. For these experiments we use a different experimental setup: depending on the size of the arrays the displayed numbers are averages over 10–10,000 runs.

  11. Although the statement of our theorem is the same as for [52, Theorem 5.1], our proof here is significantly shorter than the one given there. By first taking the difference \(c(n) - x(n)\), we turn the much more complicated terms \({\mathbb {E}}[A_r x(J_{3-r})]\) from t(n) into the simpler \({\mathbb {E}}[x(J_r)]\) in \(t'(n)\), which allows us to entirely omit [52, Lemma E.1].
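Referring back to Note 5, here is a minimal sketch of that folklore guard (our illustration; mergeRuns stands for an arbitrary merge routine and is hypothetical):

#include <cstddef>
#include <vector>

// Folklore skip-merge guard from Note 5 (illustrative sketch). Both runs
// A[lo..mid] and A[mid+1..hi] are sorted, so a single comparison detects
// whether the whole range is already in order; if so, the merge is skipped.
// This yields a linear best case but increases the variance.
template <typename T, typename Merge>
void mergeUnlessSorted(std::vector<T>& A, std::size_t lo, std::size_t mid,
                       std::size_t hi, Merge mergeRuns) {
    if (A[mid] <= A[mid + 1]) return;  // runs already in order: skip
    mergeRuns(A, lo, mid, hi);         // otherwise merge as usual
}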

References

  1. Blum, M., Floyd, R.W., Pratt, V.R., Rivest, R.L., Tarjan, R.E.: Time bounds for selection. J. Comput. Syst. Sci. 7(4), 448–461 (1973)

  2. Boehm, H.-J., Atkinson, R.R., Plass, M.F.: Ropes: an alternative to strings. Softw. Pract. Exp. 25(12), 1315–1330 (1995). https://doi.org/10.1002/spe.4380251203

  3. Cantone, D., Cincotti, G.: Quickheapsort, an efficient mix of classical sorting algorithms. Theor. Comput. Sci. 285(1), 25–42 (2002). https://doi.org/10.1016/S0304-3975(01)00288-2

  4. Diekert, V., Weiß, A.: QuickHeapsort: modifications and improved analysis. Theory Comput. Syst. 59(2), 209–230 (2016). https://doi.org/10.1007/s00224-015-9656-y

  5. Doberkat, E.E.: An average case analysis of Floyd’s algorithm to construct heaps. Inf. Control 61(2), 114–131 (1984). https://doi.org/10.1016/S0019-9958(84)80053-4

  6. Dutton, R.D.: Weak-heap sort. BIT 33(3), 372–381 (1993)

  7. Edelkamp, S., Stiegeler, P.: Implementing HEAPSORT with \(n \log n - 0.9n\) and QUICKSORT with \(n \log n + 0.2 n\) comparisons. ACM J. Exp. Algorithm. 10(5) (2002). https://doi.org/10.1145/944618.944623

  8. Edelkamp, S., Wegener, I.: On the performance of Weak-Heapsort. In: Symposium on Theoretical Aspects of Computer Science (STACS) 2000, vol. 1770, pp. 254–266. Springer (2000). https://doi.org/10.1007/3-540-46541-3_21

  9. Edelkamp, S., Weiß, A.: QuickXsort: efficient sorting with \(n \log n - 1.399n + o(n)\) comparisons on average (2013). arXiv:1307.3033

  10. Edelkamp, S., Weiß, A.: QuickXsort: efficient sorting with \(n \log n - 1.399 n + o(n)\) comparisons on average. In: International Computer Science Symposium in Russia (CSR) 2014, pp. 139–152. Springer, Berlin (2014). https://doi.org/10.1007/978-3-319-06686-8_11

  11. Edelkamp, S., Weiß, A.: BlockQuicksort: avoiding branch mispredictions in Quicksort. In: Sankowski, P., Zaroliagis, C.D. (eds.) European Symposium on Algorithms (ESA) 2016, LIPIcs, vol. 57, pp. 38:1–38:16. Schloss Dagstuhl–Leibniz-Zentrum für Informatik (2016). https://doi.org/10.4230/LIPIcs.ESA.2016.38

  12. Edelkamp, S., Weiß, A.: QuickMergesort: practically efficient constant-factor optimal sorting (2018). arXiv:1804.10062

  13. Edelkamp, S., Weiß, A.: Worst-case efficient sorting with QuickMergesort. In: Proceedings of the Twenty-First Workshop on Algorithm Engineering and Experiments (ALENEX) 2019, San Diego, CA, USA, January 7–8, pp. 1–14 (2019). https://doi.org/10.1137/1.9781611975499.1

  14. Edelkamp, S., Weiß, A., Wild, S.: Quickxsort—a fast sorting scheme in theory and practice (2018). arXiv:1811.01259

  15. Elmasry, A., Katajainen, J., Stenmark, M.: Branch mispredictions don’t affect mergesort. In: International Symposium on Experimental Algorithms (SEA) 2012, pp. 160–171 (2012). https://doi.org/10.1007/978-3-642-30850-5_15

  16. Flajolet, P., Golin, M.: Mellin transforms and asymptotics. Acta Inform. 31(7), 673–696 (1994). https://doi.org/10.1007/BF01177551

  17. Ford, L.R., Jr., Johnson, S.M.: A tournament problem. Am. Math. Mon. 66(5), 387–389 (1959). http://www.jstor.org/stable/2308750

  18. Ford, L.R., Johnson, S.M.: A tournament problem. Am. Math. Mon. 66(5), 387–389 (1959). https://doi.org/10.2307/2308750

  19. Geffert, V., Katajainen, J., Pasanen, T.: Asymptotically efficient in-place merging. Theor. Comput. Sci. 237(1–2), 159–181 (2000). https://doi.org/10.1016/S0304-3975(98)00162-5

  20. Golin, M.J., Sedgewick, R.: Queue-mergesort. Inf. Process. Lett. 48(5), 253–259 (1993). https://doi.org/10.1016/0020-0190(93)90088-q

  21. Gonnet, G.H., Munro, J.I.: Heaps on heaps. SIAM J. Comput. 15(4), 964–971 (1986). https://doi.org/10.1137/0215068

  22. Graham, R.L., Knuth, D.E., Patashnik, O.: Concrete Mathematics: A Foundation for Computer Science, 2nd edn. Addison-Wesley (1994)

  23. Hennequin, P.: Combinatorial analysis of quicksort algorithm. RAIRO—Theoretical Informatics and Applications—Informatique Théorique et Applications 23(3), 317–333 (1989). http://eudml.org/doc/92337

  24. Hoare, C.A.R.: Algorithm 65: find. Commun. ACM 4(7), 321–322 (1961). https://doi.org/10.1145/366622.366647

  25. Hwang, H.-K.: Limit theorems for mergesort. Random Struct. Algorithms 8(4), 319–336 (1996). https://doi.org/10.1002/(sici)1098-2418(199607)8:4<319::aid-rsa3>3.0.co;2-0

  26. Hwang, H.-K.: Asymptotic expansions of the mergesort recurrences. Acta Inf. 35(11), 911–919 (1998). https://doi.org/10.1007/s002360050147

  27. Iwama, K., Teruyama, J.: Improved average complexity for comparison-based sorting. In: Ellen, F., Kolokolova, A., Sack, J. (eds.) Workshop on Algorithms and Data Structures (WADS) 2017, Lecture Notes in Computer Science, vol. 10389, pp. 485–496. Springer (2017). https://doi.org/10.1007/978-3-319-62127-2_41

  28. Katajainen, J.: The ultimate heapsort. In: Computing: The 4th Australasian Theory Symposium (CATS), Australian Computer Science Communications, pp. 87–96. Springer-Verlag Singapore (1998). http://www.diku.dk/~jyrki/Myris/Kat1998C.html

  29. Katajainen, J., Pasanen, T., Teuhola, J.: Practical in-place mergesort. Nordic J. Comput. 3(1), 27–40 (1996). http://www.diku.dk/~jyrki/Myris/KPT1996J.html

  30. Kim, P.-S., Kutzner, A.: Ratio based stable in-place merging. In: Agrawal, M., Du, D.-Z., Duan, Z., Li, A. (eds.) Theory and Applications of Models of Computation (TAMC) 2008, Lecture Notes in Computer Science, vol. 4978, pp. 246–257. Springer (2008). https://doi.org/10.1007/978-3-540-79228-4_22

  31. Knuth, D.E.: The Art Of Computer Programming: Searching and Sorting, 2nd edn. Addison Wesley, Boston (1998)

  32. Knuth, D.E.: Selected Papers on Analysis of Algorithms, volume 102 of CSLI Lecture Notes. Center for the Study of Language and Information Publications (2000)

  33. Mahmoud, H.M.: Sorting: A Distribution Theory. Wiley, New York (2000)

  34. Mannila, H., Ukkonen, E.: A simple linear-time algorithm for in situ merging. Inf. Process. Lett. 18(4), 203–208 (1984). https://doi.org/10.1016/0020-0190(84)90112-1

  35. Martínez, C., Roura, S.: Optimal sampling strategies in Quicksort and Quickselect. SIAM J. Comput. 31(3), 683–705 (2001). https://doi.org/10.1137/S0097539700382108

  36. McDiarmid, C.J.H.: Concentration. In: Habib, M., McDiarmid, C., Ramirez-Alfonsin, J., Reed, B. (eds.) Probabilistic Methods for Algorithmic Discrete Mathematics, pp. 195–248. Springer, Berlin (1998)

  37. McDiarmid, C.J.H., Reed, B.A.: Building heaps fast. J. Algorithms 10, 352–365 (1989)

  38. McFadden, M.: WikiSort. GitHub repository. https://github.com/BonzaiThePenguin/WikiSort

  39. Musser, D.R.: Introspective sorting and selection algorithms. Softw. Pract. Exp. 27(8), 983–993 (1997)

  40. NIST Digital Library of Mathematical Functions. Release 1.0.10 of 2015-08-07. http://dlmf.nist.gov

  41. Panny, W., Prodinger, H.: Bottom-up mergesort—a detailed analysis. Algorithmica 14(4), 340–354 (1995). https://doi.org/10.1007/BF01294131

  42. Reinhardt, K.: Sorting in-place with a worst case complexity of \(n \log n - 1.3n+O(\log n)\) comparisons and \(\epsilon n \log n+O(1)\) transports. In: International Symposium on Algorithms and Computation (ISAAC), pp. 489–498 (1992). https://doi.org/10.1007/3-540-56279-6_101

  43. Roura, S.: Divide-and-Conquer Algorithms and Data Structures. Ph.D. thesis, Universitat Politècnica de Catalunya (1997)

  44. Roura, S.: Improved master theorems for divide-and-conquer recurrences. J. ACM 48(2), 170–205 (2001). https://doi.org/10.1145/375827.375837

  45. Sedgewick, R.: The analysis of Quicksort programs. Acta Inf. 7(4), 327–355 (1977). https://doi.org/10.1007/BF00289467

  46. Sedgewick, R., Flajolet, P.: An Introduction to the Analysis of Algorithms, 2nd edn. Addison-Wesley-Longman, Boston (2013)

  47. Sedgewick, R., Wayne, K.: Algorithms, 4th edn. Addison-Wesley, Boston (2011)

  48. Sohrab, H.H.: Basic Real Analysis, 2nd edn. Springer Birkhäuser, Berlin (2014)

  49. Stober, F., Weiß, A.: On the average case of MergeInsertion. In: International Workshop on Combinatorial Algorithms (IWOCA) 2019 (2019). arXiv:1905.09656

  50. Wegener, I.: Bottom-up-Heapsort, a new variant of Heapsort beating, on an average, Quicksort (if \(n\) is not very small). Theor. Comput. Sci. 118(1), 81–98 (1993)

  51. Wild, S.: Dual-pivot quicksort and beyond: analysis of multiway partitioning and its practical potential. Ph.D. thesis, Technische Universität Kaiserslautern (2016). ISBN 978-3-00-054669-3. http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:hbz:386-kluedo-44682

  52. Wild, S.: Average cost of QuickXsort with pivot sampling. In: Fill, J.A., Ward, M.D. (eds.) International Conference on Probabilistic, Combinatorial and Asymptotic Methods for the Analysis of Algorithms (AofA 2018), LIPIcs (2018). https://doi.org/10.4230/LIPIcs.AofA.2018.36

  53. Wild, S.: Quicksort is optimal for many equal keys. In: Workshop on Analytic Algorithmics and Combinatorics (ANALCO) 2018, pp. 8–22. SIAM (2018). arXiv:1608.04906, https://doi.org/10.1137/1.9781611975062.2

  54. Wild, S.: Supplementary Mathematica notebook for variance computation (2018). https://doi.org/10.5281/zenodo.1463020

Acknowledgements

We thank our anonymous reviewers for their thoughtful comments, which significantly helped to improve the presentation.

Funding

The second author was supported by the German Research Foundation (DFG), grant DI 435/7-1. The last author was supported by the Natural Sciences and Engineering Research Council of Canada and the Canada Research Chairs Programme.

Author information

Correspondence to Sebastian Wild.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Parts of this article have been presented (in preliminary form) at the International Computer Science Symposium in Russia (CSR) 2014 [10] and at the International Conference on Probabilistic, Combinatorial and Asymptotic Methods for the Analysis of Algorithms (AofA) 2018 [52].

Appendices

Appendix

A Notation

1.1 Generic Mathematics

\({\mathbb {N}}\), \({\mathbb {N}}_0\), \({\mathbb {Z}}\), \({\mathbb {R}}\):

Natural numbers \({\mathbb {N}}= \{1,2,3,\ldots \}\), \({\mathbb {N}}_0 = {\mathbb {N}}\cup \{0\}\), integers \({\mathbb {Z}}= \{\ldots ,-2,-1,0,1,2,\ldots \}\), real numbers \({\mathbb {R}}\).

\({\mathbb {R}}_{>1}\), \({\mathbb {N}}_{\ge 3}\) etc.:

Restricted sets \(X_\text {pred} = \{x\in X : x \text { fulfills } \text {pred} \}\).

\(\ln (n)\), \(\lg (n)\), \(\log n\):

Natural and binary logarithm; \(\ln (n) = \log _e(n)\), \(\lg (n) = \log _2(n)\). We use \(\log \) for an unspecified (constant) base in \({\mathcal {O}}\)-terms.

X :

To emphasize that X is a random variable, it is capitalized.

\([a,b)\), \((a,b]\):

Real intervals; endpoints next to round parentheses are excluded, those next to square brackets are included.

[m..n], [n]:

Integer intervals, \([m..n] = \{m,m+1,\ldots ,n\}\); \([n] = [1..n]\).

\(\llbracket \text {stmt}\rrbracket \), \(\llbracket x=y\rrbracket \):

Iverson bracket, \(\llbracket \text {stmt}\rrbracket = 1\) if stmt is true, \(\llbracket \text {stmt}\rrbracket = 0\) otherwise.

\(H_{n}\) :

nth harmonic number; \(H_{n} = \sum _{i=1}^n 1/i\).

\(x \pm y\) :

x with absolute error |y|; formally the interval \(x \pm y = [x-|y|,x+|y|]\); as with \({\mathcal {O}}\)-terms, we use one-way equalities \(z=x\pm y\) instead of \(z \in x \pm y\).

\(a^{{\underline{b}}}\), \(a^{{{\overline{b}}}}\):

Factorial powers; “a to the b falling resp. rising”; e.g., \(x^{{\underline{3}}} = x(x-1)(x-2)\), \(x^{\underline{-3}} = 1/((x+1)(x+2)(x+3))\).

\(\left( {\begin{array}{c}n\\ k\end{array}}\right) \) :

Binomial coefficients; \(\left( {\begin{array}{c}n\\ k\end{array}}\right) = n^{{\underline{k}}} \big / k!\).

\(\text {B}(\lambda ,\rho )\) :

For \(\lambda ,\rho \in {\mathbb {R}}_+\); the beta function, \(\text {B}(\lambda ,\rho ) = \int _0^1 z^{\lambda -1}(1-z)^{\rho -1}\, dz\); see also Eq. (9).

\(I_{x,y}(\lambda ,\rho )\) :

The regularized incomplete beta function; \(I_{x,y}(\lambda ,\rho )= \int _x^y \frac{z^{\lambda -1}(1-z)^{\rho -1}}{\text {B}(\lambda ,\rho )} dz\) for \(\lambda ,\rho \in {\mathbb {R}}_+\), \(0\le x\le y\le 1\).

1.2 Stochastics-Related Notation

\({\mathbb {P}}[E]\), \({\mathbb {P}}[X=x]\):

Probability of an event E resp. probability for random variable X to attain value x.

\({\mathbb {E}}[X]\) :

Expected value of X; we write \({\mathbb {E}}[X \mid Y]\) for the conditional expectation of X given Y, and \({\mathbb {E}}_X[f(X)]\) to emphasize that expectation is taken w.r.t. random variable X.

\(X {\mathop {=}\limits ^{{\mathcal {D}}}}Y\) :

Equality in distribution; X and Y have the same distribution.

\({\mathcal {U}}(a,b)\) :

Random variable uniformly distributed in \((a,b)\subset {\mathbb {R}}\).

\(\text {Beta}(\lambda ,\rho )\) :

Beta distributed random variable with shape parameters \(\lambda \in {\mathbb {R}}_{>0}\) and \(\rho \in {\mathbb {R}}_{>0}\).

\(\text {Bin}(n,p)\) :

Binomial distributed random variable with \(n\in {\mathbb {N}}_0\) trials and success probability \(p\in [0,1]\).

\(\text {BetaBin}(n,\lambda ,\rho )\) :

Beta-binomial distributed random variable; \(n\in {\mathbb {N}}_0\), \(\lambda ,\rho \in {\mathbb {R}}_{>0}\); \(X {\mathop {=}\limits ^{{\mathcal {D}}}}\text {BetaBin}(n,\lambda ,\rho )\) means \(X {\mathop {=}\limits ^{{\mathcal {D}}}}\text {Bin}(n,P)\) for \(P {\mathop {=}\limits ^{{\mathcal {D}}}}\text {Beta}(\lambda ,\rho )\).

1.3 Specific Notation for Algorithms and Analysis

n :

Length of the input array, i.e., the input size.

k, t:

Sample size \(k\in {\mathbb {N}}_{\ge 1}\), odd; \(k=2t+1\), \(t\in {\mathbb {N}}_0\); we write k(n) to emphasize that k might depend on n.

w :

Threshold for recursion: for \(n\le w\), we sort inputs by X; we require \(w\ge k-1\).

\(\alpha \) :

\(\alpha \in [0,1]\); method X may use buffer space for \(\lfloor \alpha n\rfloor \) elements.

c(n):

Expected costs of QuickXsort; see Sect. 9.2.

x(n), a, b:

Expected costs of X, \(x(n) = a n \lg n + b n \pm o(n)\); see Sect. 9.2.

\(J_1\), \(J_2\):

(Random) Subproblem sizes; \(J_1+J_2 = n-1\); \(J_1 = t + I_1\).

\(I_1\), \(I_2\):

(Random) Segment sizes in partitioning; \(I_1 {\mathop {=}\limits ^{{\mathcal {D}}}}\text {BetaBin}(n-k,\,t+1,\,t+1)\); \(I_2 = n-k-I_1\); \(J_1 = t + I_1\).

R :

(One-based) Rank of the pivot; \(R=J_1+1\).

s(k):

(Expected) cost for pivot sampling, i.e., cost for choosing median of k elements.

\(A_1\), \(A_2\), A:

Indicator random variables; \(A_1 = \llbracket \text {left subproblem sorted recursively}\rrbracket \); see Sect. 9.2.

B Mathematical Preliminaries

In this appendix, we restate some known results for the reader’s convenience.

1.1 Hölder Continuity

A function \(f:I\rightarrow {\mathbb {R}}\) defined on a bounded interval I is Hölder-continuous with exponent \(\eta \in (0,1]\) if

$$\begin{aligned} \exists C\; \forall x,y\in I\;: \bigl | f(x) - f(y) \bigr | \;\le C |x-y|^\eta . \end{aligned}$$

Hölder-continuity is a notion of smoothness that is stricter than (uniform) continuity but slightly more liberal than Lipschitz-continuity (which corresponds to \(\eta =1\)). \(f:[0,1]\rightarrow {\mathbb {R}}\) with \(f(z) = z \ln (1/z)\) is a stereotypical function that is Hölder-continuous (for any \(\eta \in (0,1)\)) but not Lipschitz (see Lemma 9.1 below).
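To verify both claims for this f (a short computation added for convenience): the derivative \(f'(z) = \ln (1/z) - 1\) is unbounded as \(z\rightarrow 0^+\), so no global Lipschitz constant can exist. On the other hand, for any \(\eta \in (0,1)\) and \(x\in (0,1]\),

$$\begin{aligned} |f(x) - f(0)| \;=\; x\ln (1/x) \;=\; x^{\eta } \cdot x^{1-\eta }\ln (1/x) \;\le \; C_\eta \, x^{\eta }, \end{aligned}$$

since \(x^{1-\eta }\ln (1/x)\) is continuous on (0, 1] and tends to 0 as \(x\rightarrow 0\), hence bounded; pairs with \(y > 0\) are handled similarly.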

One useful consequence of Hölder-continuity is given by the following lemma: an error bound on the difference between an integral and the Riemann sum ([51, Proposition 2.12–(b)]).

Lemma B.1

(Hölder integral bound) Let \(f:[0,1] \rightarrow {\mathbb {R}}\) be Hölder-continuous with exponent \(\eta \). Then

$$\begin{aligned} \int _{0}^1 f(x) \, dx&\;\;= \frac{1}{n} \sum _{i = 0}^{n-1} f(i / n) \;\;\pm {\mathcal {O}}(n^{-\eta }), \qquad (n\rightarrow \infty ). \end{aligned}$$

\(\square \)

Remark B.2

(Properties of Hölder-continuity) We considered only the unit interval as the domain of functions, but this is no restriction: Hölder-continuity (on bounded domains) is preserved by addition, subtraction, multiplication and composition (see, e.g., [48, Section 4.6] for details). Since any linear function is Lipschitz, composing with the linear bijection from [0, 1] onto [a, b] shows that the result above also holds for Hölder-continuous functions \(f:[a,b] \rightarrow {\mathbb {R}}\).

If our functions are defined on a bounded domain, Lipschitz-continuity implies Hölder-continuity and Hölder-continuity with exponent \(\eta \) implies Hölder-continuity with exponent \(\eta ' < \eta \). A real-valued, differentiable function is Lipschitz if its derivative is bounded.

1.2 Chernoff Bound

We write \(X {\mathop {=}\limits ^{{\mathcal {D}}}}\text {Bin}(n,p)\) if X has a binomial distribution with \(n\in {\mathbb {N}}_0\) trials and success probability \(p\in [0,1]\). Since X is a sum of independent random variables with bounded influence on the result, Chernoff bounds imply strong concentration results for X. We will only need a very basic variant, given in the following lemma.

Lemma B.3

(Chernoff Bound, Theorem 2.1 of [36]) Let \(X {\mathop {=}\limits ^{{\mathcal {D}}}}\text {Bin}(n,p)\) and \(\delta \ge 0\). Then

$$\begin{aligned} {\mathbb {P}}\biggl [\,\biggl |\frac{X}{n} - p\biggr | \ge \delta \biggr ] \;\;\le \;\;2\exp \bigl (-2\delta ^2 n\bigr ). \end{aligned}$$
(26)

\(\square \)
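As a concrete illustration (example numbers added here): for \(n = 10\,000\) and \(\delta = 0.05\), the bound (26) gives \({\mathbb {P}}\bigl [|X/n - p| \ge 0.05\bigr ] \le 2e^{-50} < 10^{-21}\); it is this exponential concentration that makes medians of growing-size samples such reliable pivots.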

1.3 Continuous Master Theorem

For solving recurrences, we build upon Roura’s master theorems [44]. The relevant continuous master theorem is restated here for convenience:

Theorem B.4

(Roura’s Continuous Master Theorem (CMT)) Let \(F_n\) be recursively defined by

$$\begin{aligned} F_n \;\;= {\left\{ \begin{array}{ll} b_n\,, &{}\hbox { for } 0 \le n < N ; \\ t_n \,+ \smash {\sum _{j=0}^{n-1} w_{n,j} \, F_j}, &{}\hbox { for }n \ge N\,, \end{array}\right. } \end{aligned}$$
(27)

where \(t_n\), the toll function, satisfies \(t_n \sim K n^\sigma \log ^\tau (n)\) as \(n\rightarrow \infty \) for constants \(K\ne 0\), \(\sigma \ge 0\) and \(\tau > -1\). Assume there exists a function \(w:[0,1]\rightarrow {\mathbb {R}}_{\ge 0}\), the shape function, with \(\int _0^1 w(z) dz \ge 1 \) and

$$\begin{aligned} \sum _{j=0}^{n-1} \,\biggl | w_{n,j} \,- \! \int _{j/n}^{(j+1)/n} w(z) \, dz \biggr | \;\;= {\mathcal {O}}(n^{-d}), \qquad (n\rightarrow \infty ), \end{aligned}$$
(28)

for a constant \(d>0\). With \(\displaystyle H {:}{=}1 - \int _0^1 \!z^\sigma w(z) \, dz\), we have the following cases:

  1. If \(H > 0\), then \(\displaystyle F_n \sim \frac{t_n}{H}\).

  2. If \(H = 0\), then \(\displaystyle F_n \sim \frac{t_n \ln n}{{{\widetilde{H}}}}\) with \(\displaystyle {{\widetilde{H}}} = -(\tau +1)\int _0^1 \!z^\sigma \ln (z) \, w(z) \, dz\).

  3. If \(H < 0\), then \(F_n = {\mathcal {O}}(n^c)\) for the unique \(c\in {\mathbb {R}}\) with \(\displaystyle \int _0^1 \!z^c w(z) \, dz = 1\).

\(\square \)

Theorem B.4 is the “reduced form” of the CMT, which appears as Theorem 1.3.2 in Roura’s doctoral thesis [43], and as Theorem 18 of [35]. The full version (Theorem 3.3 in [44]) allows us to handle sublogarithmic factors in the toll function, as well, which we do not need here.
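To illustrate how the CMT is applied (a standard example added for convenience): plain Quicksort with a uniform random pivot satisfies recurrence (27) with weights \(w_{n,j} = 2/n\) and toll \(t_n = n - 1\), i.e., \(\sigma = 1\), \(\tau = 0\), and shape function \(w(z) = 2\). Then

$$\begin{aligned} H \;=\; 1 - \int _0^1 2z \, dz \;=\; 0, \qquad {{\widetilde{H}}} \;=\; -\int _0^1 2 z \ln (z) \, dz \;=\; \frac{1}{2}, \end{aligned}$$

so Case 2 applies and \(F_n \sim t_n \ln (n) \big / {{\widetilde{H}}} = 2 n\ln n \approx 1.386\, n\lg n\), the classic Quicksort average.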

C Pseudocode

In this appendix, we list explicit pseudocode for the most important merging procedures.

1.1 Simple Merge by Swaps

Algorithm C.1 is a merging procedure with an in-place interface: upon termination, the merge result is found in the same area previously occupied by the merged runs. The method works by moving the left run into a buffer area first. More specifically, we move \(A[\ell ..m-1]\) into the buffer area \(B[b..b+n_1-1]\) and then merge it with the second run A[m..r]—still residing in the original array—into the empty slot left by the first run. By the time this first half is filled, we either have consumed enough of the second run to have space to grow the merged result, or the merging was trivial, i.e., all elements in the first run were smaller.

[Algorithm C.1: simple merge by swaps (pseudocode figure not reproduced)]
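The following C++ sketch reconstructs the procedure from the prose above (the interface is as described for Algorithm C.1; the code itself is ours and only illustrative). B may point into the same array as A as long as the two regions are disjoint.

#include <cstddef>
#include <utility>

// Merge sorted runs A[l..m-1] and A[m..r] (inclusive bounds) into A[l..r],
// using the buffer area B[b..b+n1-1] with n1 = m - l. Only swaps are used,
// so the buffer's elements are permuted but never lost.
template <typename T>
void simpleMergeBySwaps(T* A, std::size_t l, std::size_t m, std::size_t r,
                        T* B, std::size_t b) {
    std::size_t n1 = m - l;
    for (std::size_t t = 0; t < n1; ++t)    // move the left run into the buffer
        std::swap(A[l + t], B[b + t]);
    std::size_t i = b, iEnd = b + n1;       // reads the left run (now in B)
    std::size_t j = m;                      // reads the right run (still in A)
    std::size_t o = l;                      // writes into the freed slot; as
    while (i < iEnd && j <= r) {            // argued above, o never overtakes j
        if (B[i] <= A[j]) std::swap(A[o++], B[i++]);
        else              std::swap(A[o++], A[j++]);
    }
    while (i < iEnd) std::swap(A[o++], B[i++]);
    // any remaining elements of the right run are already in final position
}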

This method mainly serves to demonstrate how few changes are required to use Mergesort within QuickXsort. It is not the best solution, though.

1.2 Ping-Pong Mergesort

In this section, we give detailed code for the “ping-pong” Mergesort variant that we use in our QuickMergesort implementation. It also uses \(\alpha =\frac{1}{2}\), i.e., buffer space to hold half of the input. By using a special procedure, MergesortOuter, for the outermost call following the strategy illustrated in Fig. 3, we reduce the task to the case of \(\alpha =1\) for two subproblems. These are solved by moving elements from the input area to the output area (while sorting them), which is easily achieved by a recursive procedure (MergesortInner).

[MergesortOuter and MergesortInner (pseudocode figure not reproduced)]
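As an illustration of the alternation, here is a textbook ping-pong skeleton (our sketch, not the authors’ MergesortOuter/MergesortInner: for simplicity it uses a full-size work array, whereas MergesortOuter gets by with \(\alpha = \frac{1}{2}\) following Fig. 3; splitMerge plays roughly the role of MergesortInner, moving elements from one area into the other at every merge).

#include <algorithm>
#include <cstddef>
#include <vector>

// Sort the range [lo, hi) of `from` and leave the result in the same range
// of `to`; on entry both arrays must hold the same elements there (hence
// the initial copy below). The roles of the two arrays alternate at each
// recursion level, so every merge moves elements from one array into the other.
template <typename T>
void splitMerge(std::vector<T>& from, std::vector<T>& to,
                std::size_t lo, std::size_t hi) {
    if (hi - lo <= 1) { if (hi > lo) to[lo] = from[lo]; return; }
    std::size_t mid = lo + (hi - lo) / 2;
    splitMerge(to, from, lo, mid);   // sorted halves land back in `from`
    splitMerge(to, from, mid, hi);
    std::merge(from.begin() + lo, from.begin() + mid,
               from.begin() + mid, from.begin() + hi,
               to.begin() + lo);     // the merge moves them over into `to`
}

template <typename T>
void pingPongMergesort(std::vector<T>& a) {
    std::vector<T> b(a);             // work array starts as a copy of the input
    splitMerge(b, a, 0, a.size());   // the sorted result ends up in `a`
}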

1.3 Reinhardt’s Merge

In this section, we give the code for Reinhardt’s merge [42], which allows a smaller buffer \(\alpha = \frac{1}{4}\). We do not present Reinhardt’s whole Mergesort algorithm based on it but merely the actual merging routine as illustrated in Fig. 4. The organization of the different merge calls is straightforward but requires many tedious index calculations, which do not add much insight for the reader.

[Reinhardt’s merging routine (pseudocode figure not reproduced)]

About this article

Cite this article

Edelkamp, S., Weiß, A. & Wild, S. QuickXsort: A Fast Sorting Scheme in Theory and Practice. Algorithmica 82, 509–588 (2020). https://doi.org/10.1007/s00453-019-00634-0
