The greeNsort® in-memory sorting algorithms improve merging, partitioning and introduce a novel class of in-memory sorting algorithms

## Better merging

The greeNsort® research shows that an algorithm based on merging ($$\mathcal{O}(N\log{N})$$) can smoothly be tuned to linear cost for the presorted best-case ($$\mathcal{O}(N)$$) without sacrificing its worst-case performance (when data is unsorted). The famous Timsort algorithm in Python and Java is known to have a linear best-case, less known is that Timsort is dramatically slower than much simpler algorithms when it comes to real sorting tasks (that are not almost presorted).

The greeNsort® algorithms achieve better sustainability than the combined teaching on Mergesorts by Donald Knuth, Robert Sedgewick, Jyrki Katajainen, B-C. Huang and M. A. Langston and Tim Peters. The greeNsort® research completes an open topic since the invention of Mergesort by John von Neumann in 1945.

## Better partitioning

The greeNsort® research shows that an algorithm based on partitioning ($$\mathcal{O}(N\log{N})$$) can smoothly be tuned to linear cost for the best-case of purely tied keys ($$\mathcal{O}(N\log{d})$$, where d is the number of distinct values) without sacrificing its worst-case performance (when all keys are distinct hence no ties). The famous threeway Quicksort algorithm early-terminates on ties but pays the price of extra-operations in the worst-case compared to the simpler binary Quicksort.

The greeNsort® algorithms achieve $$\mathcal{O}(N\log{d})$$ early termination without extra overhead, unlike the Quicksorts taught by R. C. Singleton (1969), Robert Sedgewick (1977), Wegner (1985) and Bentley&McIlroy (1993), Vladimir Yaroslavskiy (2009) and Edelkamp&Weiß (2016)1. The greeNsort® research hence completes an open topic since the invention of Quicksort by C. A. R. Hoare in 1961.

## Theory: a new algorithmic class

The greeNsort® research has developed a new class of algorithms spanning merging and partitioning

• some embodiments are simple and elegant
• some embodiments are robust and adaptive
• some embodiments are more sustainable than the prior-art: have better trade-offs between speed (energy) and memory requirements (hardware)
• some embodiments are more adaptive than the prior-art: have better trade-offs between worst-case and best-case performance

## Software: validated C-code

The testbed includes 62 greeNsort® algorithms and 49 prior-art algorithms of comparable implementation quality2 written in C. The code is bundled in a well-documented R-package with 36 synthetic data generators and 17 data generators for strings of varying size, in particular:

• all implementations are single threaded (however, greeNsort® algorithms are designed for parallel execution)
• most implementations are binary algorithms (with a few 3-ary and 4-ary exceptions)
• the algorithms have been tested on numerous data patterns
• the algorithms have been tested for correctness (and stable ones for correct stability)
• the algorithms have been tested for performance (they return their runtime, memory and footprint)
• the algorithms have been tested for correct memory access through the Valgrind checker
• the algorithms have been validated by Intel RAPL energy measurements
• the algorithms have been validated against numerous prior-art algorithms

## Empirics: huge savings potential

The best greeNsort® algorithms have been compared against the best prior-art algorithms for a couple of important sorting disciplines, for details see the greeNsort® portfolio. The available greeNsort® portfolio of algorithms delivers faster sorting with less hardware. Depending on the sorting discipline greeNsort® algorithms save 3% to 40% of sorting time and energy. greeNsort® algorithms allow switching to more sustainable sorting disciplines because they deliver competitive speed while saving up to 50% of hardware memory. The results for stable sorting on a mix of real-world data situations are shown in the following.

The prior-art shows a typical space-time trade-of from Grailsort (Grail: 0% buffer) being slowest, over Sqrtsort (Sqrt: almost 0% buffer) and Timsort3 (Tim: 50% buffer) to Mergesort (Knuth: 100% buffer) being fastest. The greeNsort® algorithms are consistently faster than the prior-art given equal amount of buffer. Note that Frogsort2 can sort faster and with less memory than the prior-art when given between 5% and 45% buffer. Note further that Walksort is even faster than Timsort although it requires almost no buffer. Finally note that Omitsort like Timsort achieves linear cost for presorted data but without sacrificing speed on difficult sorting tasks: it’s as fast as Mergesort.

How much the greeNsort® algorithms are better can bee seen when looking at the fair Footprint measure instead of Runtime (which favors algorithms using much memory): Compared to the fastest Mergesort (Knuth) the best greeNsort® algorithm (Frog2) reduces the footprint by about 50% (and is faster), compared to Timsort the footprint is reduced by about 40% and even compared to Sqrtsort4 the footprint reduction is about 30%.

Results! Why, man, I have gotten a lot of results. I know several thousand things that won’t work. — Thomas Edison

1. All those algorithms have been fairly compared with similar implementation quality. Orson Peters (2015) Pattern Defeating Quicksort is not on the above list, because his extremely professional and tuned C++ implementation cannot be compared. Theoretically the greeNsort® algorithms are slightly more efficient however, this has yet to be empirically checked↩︎

2. Exceptions are Tim Peters Timsort the implementation of Timothy Van Slyke is used, and Huang&Langston’s Grailsort and related algorithms implemented by Andrey Astrelin↩︎

3. Note that Timsort is lower than proper standard textbook Mergesort on all five input data patterns, including globally presorted data and locally presorted data↩︎

4. Note that Sqrtsort can only sort equally sized elements, whereas Frogsort2 can handle elements of different size↩︎