Oh, Snap!
A randomly lossy hybrid esoteric sorting algorithm
Any introduction or guide to sorting algorithms usually includes off-the-shelf algorithms (quicksort, merge sort, insertion sort, heap sort, etc.) that are considered appropriate for everyday use. In other words, categorically not the esoteric sorting algorithms I’ve been exploring, such as sleep sort or permutation sort.
Each algorithm has different trade-offs in terms of space, time and stability. Even with a generously stocked shelf of algorithms, however, not every algorithm fits every situation, so hybrid algorithms are common. Adaptation and composition are common features of programming, but some hybrids are deliberate enough and interesting enough to be singled out and named. For example, Timsort is a stable sorting algorithm first introduced in Python that combines merge sort and insertion sort; introsort, first used in implementations of the C++ standard library, guards quicksort’s worst case by falling back to heap sort.
The question, then, is whether or not it makes sense to have hybrid esoteric algorithms. Perhaps ‘makes sense’ is too strong a qualification here, but in short the answer is ‘yes’. As an example, Thanos sort can be considered a mash-up of both dropsort and bogosort:
- Bogosort shuffles values randomly until they are sorted. Although the odds of never reaching a sorted state are low enough to defy both floating-point representation and human imagination, bogosort is not technically guaranteed to terminate. This O(∞) worst-case possibility of non-termination is just a taste of the inefficiency on offer.
- Dropsort simply and efficiently eliminates any value that is not in sorted order. Simple, stable, efficient (it executes in linear time, no less!) but somewhat lossy, as the sketch below shows.
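For concreteness, here is a minimal dropsort sketch in Python (the out-of-place style and the dropsorted name anticipate later examples in this post):

def dropsorted(values):
    # Keep each value that is no less than the last value kept; anything
    # that would break the non-descending run is dropped.
    result = []
    for value in values:
        if not result or value >= result[-1]:
            result.append(value)
    return result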
Thanos sort has a distinctive structure and outcome drawn from both dropsort and bogosort: simple, stable, lossy and random, but guaranteed to terminate.
Best of both
To grok the name and underlying principle of Thanos sort, you need to know that the Thanos in question is a supervillain from the Marvel Cinematic Universe:
Thanos was a genocidal warlord from Titan, whose objective was to bring stability to the universe by wiping out half of all life at every level, as he believed its massive population would inevitably use up the universe’s entire supply of resources and perish.
Spoiler alert: he achieved his objective at the end of Avengers: Infinity War.
He completed the Infinity Gauntlet, using it to finally complete his goal through the Snap, which resulted in half of all life in the universe being wiped out with a mere snap of his fingers.
Thus, given a sequence of values, the underlying principle of Thanos sort is to repeatedly and randomly eliminate half the values until the remainder is sorted. Leaving aside the not insignificant issue of value retention, it is structurally similar to permutation sort or (unintentionally optimised) bogosort. At a high level this suggests the following JavaScript:
function thanosSort(values) {
    while (!isSorted(values))
        snap(values);
}
The approach of progressively partitioning — in this case, a part to keep and a part to drop — until a partition is sorted is (very vaguely) reminiscent of quicksort. Such a divide-and-destroy approach might inspire a recursive implementation:
function thanosSort(values) {
    if (!isSorted(values)) {
        snap(values);
        thanosSort(values);
    }
}
Either way, elimination only happens when the initial sequence is unsorted.
function isSorted(values) {
    return values.every(
        (value, index) => index === 0 || value >= values[index - 1]);
}
Thanos sort, therefore, has no effect on already sorted input, which includes empty and single-item sequences. This is in contrast to correctly pessimised implementations of bogosort, which lead with the sorting action rather than the sorted check. The snap function is the sorting action for Thanos sort:
function snap(values) {
    const survivors = Math.floor(values.length / 2);
    while (values.length > survivors) {
        const toSnap = Math.floor(Math.random() * values.length);
        values.splice(toSnap, 1);
    }
}
Here, elements are dropped randomly one by one until the desired number of survivors is reached.
The number primitive in JavaScript is floating point, hence the recurring use of Math.floor to round down to an integer, first for the result of the length division and then for the random-number expression. (Note that I would usually favour Math.trunc as the more intentional operation: it truncates the non-integral part and is more directly equivalent to a cast to int in other languages. I’m using floor here to be consistent with subsequent pieces of code.)
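The two differ only for negative values, which never arise here; Python’s math module draws the same distinction, so a quick check there makes the point:

from math import floor, trunc

# floor rounds towards negative infinity, trunc towards zero; they agree
# for the non-negative values produced by the length division
assert floor(3.5) == trunc(3.5) == 3
assert floor(-3.5) == -4
assert trunc(-3.5) == -3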
As a reminder, when we say ‘random’ we mean ‘pseudo-random’ unless otherwise stated. We should be even more keenly aware that what JavaScript offers out of the box in this department is suitable only for code that wants ‘to mix things up a little’. There is no choice over the generator or control over the sequence, making it unsuitable for anything that requires known statistical qualities or repeatability. This limitation curtails not just use in cryptography and physics simulation but also testing. The (surprising and frustrating) absence of a seeding function makes JavaScript’s pseudo-random number generator even more rudimentary than the PRNG facility found in many home computer BASICs of the 70s and 80s.
Survivor bias
There is a subtle bias for certain classes of input lurking in this code. The result of integer division (div) for natural numbers is always either equal to or less than the result of true division, i.e., x div y ≤ x÷y. For example, 4 div 2 and 4÷2 are both 2, but 3 div 2 is 1 whereas 3÷2 is 1.5. Therefore, the number of surviving values will always be less than half for an odd number of candidate values. In other words, the number of values eliminated will, on average, be half or more, which departs ever so slightly from the brief. This is a troubling detail. (And if it doesn’t trouble you, you’re probably reading the wrong blog.)
If we decide to round up rather than down in such cases, i.e., using Math.ceil rather than Math.floor, that simply biases the other way. Arbitrarily choosing which implementation to write, even if the arbitrary choice is made randomly via dice roll or coin toss, leaves it with a fixed bias one way or another. (It’s worth noting that random and arbitrary are not synonyms: randomness is a matter of chance, whereas arbitrariness is a matter of whim. For example, although the moment a site or app decides to log you out feels random, it is in fact arbitrary. One is an unknowable non-deterministic process, the other is an unknown process.)
In keeping with the spirit of Thanos sort, the solution to our troubles lies in deciding randomly at runtime rather than hardcoding arbitrarily at development time.
function snap(values) {
    const round = Math.random() < 0.5 ? Math.floor : Math.ceil;
    const survivors = round(values.length / 2);
    while (values.length > survivors) {
        const toSnap = Math.floor(Math.random() * values.length);
        values.splice(toSnap, 1);
    }
}
The rounding strategy (down to floor or up to ceiling) is chosen randomly (keeping in mind that 0 ≤ Math.random() < 1) and then applied to the result of the length division. This makes no difference to cases of even division, but for odd division around half the invocations will round down and half will round up. (Note that just using Math.round would not provide the desired behaviour: it always rounds up for decimal parts of 0.5 and over, down otherwise, which means in the case of odd division it will always behave like Math.ceil.)
Halfway house
Having paid such attention to detail to hit the halfway mark, it’s worth also looking at an implementation that subtly misses the mark. This one is based on a common logic banana skin:
- This is true: given that half the elements are to be eliminated, the probability of any element being eliminated is half.
- This is not: given that the probability of an element being eliminated is half, half the elements will be eliminated.
The difference between these framings of the problem is the difference between the code shown previously and the following:
function snap(values) {
    for (let toSnap = values.length - 1; toSnap >= 0; --toSnap)
        if (Math.random() < 0.5)
            values.splice(toSnap, 1);
}
This code traverses the elements in reverse order, deciding by coin toss whether or not to eliminate each element in turn. For an even array length, will this eliminate exactly half the elements? Maybe it will, but most likely it won’t. Using the implementation of snap from the previous section, you’re halfway there; using this latter one, you’re living on a prayer.
There are 4 possible sequences when you flip a fair coin (50% chance of heads, 50% chance of tails) twice in a fair way: HH, HT, TH, TT. The probability half the tosses are heads (i.e., 1 head) and half the tosses are tails (i.e., 1 tail) is 2 in 4 (HT and TH), i.e., 50% not 100%. Thus, what is not true is that the probability of half for an individual event implies half of all events will also have that outcome, whether it’s heads or tails, elimination or retention.
function toss() {
    return Math.round(Math.random()) ? 'H' : 'T';
}
The probability of an exactly even split decreases below 50% as N increases above 2.
// N is the number of tosses to sample
let tosses = [];
for (let flip = 0; flip !== N; ++flip)
    tosses.push(toss());
So, leaving aside the question of odd lengths, the aggregate probability that the coin-toss implementation of snap will eliminate exactly half the elements for a long array is significantly less than 50%, even though the individual element elimination probability is 50%. The probability of an exactly even split is the number of ways of choosing ½N heads from N tosses divided by the number of possible toss sequences, which can be calculated as N! / ((½N)!² × 2ᴺ).
You can explore this yourself experimentally. Even with N as low as 10, you will find the probability of an even split has dropped to around 25%. By the time N hits 1000 the probability is around 2.5%.
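To check those figures exactly rather than experimentally, a couple of lines of Python (using math.comb, available from Python 3.8) will do:

from math import comb

def even_split_probability(n):
    # Probability that n fair coin tosses come up exactly half heads:
    # the number of ways of choosing n/2 heads over all 2**n sequences.
    return comb(n, n // 2) / 2 ** n

print(even_split_probability(10))    # ~0.246
print(even_split_probability(1000))  # ~0.0252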
This is normal. Literally. As N increases the probability distribution tends to the Gaussian distribution. This follows from the central limit theorem:
The central limit theorem (CLT) states that the distribution of sample means approximates a normal distribution as the sample size gets larger, regardless of the population’s distribution.
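You can watch the bell shape emerge with a simple simulation; this sketch (the toss and run counts are arbitrary choices of mine) tallies the number of heads across many runs and prints a crude histogram:

from collections import Counter
from random import random

N, runs = 20, 100_000
tally = Counter(
    sum(random() < 0.5 for _ in range(N)) for _ in range(runs))
for heads in sorted(tally):
    print(f'{heads:2} {"*" * (tally[heads] * 200 // runs)}')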
Snip-snap
There are other possible strategies for random elimination that still satisfy our constraints and meet our expectations. For example, instead of randomly dropping elements until we have halved the number, we could randomly choose to drop either the first half or the second half. In Python, this literal binary chopping snapproach can be expressed concisely as
from random import choice

def snapped(values):
    pivot = len(values) // 2
    return values[:pivot] if choice([False, True]) else values[pivot:]
Following the out-of-place sorting style of Python’s built-in sorted function, this leads to the following realisation of Thanos sort:
def thanos_sorted(values):
    result = list(values)
    while not is_sorted(result):
        result = snapped(result)
    return result
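The is_sorted helper is the Python counterpart of the earlier JavaScript isSorted; a minimal version might look like this:

def is_sorted(values):
    # Sorted means non-descending: each value is no less than its predecessor.
    return all(previous <= value for previous, value in zip(values, values[1:]))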
While this approach to halving is significantly more specific than the Snap, in which there is no concept of ordered or grouped elimination, it does at least have precedent in how Thanos would ‘balance’ the population of each world he invaded.
Instead of a linear worst case we have gone to logarithmic, i.e., we now have O(log n) instead of O(n) calls to a random number generator (halving means at most log₂ n snaps, and the previous one-at-a-time eliminations sum to n/2 + n/4 + …, which is less than n). This may feel like a big win, but a sense of balance and proportion is appropriate when taking this delinearised improvement into account: it’s still an inherently inefficient and intrinsically incomplete way to sort. Definitely NSFW, so don’t get too excited over any apparent optimisations.
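If you want to back up that counting, here is a quick instrumented sketch: a Python re-creation of the earlier JavaScript snap, with a call counter of my own devising, showing the one-at-a-time strategy making roughly n generator calls in total:

from math import ceil, floor
from random import random

calls = 0

def counted_random():
    # Stand-in for random() that tallies how often the generator is called.
    global calls
    calls += 1
    return random()

def snap(values):
    # The drop-one-at-a-time strategy from earlier, instrumented.
    rounding = floor if counted_random() < 0.5 else ceil
    survivors = rounding(len(values) / 2)
    while len(values) > survivors:
        del values[int(counted_random() * len(values))]

values = list(range(1000, 0, -1))
while values != sorted(values):
    snap(values)
print(calls)  # around 1000: linear, not linearithmic, in the input length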
There is surely nothing quite so useless as doing with great efficiency what should not be done at all.
If we are happy to consider using more fixed strategies for halving, there is no reason we are stuck with halves comprising a contiguous run of elements. For example, we could choose between dropping elements at even- or odd-indexed positions:
def snapped(values):
    return values[::2] if choice([False, True]) else values[1::2]
Of course, using a more limited strategy means certain possible sorted outcomes will be unreachable. But, in spite of this constraint, such approaches still retain the key characteristics of Thanos sort: (1) they sort by progressively dropping half the elements and (2) that half is decided randomly.
What does not qualify as a valid implementation, however, is elimination of the same grouping or pattern of elements without any randomisation: code hardcoded to always eliminate the first half, for example, or always the even-indexed elements. These variations are not simply more specific implementations of the algorithm: they are different algorithms. Fun as such examples are, imposing and hardwiring a preference that is neither random nor canon makes them predictable variants of dropsort but not Thanos sort. Randomness has been a key and unambiguous feature of Thanos sort since its inception.
Plug and play
Thinking about these algorithmic variations suggests we could lift the specific snapping strategy out of the function and pass in our strategy of choice. This is similar to how randomness could be lifted out of bogosort, but taken to the next level where the whole transformation step (i.e., sorting action) of the algorithm can be parameterised, not just the data that feeds it.
def sorted_by(values, *, sorter):
    result = list(values)
    while not is_sorted(result):
        result = sorter(result)
    return result
Picking up on the earlier observation about Thanos sort’s structural similarity to permutation sort and bogosort, the sorted_by function offers a template for sorting where no intermediate state except a previously transformed input is used. It can be called with a transformation step, sorter, that is repeatedly applied until the result is sorted. For example, using any of the snapped variations shown so far, Thanos sort can be implemented as
def thanos_sorted(values):
    return sorted_by(values, sorter=snapped)
The structure of sorted_by assumes neither randomness nor elimination, so it is a generalised chassis that accommodates more than just Thanos sort. We can call it with a fixed elimination strategy, for example, such as dropping every second element:
sorted_by(values, sorter=lambda values: values[::2])
Or dropping the second half of the elements:
sorted_by(values, sorter=lambda values: values[:len(values) // 2])
We can also go the other way and implement a shuffle-second version of bogosort:
sorted_by(values, sorter=lambda values: random.sample(values, len(values)))
We can also reimplement permutation sort and pass in a permuter function as the sorter. To do this in Python requires hand rolling a single-step permutation that relies only on its input and not on any additional iteration state. This is in contrast to the permutations generator function that maintains state from which it yields successive permutations. The next_permutation and prev_permutation operations in C++, by contrast, use a lexicographical ordering approach that can also be used as the basis for a sorter argument to sorted_by.
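As a sketch of that hand rolling, the classic lexicographic next-permutation step can be written to depend only on its argument, wrapping around to sorted order after the last permutation:

def next_permutation(values):
    # Find the rightmost ascent, swap in the next larger value from the
    # suffix, then reverse the suffix to get the lexicographic successor.
    result = list(values)
    i = len(result) - 2
    while i >= 0 and result[i] >= result[i + 1]:
        i -= 1
    if i < 0:
        result.reverse()  # last permutation: wrap around to the first
    else:
        j = len(result) - 1
        while result[j] <= result[i]:
            j -= 1
        result[i], result[j] = result[j], result[i]
        result[i + 1:] = reversed(result[i + 1:])
    return result

With that in place, permutation sort is just sorted_by(values, sorter=next_permutation).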
Such an open approach to composition — not extension, which is sometimes confused with composition — offers many possibilities… but might be considered overengineering and overkill for esoteric algorithms. On the other hand, overengineering and overkill might also be considered on message.
Property rights
The algorithms may be esoteric, but we need to know they work. What, then, do we mean by ‘work’ for sorting algorithms? A non-lossy sorting — whether bubble sort or bogosort — is governed by two properties:
- The result must be sorted.
- The result must be a permutation of the original values.
When prompted, people typically remember to state the first property but not the second. We can use the properties as the basis for assertions against general examples — the approach used in property-based testing — but we can also use simple assertions against specific examples to concretely and concisely demonstrate and confirm the effect of sorting:
assert permutation_sorted([]) == []
assert permutation_sorted([42]) == [42]
assert permutation_sorted([42] * 3) == [42] * 3
assert permutation_sorted(range(10)) == list(range(10))
assert permutation_sorted(range(10, 0, -1)) == list(range(1, 11))
assert permutation_sorted([3, 1, 4, 1, 5, 9]) == [1, 1, 3, 4, 5, 9]
Thus, lists that are empty, hold a single value or contain multiple identical values (such as three occurrences of 42) are unchanged by sorting. A list containing integers in ascending order, such as 0 to 9 inclusive, is also already sorted and, therefore, unchanged. A list containing integers in descending order, such as 10 down to 1 inclusive, will be reversed. A list with mixed values, including duplicates, will be sorted into non-descending order, with all of the values — and therefore the length — preserved. Tests for any non-lossy sort will look similar, esoteric or not.
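As a sketch of the property-based flavour, the two properties can be checked against randomly generated examples. A library such as Hypothesis would generate and shrink cases far more thoroughly than this hand-rolled version (the generator bounds here are arbitrary):

from collections import Counter
from random import randint

def check_sort_properties(sort, trials=1000):
    # The result must be sorted and must be a permutation of the input,
    # i.e., the same values with the same number of occurrences.
    for _ in range(trials):
        values = [randint(0, 9) for _ in range(randint(0, 20))]
        result = sort(values)
        assert is_sorted(result)
        assert Counter(result) == Counter(values)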
For something lossy like dropsort, however, the second property doesn’t hold: the result is a permutation of some but not necessarily all of the original values. The result holds the values that continue the non-descending run, i.e., each value that is no less than the last value retained, preserving the order in which they occurred in the input. This is easier to illustrate by example:
assert dropsorted([]) == []
assert dropsorted([42]) == [42]
assert dropsorted([42] * 3) == [42] * 3
assert dropsorted(range(10)) == list(range(10))
assert dropsorted(range(10, 0, -1)) == [10]
assert dropsorted([3, 1, 4, 1, 5, 9]) == [3, 4, 5, 9]
The (absence of) effect is the same in cases that are already sorted. For unsorted initial cases, the result is knowable, albeit lossy. The loss is deterministic: for any given input, the expected result can be worked out in advance and written directly into the test.
But what about Thanos sort? For already-sorted cases sorting is still an identity operation, producing the same result as the input as in other sorting examples:
assert thanos_sorted([]) == []
assert thanos_sorted([42]) == [42]
assert thanos_sorted([42] * 3) == [42] * 3
assert thanos_sorted(range(10)) == list(range(10))
For unsorted cases, however, the result is not generally known in advance. For a reverse-sorted case, we can be confident the result will contain only a single value because there are no ascending or equal subsequences in the input. Although the value will come from the input, we don’t know specifically which one it will be. This means we can write
result = thanos_sorted(range(10, 0, -1))
assert len(result) == 1
assert result[0] in range(10, 0, -1)
In the more general case of mixed inputs we can expect one or more values in the result. Values in the result will occur in the same sequence they do in the input, but which values? For any given example, we can only test against properties of the result rather than against a concrete outcome:
- The result must be sorted.
- For non-empty input, the result is also non-empty.
- The result sequence contains only elements from the input occurring no more than once and in the same sequence they did in the input. The direct successor of an element in the result must be the direct or indirect successor of that element in the input.
Assuming we have a function contains_in_sequence to express the third property, we can now write
result = thanos_sorted([3, 1, 4, 1, 5, 9])
assert is_sorted(result)
assert len(result) in range(1, 4)
assert contains_in_sequence([3, 1, 4, 1, 5, 9], result)
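One way to realise contains_in_sequence is as a subsequence check, consuming a single pass over the input so that each element is matched at most once and in order:

def contains_in_sequence(input_values, result):
    # True if result is a subsequence of input_values: the iterator is
    # consumed left to right, so matches cannot be reused or reordered.
    remaining = iter(input_values)
    return all(value in remaining for value in result)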
Note that for all its randomness, this degree of property litigation is not required by bogosort, where randomness affects the process but not the outcome.
Points of view
The first set of implementations we looked at in JavaScript are typical of imperative languages: they sort by side-effect, modifying the given population of values in place. The second set of implementations in Python are more typical of functional languages: they return a new collection containing the sorted result, leaving the original population untouched.
There is a third possibility that is related to the second option: return a sorted view of the original values. In theory, a view of immutable values is interchangeable with a copy of those values. In practice, the referential transparency implied by such substitutability is not well supported by mainstream languages, making this third option distinct from the second one.
If we want an efficient view of the original values that is not simply a sliced copy, we need to return a contiguous subsequence of the values. Our snappy dimidiation process needs to stumble across a sorted subsequence within the original sequence of values and return the start and end points of that rather than a copy.
Using the C++20 abbreviated template function syntax for brevity, but otherwise taking the classic C++ approach of defining a sequence in terms of a pair of iterators, Thanos sort looks like
// assumes using namespace std, with <algorithm>, <utility> and <tuple> included
auto thanos_sort(auto begin, auto end)
{
    auto lower = begin, upper = end;
    while (!is_sorted(lower, upper))
        tie(lower, upper) = snap(lower, upper);
    return pair(lower, upper);
}
Here, thanos_sort and snap return iterator pairs as their result. The first value of the pair refers to the start of the result sequence and the second to one past its end. The tie utility allows the pair from snap to be decomposed into a multiple assignment to the lower and upper iterators.
One of the strategies already looked at would work as an implementation for a view-based snap: eliminate either the first or the second half of the elements by returning iterators that delimit the surviving half. We can actually loosen this a little: we don’t need the eliminated half of the elements to be contiguous, just the remainder. In other words, snap can return any contiguous view of half the elements passed, whether from the beginning, middle or end.
auto snap(auto begin, auto end)
{
    auto population = end - begin;
    auto survivors = population / 2;
    // population - survivors + 1 possible starting points, so the surviving
    // half can sit anywhere from the very beginning to the very end
    auto new_beginning = begin + rand() % (population - survivors + 1);
    return pair(new_beginning, new_beginning + survivors);
}
(Note that this code is written with begin and end as random-access iterators. This simplification is for syntactic convenience only and does not reflect a necessary constraint. To make the code work with forward iterators (single-stepping, single-direction iterators) replace the arithmetic operations with distance and next, as appropriate.)
In some ways, this snap is similar to the now deceased random_shuffle operation in the C++ standard library, which also uses the much maligned C rand function. If we want to make it more like C++’s shuffle operation we have to supply a random number engine and shape it according to a distribution. In this case, we don’t want anything fancy, just a uniform spread of integers:
auto snap(auto begin, auto end, auto && randomness)
{
    auto population = end - begin;
    auto survivors = population / 2;
    // uniform_int_distribution bounds are inclusive, so this spans every
    // valid starting point for the surviving half, including the last
    uniform_int_distribution<> bounded(0, population - survivors);
    auto new_start = begin + bounded(randomness);
    return pair(new_start, new_start + survivors);
}
To call snap, thanos_sort needs to supply a source of randomness, such as a Mersenne Twister:
auto thanos_sort(auto begin, auto end)
{
    auto lower = begin, upper = end;
    mt19937 randomness;
    while (!is_sorted(lower, upper))
        tie(lower, upper) = snap(lower, upper, randomness);
    return pair(lower, upper);
}
If you want it to be more arbitrary or truly random, the mt19937 instance can be seeded explicitly, e.g., from std::random_device.
However we choose to define thanos_sort, the usage will look something like
const vector values {3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8, 9, 7, 9};
auto result = thanos_sort(values.begin(), values.end());
By declaring the original population of values as const we can see the sort is non-destructive. To examine the result we can iterate from the first up to the second iterator held by the resulting pair.
for (auto examining = result.first; examining != result.second; ++examining)
    cout << *examining << "\n";
An example of what this could print out would be
1
5
9
This is all good… except perhaps for the usage. The first and second members of pair are not particularly intention revealing when we’re accustomed to thinking of iterator ranges in terms of begin and end. C++ code that relies on pairs at different levels (pairs of pairs, for example) and for different purposes can be notoriously difficult to decrypt when wading through layers of first and second accesses (there are C++ codebases that would give 128-bit encryption a good run for its money). We’d also like to use the result directly in iterable contexts, for example:
for (auto value: thanos_sort(values.begin(), values.end()))
    cout << value << "\n";
To be usable as an iterable range our thanos_sort result needs to support begin and end operations. Unsurprisingly, this is what other view types available in C++ offer, e.g., string_view, span and initializer_list. Even less surprisingly, it underpins C++20’s ranges library (which is conceptually like the Java Streams API or C# LINQ… but more complicated). This is where we’ll look for a tidy resolution of our code. We can take the code we have and give it a lightweight range wrapping, so it both receives and returns a range:
auto thanos_sort(const auto & values)
{
    auto [begin, end] = thanos_sort(values.begin(), values.end());
    return subrange(begin, end);
}
The subrange view is found in namespace std::ranges and describes a range in terms of a starting position and a sentinel value, which in this case is a one-past-the-end iterator.
This means that we can write the following:
const vector values {3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8, 9, 7, 9};
auto result = thanos_sort(values);
for (auto value: result)
    cout << value << "\n";
As well as the following:
for (auto value: thanos_sort(values))
    cout << value << "\n";
Further integration with range usage is possible, but we’ll stop here rather than run that gauntlet of complexity.
Endgame
Thanos sort is a hybrid algorithm we can appreciate and marvel at in many ways beyond simply being a bogodropsort. As with any esoteric algorithm, it’s fun and provocative and raises questions. But it also serves to highlight how much of our perception of an algorithm and its characteristics depends on a number of implementation choices beyond the core structure of the algorithm itself.
The degree of randomness, for example, is one such implementation choice. We can move that slider all the way from randomly choosing on a per-element basis to a limited grouping strategy that randomly eliminates one predefined half or another.
In terms of fidelity to the algorithm’s MCU origins, both the copy- and the view-based approaches are non-destructive. This preservation allows recovery of the original elements. Thus, the eliminated values can be restored, blipped back with a swap of a variable or the end of a scope.