A Sort of Permutation

Practical lessons in the impractical permutation sort

7 min readAug 23, 2021

We’ve previously explored esoteric sorting algorithms that either purport to be linear in execution time but are hugely inefficient or are actually linear but are lossy. It’s time to find ourselves at the other end of the scale: lossless and profoundly non-linear. Permutation sort takes us from O(n) to O(n!) — that’s right, factorial time. O(MG)!

Factorial redux

In case your only recollection of factorial is as a rather uncompelling demonstration of recursion, what you need to know is that the numbers get very large very fast.

The recursive implementation of factorial arises from having 0! = 1 as a base case and expressing other cases as n! = n×(n-1)!. In Haskell this can be expressed elegantly and directly as

factorial 0 = 1
factorial n = n * factorial (n - 1)

Or, less declaratively but in a conditional form that translates more readily to other languages, as

factorial n = if n == 0 then 1 else n * factorial (n - 1)

Haskell is structurally and syntactically well suited to recursive definitions — especially using the pattern-matching form — but this is not the case across all languages.

Sometimes a recursive definition seems the most powerful and elegant way to express something, but sometimes, as Edsger Dijkstra notes, not so much:

Although correct, it hurts me, for I don’t like to crack an egg with a sledgehammer, no matter how effective the sledgehammer is for doing so.

There is a simpler definition of factorial that defines it as a sequence product, i.e., n! = Πⁿi. No fuss, sledgehammer or recursion required:

factorial n = product [1..n]

I’m not saying the reason many programmers are hesitant about recursion is because factorial is their introductory example… but I’m sure it doesn’t help. As Michael Jackson (no, not that one, the other one) remarks:

Software development should not be a trade of constructing difficulty from simplicity. Quite the contrary.

Now we have a way of calculating factorial, we can return to appreciating O(n!) performance complexity. As I was saying, the numbers get very large very fast: 1! is 1; 5! is 120; 10! is 3628800; 12! is the largest integral factorial that can be expressed in a 32-bit integer; the 64-bit budget gets blown after 20!; to go higher, you’re gonna need a bigger int. You get the idea. Very large. Very fast.

But very fast is one thing you won’t get if you have an algorithm with this kind of time consumption. Factorials crop up in determining permutations and combinations of things, from the way a deck of cards might be shuffled to how proteins can be folded. And that’s what permutation sort does: in essence, it is an unoptimised search through the permutations of the input values until it finds the one arrangement that is sorted. Permutation sort is to sorting as unguided trial and error without feedback is to learning.

Searching permutation space

Python’s permutations function returns an iterable that can be used to iterate through successive rearrangements of the values given as its input:

list(permutations([1, 2, 3]))

Generates a list with 3! = 6 permutations of the input, including the input:

[(1, 2, 3), (1, 3, 2), (2, 1, 3), (2, 3, 1), (3, 1, 2), (3, 2, 1)]

If we apply this to a list with 4 items:

list(permutations([1, 2, 3, 4]))

The resulting list contains 4! = 24 permutations:

[(1, 2, 3, 4), (1, 2, 4, 3), (1, 3, 2, 4), (1, 3, 4, 2),
 (1, 4, 2, 3), (1, 4, 3, 2), (2, 1, 3, 4), (2, 1, 4, 3),
 (2, 3, 1, 4), (2, 3, 4, 1), (2, 4, 1, 3), (2, 4, 3, 1),
 (3, 1, 2, 4), (3, 1, 4, 2), (3, 2, 1, 4), (3, 2, 4, 1),
 (3, 4, 1, 2), (3, 4, 2, 1), (4, 1, 2, 3), (4, 1, 3, 2),
 (4, 2, 1, 3), (4, 2, 3, 1), (4, 3, 1, 2), (4, 3, 2, 1)]

And so on for higher n.

Permutation sort is, therefore, simply a search through this result set. It follows a generate-and-test approach:

Possible solutions are generated systematically and evaluated until a suitable solution is found. […] The greatest weakness of generate-and-test procedures is combinatorial explosion.

For the specific examples just shown, a linear search will be successful on the first comparison because the lists [1, 2, 3] and [1, 2, 3, 4] were already sorted. If we start instead with the inputs reverse sorted…

list(permutations([4, 3, 2, 1]))

… it takes n! steps to find the sorted result.

[(4, 3, 2, 1), (4, 3, 1, 2), (4, 2, 3, 1), (4, 2, 1, 3),
 (4, 1, 3, 2), (4, 1, 2, 3), (3, 4, 2, 1), (3, 4, 1, 2),
 (3, 2, 4, 1), (3, 2, 1, 4), (3, 1, 4, 2), (3, 1, 2, 4),
 (2, 4, 3, 1), (2, 4, 1, 3), (2, 3, 4, 1), (2, 3, 1, 4),
 (2, 1, 4, 3), (2, 1, 3, 4), (1, 4, 3, 2), (1, 4, 2, 3),
 (1, 3, 4, 2), (1, 3, 2, 4), (1, 2, 4, 3), (1, 2, 3, 4)]

In the general case, we do not need to generate all the permutations of the input, only up to the one we need to find and then we can terminate. Forcing the result into a list causes generation of all the permutations up front — you end up paying for what you don’t use. Leaving the result of permutations as an iterable means the result is lazily rather than eagerly evaluated, so permutations are deferred and generated on demand.

We can express the search in terms of a filter:

filter(is_sorted, permutations(values))

This returns an iterable that can be used to walk through the result set. As we know there is always one and only one sorted permutation, we can use the iterator helper, next, to advance it to the lone result:

next(filter(is_sorted, permutations(values)))

The returned result is a tuple. Following the expectation set by Python’s sorted function, the following converts the result to a list:

def permutation_sorted(values):
    return list(next(filter(is_sorted, permutations(values))))

This can also be expressed in terms of comprehension syntax for a generator, which some readers might find more explicit:

def permutation_sorted(values):
    return list(next(each for each in permutations(values) if is_sorted(each)))

The comprehension syntax also makes explicit a more familiar framing of search: select the result from the permutations of the given values where the result is sorted, i.e., select…from…where…, which itself comes from set theory’s builder notation, e.g., {x | x ∈ S ∧ f(x)}.

The only thing missing is the definition of is_sorted, which we can write as

def is_sorted(values):
    pairs = zip(values, islice(values, 1, None))
    return all(i <= j for i, j in pairs)

A sequence of values is sorted if all of its adjacent pairs are sorted.

Note that when an implementation of permutation sort explicitly checks a sequence is sorted, that introduces a further O(n) factor into its time complexity. It’s up to you whether you want to account for this explicitly as O(n×n!), or whether you want to round that up to O((n+1)!), or whether you’re happy to say that as Big O notation is about growth, it’s still basically factorial. Whichever way you look at it, it’s a (very) big O.

Minimum viable permute

The Python out-of-place implementation of permutation sort is quite compact, but the in-place C++ version is even more minimal:

void permutation_sort(auto begin, auto end)
{
    while (next_permutation(begin, end))
        ;
}

The standard library next_permutation function shuffles the existing values in place, returning false if the resulting permutation is sorted and true if not, thereby combining into a single construct the lazy evaluation of Python’s permutations, the effect of the next operation and the logic to check for is_sorted.

Let’s talk performance

Apart from being eye-opening, instructive and fun — isn’t that enough? — of what use is permutation sort? One use is for having — even provoking — that awkward conversation about performance.

performance
the degree to which a system or component accomplishes its designated functions within given constraints, such as speed, accuracy, or memory usage

When a customer, manager, product owner or other developer says that the performance of a piece of code or of an application doesn’t matter, that statement is false. Permutation sort is an example of how to make clear that performance is always a consideration, even if it is not currently a problem. If performance doesn’t matter, then it would be fine to include permutation sort and its kin in your code. If you were to do this, you would find — all of a sudden — that performance does matter. Somewhere between “I don’t care about performance” and permutation sort lies acceptable performance. The conversation has begun.

This is a general approach you can use. Although the example is pathological, the approach is not: it’s part of a process of enquiry. Sometimes we don’t know an answer for something and need a strategy, like iterative root finding, to determine suitable answers. The same is true for other qualities — everything from security to maintainability — that may have been dismissed, or not have been considered, or for which the criteria for correctness or acceptability seemed not to have moved much beyond handwavium. Many software characteristics that are confidently framed as ‘requirements’ are often little more than vague desires or velleities.

To qualify as a requirement, we have to know how to determine whether the software has what is required or not — if we don’t know how to do this, we cannot truly claim to have specified a requirement. Of course, this doesn’t mean articulating, testing or even knowing a requirement is an easy task, but that’s precisely why it helps to have thought experiments and other thinking tools available to probe and explore what we do and do not (yet) know.