Need Something Sorted? Sleep on It!
An unexpected paradigm journey into sleep sort
A decade ago I first presented a lightning talk entitled Cool Code. This short talk evolved into a full talk whose iterations I presented over the next half decade. The focus? Code that, for some reason or other, can be considered cool. For example, code that has played a significant role in historical events, such as the source for the Apollo Guidance Computer. Or code that is audacious — if not seemingly impossible — given its constraints, such as David Horne’s 1K chess. There is code that is both simple and profound, such as Peter Norvig’s fits-on-a-slide spelling corrector. And code that demonstrates ingenuity and humour, such as Yusuke Endoh’s Qlobe.
Leaving aside its content for a moment, one of the most interesting things about the talk was its stone-soup nature. Whenever and wherever I gave it, I would usually receive at least one suggestion for more code to include. This drove the talk’s evolution over the years. Summed across all its variations, there’s probably a half-day’s worth of material in total.
The first full-length version of the talk I presented was at JavaZone 2011. Afterwards, someone came up to me and asked, “Have you come across sleep sort?” I hadn’t. “You should look it up. I think you might enjoy it.” I did. And I did.
Bourne to sleep
Possibly the only good thing to ever come out of 4chan, sleep sort was created by an anonymous user in 2011. The original code was written in Bash, but it also runs as a traditional Bourne shell script.
Here is a version with the called function renamed to something more meaningful (it was f in the original post):
function sleeper() {
    sleep $1
    echo $1
}

while [ -n "$1" ]
do
    sleeper $1 &
    shift
done
wait
We could condense this code by inlining the sleeper function at its point of use:
while [ -n "$1" ]
do
(sleep $1; echo $1) &
shift
done
wait
I have a slight preference for the shorter version but, in this case, keeping the longer form gives a structure we can preserve more easily across implementation variations. It also offers more real estate for supplementary explanation:
function sleeper() {   # define a function that...
    sleep $1           # sleeps for duration of its argument in seconds
    echo $1            # prints its argument to the console
}

while [ -n "$1" ]      # while the script's lead argument is not empty
do
    sleeper $1 &       # launch a sleeper with the lead argument
    shift              # shift argument list, so $2 becomes $1, etc.
done
wait                   # wait for all launched processes to complete
Assuming this is in a script file named sleepsort, you can run it as follows:
sleepsort 3 1 4 1 5 9
With the following result:
1
1
3
4
5
9
The first two lines arrive after 1 second, the next after 3 seconds, the next after 4 seconds, etc.
Bonkers, brilliant and definitely NSFW.
Reductio ad absurdum
Sleep sort presents itself as an O(n) sorting algorithm for non-negative integers. It’s also a lot of fun. Sleep sort lets us defamiliarise a hackneyed topic — sorting — and see it anew — sorting as an arrangement in time rather than space. We can learn things from it, from the mechanics of time expressed in code to styles of composition in different paradigms and languages. It lets us explore assumptions as we uncover solution possibilities and brush up against their boundaries.
If you want to extend the sortable domain into negative numbers, either you must add a bias to each value of at least the magnitude of the most negative value, or you need to find yourself a source of reverse entropy (e.g., thiotimoline). Although sources of entropy are common in modern CPUs, reverse entropy sources will not be available in the foreseeable (or unforeseeable) future.
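The bias trick is easy to sketch. Here is a minimal Python rendering of my own (the function name, the thread-based approach and the scaled-down unit interval are all illustrative choices, not part of the original shell script): shift every value up by the magnitude of the most negative one before sleeping, then report the unshifted value.

```python
import threading
import time

def sleepsort_with_bias(values, unit=0.1):
    # Bias by the magnitude of the most negative value, so that
    # every sleep duration is non-negative.
    bias = -min(min(values), 0)
    result = []
    lock = threading.Lock()

    def sleeper(value):
        time.sleep((value + bias) * unit)  # sleep on the shifted value...
        with lock:
            result.append(value)           # ...but report the original

    threads = [threading.Thread(target=sleeper, args=(v,)) for v in values]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()  # the join is as essential here as the shell's wait
    return result
```

As with the shell version, the ordering is only as reliable as the scheduler's honouring of the gaps between sleep durations, so the unit interval should be generous relative to the spacing of the values.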
In theory, this algorithm can work for floating-point numbers… in theory. In practice, scheduler limitations and variability, together with the cost of process creation, make the sorting of floats unreliable the closer the values are to one another.
Which brings us to performance. Sleep sort is not just fun; it’s educational.
That the algorithm appears to take linear time, i.e., the number of steps taken to sort is proportional to the number of items in the input, is both a point of interest and a source of distraction. Big O notation can be considered the GDP of performance. It is a singular observation that, while not necessarily false, may distract you from other measures that matter. There’s more to algorithms and their (un)suitability than just finding the abstracted proportionality of steps to input size.
At first sight, the sorting of each value appears to be done without reference to or co-ordination through any shared state — not even the other values to be sorted. Viewed classically, the only shared resource is the standard output. But look more closely and it becomes clear that there is co-ordination and it is through a shared resource: time. Without the fork–join arrangement (from each & to the wait), this code would terminate before completion of the algorithm and, without the explicit and managed passage of time, there is no algorithm. The wait is a necessary feature not an optional one; for the algorithm, time is of the essence, not simply a consequence.
The specific machinery of time is not a necessary part of a sleep sort implementation, but that time has a design is, which opens up other implementation possibilities.
In a state of threadiness
We can switch from processes to threads, preserving the structure of the solution above. In C++, the sleeper function becomes
void sleeper(int value)
{
    this_thread::sleep_for(value * 1s);
    cout << value << "\n";
}
The move from pre-emptive multitasking processes to pre-emptive multitasking threads is, in this case, a simple one. The sleeper function is still a function that receives an argument and is launched from within a loop. The separateness or sharing of memory doesn’t play a role in the solution. Had the process-based Bash solution used an IPC mechanism for synchronisation, such as pipes or locks, this would have had to have been translated in some way. But in the absence of any other communication, the thread-based version is essentially a transliteration of the process-based one.
This leaves the significant differences between the shell and the C++ versions as being down to language, such as syntax and typing, and the realisation of the fork–join model.
To make it a little easier on the eye, I’ve assumed the std namespace is made available, along with the std::chrono_literals namespace, which lets us express 1 second directly as 1s, which would also be the value we would change to rescale the unit interval.
void sleepsort(const auto & values)
{
    vector<jthread> sleepers;
    for (int value: values)
        sleepers.push_back(jthread(sleeper, value));
}
The function receives the values to be sorted by reference, along with a const promise that the values will not be altered by the sorting. Type deduction via auto leaves the remaining type bookkeeping to the compiler, so it can be called with a vector<int>:
sleepsort(vector {3, 1, 4, 1, 5, 9});
Or with anything else that satisfies the expectation of the for loop:
sleepsort(list {3, 1, 4, 1, 5, 9});
And, for C++ compilers that also support C compound literals:
sleepsort((int[]) {3, 1, 4, 1, 5, 9});
The lifetime of sleepers is scope-bound to sleepsort. When the function completes, sleepers will be cleaned up automatically and, in turn, will clean up its contained values. In this case, the contained values are jthread instances, which join — i.e., wait for — their associated thread of execution to complete before completing their own destruction. Thus, joining is an automatic end-of-life feature of the sleepers variable, and the sleepsort function will not return until all the launched threads have completed.
Fearless symmetry
Python’s global interpreter lock (GIL) and casually modifiable underlying object model make it almost uniquely unsuited to pre-emptive threading. That does not, however, mean that it cannot express concurrency conveniently. Originally with generator functions, and more recently and explicitly with async functions, Python supports coroutines.
Subroutines are special cases of more general program components, called coroutines. In contrast to the unsymmetric relationship between a main routine and a subroutine, there is complete symmetry between coroutines.
— Donald Knuth, The Art of Computer Programming, Volume 1
Coroutines were invented in the late 1950s, with the word itself coined in 1958 by Melvin Conway (yes, that Conway). They were an influential feature across many architectures, paradigms and languages. Their popularity waned to the point of disappearance in the 1980s, although co-operative multitasking became a common feature of runtime environments (e.g., Mac OS and Windows). Over the last decade, this classic of the procedural paradigm has enjoyed new popularity, with many languages and libraries adopting coroutines or coroutine-like constructs.
The coroutine notion can greatly simplify the conception of a program when its modules do not communicate with each other synchronously.
— Melvin Conway, Design of a Separable Transition-Diagram Compiler
Coroutines offer a constrained concurrency model, one based on single-threaded execution of concurrently available re-entrant code, rather than truly concurrent execution. For sleep sort that is enough.
async def sleepsort(values):
    async def sleeper(value):
        await sleep(value)
        print(value)
    await wait([sleeper(value) for value in values])
Structurally, there is little difference between this and the pre-emptive solutions, with all of them sharing the same underlying fork–sleep–join anatomy.
The sleepsort function expects values to be iterable, such as a list or a tuple. The values can be non-negative floats, not just integers, for which the underlying single-threaded nature of coroutines offers a better ordering guarantee than would be delivered for the equivalent pre-emptive execution.
The sleeper function is nested as a private detail within the sleepsort function. The sleeper-launching loop is expressed as a comprehension whose resulting list holds the awaitable coroutine instances that are run and blocked on by asyncio.wait, thereby collapsing the whole fork–join into a single await statement.
An alternative expression of this is to gather together all of the tasks that must be run concurrently, in this case unpacking the list of coroutines to asyncio.gather, which expects each awaitable object to be a separate argument:
async def sleepsort(values):
    async def sleeper(value):
        await sleep(value)
        print(value)
    await gather(*[sleeper(value) for value in values])
In either case, the sleepsort coroutine must be launched explicitly as a coroutine:
run(sleepsort([3, 1, 4, 1, 5, 9]))
The coroutine-ness, and therefore the run, can be further wrapped inside an ordinary function, if you prefer.
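Such a wrapper might look like the following sketch (the nesting and the fully qualified asyncio names are my own choices; the logic is the gather-based variant above). Callers see an ordinary, blocking function and need never know that coroutines are involved:

```python
import asyncio

def sleepsort(values):
    # An ordinary, synchronous function that hides the coroutine machinery.
    async def sleeper(value):
        await asyncio.sleep(value)
        print(value)

    async def main():
        await asyncio.gather(*[sleeper(value) for value in values])

    asyncio.run(main())  # blocks until every sleeper has printed
```

With this in place, `sleepsort([3, 1, 4, 1, 5, 9])` can be called like any other function.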
It’s about time
If we look closely at what we’ve been trying to do, we’ll see we’ve been working around the intrinsic structure of the solution rather than expressing it directly. Consider, for a moment, what the essence of sleep sort is: time-based execution. What have the three solutions shown so far done? They have launched separate paths of execution that have immediately been suspended so as to prevent that very execution. The terms and conditions of suspension have been time-based, but the constructs that shape the code have not been. We’ve been faking timers.
In the words of Morpheus, “Stop trying to hit me and hit me.”
Time can be considered a source of events, as can I/O and user interaction. Event-driven control flow offers an alternative way of structuring applications to the explicit top–down control flow most commonly used to express algorithms. This inversion of control activates code in response to events rather than embedding blocks or polls into the code to stem the flow of control. Such opposite framing has many design consequences.
Although they both express themselves through asynchrony, threads and event-driven code mix poorly. It is generally better to choose one approach and stick with it. Bringing these two world views together (correctly) under the same architectural roof demands a clear head, a clear separation of responsibilities and, unfortunately, a clear increase in complexity. For example, the POSA2 full write-up of the Reactor pattern, which is an event-loop based dispatch pattern, is 36 pages long; the Proactor pattern, which mixes in asynchronous mechanisms for its event-handling, took 46 pages. Although we managed to condense them to three pages each in POSA4, I recall that for much of the draft Proactor was hitting the four-page mark. We also ranked the patterns differently: Reactor was considered more mature and less tricky than Proactor.
Tempting as it is to mess around with timer events in GUI frameworks or with POSIX timers in C for an illustrative example, life is too short. The corresponding JavaScript/HTML5 implementation, however, is not:
const sleepsort = values =>
    values.forEach(
        value => setTimeout(
            () => document.writeln(value), value * 1000))
To run, this needs to be embedded into HTML as a <script> — timers are part of the HTML5 standard but not the ECMAScript standard.
sleepsort([3, 1, 4, 1, 5, 9])
The sleepsort function can be refactored to more closely resemble the separations in the previous three versions:
const sleeper = value =>
    setTimeout(() => document.writeln(value), value * 1000)

const sleepsort = values =>
    values.forEach(sleeper)
The sleepsort function still iterates through each value to be sorted, but rather than launching a sleeper, it simply calls it as an ordinary function that then registers a lambda expression as a callback after the appropriately scaled number of milliseconds.
In this case, no join boundary is necessary to wait on all the timers expiring because timers and their effect persist in their enclosing web-page environment.
Trigger warning
Time-triggered systems are a particular class of event-driven architecture that offer another variation on the idea of organising application execution in terms of tasks and time.
Rather than regarding timers as plural and first class, the progress of time is uniquely marked out by a single recurring interrupt. This heartbeat is played against a schedule of tasks, a queue of work or a set of pending notifications from external or internal events. The tick of the timer marks out time slices into which tasks are placed rather than associating tasks with timers, threads or other directly asynchronous constructs. Therefore, in contrast to our usual conception of event-driven systems, time-triggered systems have only one event they respond to directly: there is only quantised, periodic time. If there is something to do at that moment, it is done. The constraint that must be respected to make this architecture work is that tasks undertaken in any time slice must fit within that time slice — there is no concurrent execution or arrhythmia.
Time-triggered systems have a more even and predictable behaviour that makes them attractive in safety-critical environments. They impose a regular and sequential structure on time that ensures conflicting events are neither lost nor experience priority inversion in their handling.
It is the need to deal with the simultaneous occurrence of more than one event that both adds to the system complexity and reduces the ability to predict the behaviour of an event-triggered system under all circumstances. By contrast, in a time-triggered embedded application, the designer is able to ensure that only single events must be handled at a time, in a carefully controlled sequence.
— Michael Pont, Patterns for Time-Triggered Embedded Systems
Although normally associated with C and embedded systems, we can illustrate the approach using JavaScript and HTML5’s setInterval function:
const sleepsort = values => {
    const sleepers = {}
    values.forEach(
        value => sleepers[value] = sleepers[value] + 1 || 1)
    let ticks = 0
    const tick = () => {
        for (let count = sleepers[ticks]; count > 0; --count)
            document.writeln(ticks)
        ++ticks
    }
    setInterval(tick, 1000)
}
I’ve separated out the roles in this code to be a little more explicit, but it is already clear that the shape of this solution has little in common with the previous four solutions. It is more complex, involving an additional intermediate data structure, sleepers, that represents the task table (i.e., number of items of a particular value to be sorted) with respect to the time slice. For the task in hand it is — compared to the other examples — overkill (but keep in mind that we are dealing with sleep sort, so all excess is relative…).
The first tick is counted as 0 and there is an initial offset of 1000 milliseconds before tick 0 drops. The interval timer will run forever (well, until the page is closed), but by decrementing an initial sleeper count the code could be tweaked to cancel the timer when all the sleepers have expired.
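That tweak can be shown directly. Here is a Python rendering of my own (the names and the scaled-down tick interval are illustrative, and a plain loop stands in for the interval timer): a count of outstanding sleepers is decremented on each emission, and the heartbeat stops when it reaches zero. Like the setInterval version, it assumes non-negative integer values.

```python
import time

def sleepsort_time_triggered(values, tick_interval=0.01):
    # Task table: how many copies of each value remain to be emitted.
    sleepers = {}
    for value in values:
        sleepers[value] = sleepers.get(value, 0) + 1

    remaining = len(values)  # outstanding sleepers: the stopping condition
    result = []
    ticks = 0
    while remaining > 0:
        time.sleep(tick_interval)  # the heartbeat: one tick per time slice
        for _ in range(sleepers.get(ticks, 0)):
            result.append(ticks)
            remaining -= 1
        ticks += 1  # quantised, periodic time
    return result
```

Because there is no concurrency at all here, the result is fully deterministic — which rather underlines how far this version has drifted from sleeping as a sorting mechanism.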
We have journeyed through time to a solution that has laid out the sorting structure spatially in a table. The solution is now, in truth, data driven and is simply dressed up to behave like sleep sort. Time is now consumed in a more conventional algorithmic form — albeit in a rather elaborate way — and the defining temporal separation of sleep sort has been rotated into spatial separation through data structure.
Timing out
I previously noted that “Looking at something from a different point of view can reveal a hidden side.” And that is true of this essay, whose subtitle describes an unexpected journey.
I originally had it in mind to show three variants of sleep sort: the simplified shell version I normally use in talks, a multithreaded version and then one other. Deciding on coroutines as the third variant made it clear to me that, nice as that troika was for illustrating sleep sort, excluding an event-driven solution might feel like a glaring omission. And once I started down that path, a time-triggered solution seemed like another natural inclusion. Three becomes four becomes five and, before you know it, what started as a simple examination of sleep sort becomes an exploration of execution paradigms and architectural decisions.
Of course, there are many more paradigms and variations that could be explored and lessons learned from this fun example. For example, although the shell version was based on OS processes, there are other process-based approaches such as actors and CSP that could be explored. Writing, however, shares an important attribute with algorithms: there should be a stopping condition. In this case, we’ll stop at five.
The purpose here is neither the search for an ideal sorting algorithm nor the ideal way of implementing a non-ideal sort. It is to take something that is familiar and commodified and dull and find within it something fun and unexpected and surprising — sorting without sorting — and use it as a way to pry open different approaches to organising execution with respect to the flow and structure of time. How code and data are organised with respect to time is architectural, and changing your model of time has consequences we can appreciate and apply beyond sleep sort.
The value of the exercise comes from approaching sorting, concurrency and events through ostranenie rather than habit, as something unfamiliar and new.
Defamiliarization or ostranenie is the artistic technique of presenting to audiences common things in an unfamiliar or strange way so they could gain new perspectives and see the world differently.
Whether a difference in perspective will shower you with IQ points is an open question, but it can grant you greater knowledge and insight and both satisfy and fuel a curiosity you may not have realised you had.