400,000,000 times faster

Matt Parker is a mathematician and comedian who writes books, makes podcasts and posts YouTube videos. He originally released a podcast and then a video about solving a Wordle-inspired problem. I’m going to concentrate on his follow-up video. You can watch the videos, but I’m going to start with a summary.

The Wordle-inspired problem is to “find five five-letter words with twenty-five unique letters”. To this end Matt wrote some “terrible Python code” (his own words) to search a big list of English words for matching sets of words. This algorithm took a month to finish and find all the solutions. Then his audience got involved and came up with their own solutions.
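As a rough illustration (this is not Matt’s actual code), a simple brute-force search might look something like the sketch below, assuming a hypothetical words.txt file with one word per line. Checking every combination of five words like this is exactly the kind of approach that takes an extremely long time on a real word list.

```python
from itertools import combinations

# Hypothetical word list, one word per line ("words.txt" is a placeholder name).
with open("words.txt") as f:
    words = [w.strip().lower() for w in f]

# Only five-letter words with five distinct letters can be part of a solution.
words = [w for w in words if len(w) == 5 and len(set(w)) == 5]

# Check every combination of five words for twenty-five unique letters.
for combo in combinations(words, 5):
    if len(set("".join(combo))) == 25:
        print(combo)
```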

Here’s the timeline for algorithms and runtimes.

  • Matt Parker – 2,760,670 seconds (about 32 days)
  • Podcast released
  • Benjamin Paassen – 900 seconds
  • Neil Coffey – 15 seconds
  • Initial video released
  • Ilya Nikolaevsky – 5.86 seconds
  • Ge Weijers – 2.58 seconds
  • Kristin Paget – 1.045 seconds
  • Orson Peters – 0.477 seconds
  • Sylvester Hesp – 0.0068 seconds

The algorithm has been sped up by a factor of millions, and reading the text file with all the English words now takes up a significant portion of the processing time.

Terrible Python code

I think the first thing to realise here is that Matt’s original Python code was fine for his needs. It may have been slow. It may have been terrible. However, his time was judged to be more important than his computer’s time, so he left it to run the search. He even got to make another video about how bad his original solution was.

However, most of us aren’t focusing on the videos but on the software. If you’re a “stand-up comedian” it’s fine; if you’re a software engineer then I’d encourage you to look for faster solutions than this. Running a month-long algorithm runs the risk of getting an unexpected or undesirable outcome and having to spend another month re-running it. I’d even worry about the machine accidentally being turned off, or automatically restarting after an update.

I would prioritise functional, well-structured, readable, debuggable code before performance. It’s easier to take well-structured code and improve its performance than the other way around. If you focus too much on performance from the start you may end up with very fast code that doesn’t work. Try not to write terrible code in any case.

How slow is slow?

Your product specification may give you something concrete to aim for: “30 frames per second” or “web pages must be served within 0.1 seconds”. You may be writing an app that must “feel responsive”. Or it may be an indirect concern about excessive battery usage for a mobile device.

Writing a fancy algorithm that is extra fast might be fun (at least I think so), but it’s best to stick to something simple if you’re hitting your performance targets. If you’re not, that might mean changing the latest code because it’s particularly slow, or it might be that the cumulative burden of all your existing code is now pushing it over the edge.

Pick your targets

Often a small amount of code ends up taking a large part of the processor’s time. That’s the best code to optimise. Ideally you can use a profiler to analyse the software and find this code directly. It’s also possible to examine logs to find the rough performance of individual components. You can optimise code that is outside this critical path, but it takes just as long and gives much lower returns.
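As a rough example of what that looks like in practice, Python ships with the cProfile module. The sketch below profiles a placeholder search() function (a stand-in for whatever your real hot path is) and prints the ten most expensive entries.

```python
import cProfile
import pstats

def search():
    """Placeholder for the real routine being profiled."""
    sum(i * i for i in range(1_000_000))  # stand-in workload

# Run the function under the profiler, save the stats, and show where the time goes.
cProfile.run("search()", "search.prof")
stats = pstats.Stats("search.prof")
stats.sort_stats("cumulative").print_stats(10)  # ten most expensive entries
```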

If you watch Matt’s video he talks about a number of techniques that different people used to do this:

  • Re-purposing existing published algorithms can make a huge difference. Often these have much lower time complexity and perform well on large data sets.
  • Parallelising the code using threads or GPU code makes better use of the hardware. However, it can be complicated and much harder to debug, so it requires more thought up-front (see the sketch after this list).
  • Being clever with the data. This covers everything else that’s specific to a particular problem. Often there are patterns in the data that you can take advantage of to reduce the amount of work needed.
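As a rough illustration of the parallelising idea (my own sketch, not one taken from the video), Python’s multiprocessing module can spread the outer loop of a search across CPU cores. The worker function and the tiny word list here are placeholders:

```python
from multiprocessing import Pool

# Placeholder word list; the real list would come from the word file.
WORDS = ["fjord", "gucks", "nymph", "vibex", "waltz"]

def disjoint_partners(i):
    """Worker: find later words that share no letters with WORDS[i]."""
    first = set(WORDS[i])
    return [(WORDS[i], w) for w in WORDS[i + 1:] if not first & set(w)]

if __name__ == "__main__":
    # One worker process per CPU core by default; each handles one starting word.
    with Pool() as pool:
        chunks = pool.map(disjoint_partners, range(len(WORDS)))
    print([pair for chunk in chunks for pair in chunk])
```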

For this particular problem the biggest speed increase came from choosing a good data representation for the words. Instead of storing each word as a set of letters, it can be represented as 26 bits stored in a single integer, one bit per letter of the alphabet. Then most of the processing can be done directly with binary operations. This is fairly simple to implement but makes a huge difference.
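Here’s a minimal sketch of that representation (my own illustration, not any contributor’s actual code): each word becomes a 26-bit integer, and the check for shared letters becomes a single bitwise AND.

```python
def to_mask(word):
    """Encode a word as a 26-bit integer, one bit per letter of the alphabet."""
    mask = 0
    for ch in word:
        mask |= 1 << (ord(ch) - ord("a"))
    return mask

a = to_mask("fjord")
b = to_mask("gucks")
if a & b == 0:                      # no bits in common -> no shared letters
    combined = a | b                # union of the two letter sets
    print(bin(combined).count("1"), "unique letters so far")  # prints 10
```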

On balance

For me the important ideas here are:

  • Optimising code can make it much much faster.
  • Focus optimisation where it makes the biggest difference.
  • Save time and complexity by only optimising when you need to.
