I’m surprised how often I come across large functions in projects. It’s not uncommon to find functions with hundreds of lines. I’ve searched for something, jumped into the middle of a file, and then scrolled up desperately looking for the start of the function so I can start to understand what it’s doing.
I think it’s a good idea to adopt a maximum size for functions in a project. If one of your functions is creeping up towards that size then it needs to be refactored. Normally that can be done with smaller functions that represent sensible parts of the task. Maybe the inner part of a loop is broken out into its own function, or each case of a switch-statement gets a function call rather than a body. Occasionally it might be necessary to have foobar_part1 to foobar_part3. This isn’t ideal. While a good function name helps, it is often still easier to understand a badly named but small function than a well named but large one.
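As a rough sketch of that kind of refactoring, here is a small Python example. Everything in it (the account commands, process, parse_line, apply_command and the rest) is made up for illustration. The inner part of the loop becomes parse_line, and each case of the dispatch is a single function call rather than a body.

```python
def parse_line(line: str) -> tuple[str, int]:
    """The inner part of the loop, broken out into its own function."""
    command, _, amount = line.partition(" ")
    return command, int(amount)


def apply_deposit(balance: int, amount: int) -> int:
    return balance + amount


def apply_withdrawal(balance: int, amount: int) -> int:
    if amount > balance:
        raise ValueError("insufficient funds")
    return balance - amount


def apply_command(balance: int, command: str, amount: int) -> int:
    # Each case is a function call rather than a body.
    if command == "deposit":
        return apply_deposit(balance, amount)
    elif command == "withdraw":
        return apply_withdrawal(balance, amount)
    raise ValueError(f"unknown command: {command}")


def process(lines: list[str], balance: int = 0) -> int:
    # The top-level function now fits in a glance: parse, then apply.
    for line in lines:
        command, amount = parse_line(line)
        balance = apply_command(balance, command, amount)
    return balance
```

Each function is short enough to read at once, and the arguments and return values show exactly what information passes between the parts.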
I think the best maximum size for a function is one screen. If you can see the entire function in one glance then it is much easier to understand and reason about it. If you have to scroll up and down in order to see the whole thing then you must rely on your memory for the bits you can’t see. Critically, it’s much easier to write a really long function from scratch because you do remember the bits you’ve just written. However, it’s much harder for everyone else to then understand that function. This also applies to “future you” who has long forgotten what this function was for.
Most modern editors allow some sort of code folding. By hiding some code branches this lets you fit more on one screen. However this means you can’t see everything in one glance. If you fold away code then you are again relying on your memory. Moving code out to a separate function call is fundamentally different as you can see exactly what information is being passed back and forth.
How large do you mean?
When I say the maximum size for a function should be one screen, that means it’s not fixed. It depends on what hardware the function is going to be read on. If you (and the rest of your team) use ultralight laptops then the maximum size should be smaller. If you all have widescreen monitors rotated 90 degrees then the maximum size can be larger. Getting a better monitor really could change how you code. Right now, for me on a laptop, that means a maximum size of about 70 lines.
I think some people will find this lack of specificity problematic. I don’t think you can make a set of rules that are going to work everywhere for everyone. If you are the only one who reads your code then look to your own hardware. If you work with a team then consider their hardware as well. If your code is going out for more public consumption then it probably pays to be conservative: maybe 60 lines.
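If you want to check a limit like this automatically, one possible approach for Python code is sketched below. It is only an illustration: long_functions and its max_lines parameter are names I made up, and the default of 60 is just the conservative figure mentioned above. It uses the standard ast module to measure each function from its def line to its last line.

```python
import ast


def long_functions(source: str, max_lines: int = 60) -> list[tuple[str, int]]:
    """Return (name, length) for each function longer than max_lines."""
    tree = ast.parse(source)
    found = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # lineno is the "def" line; end_lineno is the last line of the body.
            length = node.end_lineno - node.lineno + 1
            if length > max_lines:
                found.append((node.name, length))
    return found
```

You could run it over a file with long_functions(open(path).read(), max_lines=70), picking whatever limit suits the screens your team actually uses.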
Breaking the rule
I sometimes write a function which goes above this maximum size. I think it’s okay to do this when the function is very easy to understand despite the larger size. Typically this means it follows a very regular pattern. An example is an if-else-statement with simple if-expressions and one-line if-bodies. However if the if-expressions have a variety of different forms or if there are long if-bodies then try to stick to the size limit.
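To illustrate the kind of regular pattern I mean, here is a made-up Python function (http_reason is a hypothetical name). It is cut short here, but a real version could run well past one screen and still be easy to read, because every branch has the same simple test and a one-line body.

```python
def http_reason(status: int) -> str:
    """Map an HTTP status code to its standard reason phrase."""
    if status == 200:
        return "OK"
    elif status == 201:
        return "Created"
    elif status == 204:
        return "No Content"
    elif status == 301:
        return "Moved Permanently"
    elif status == 304:
        return "Not Modified"
    elif status == 400:
        return "Bad Request"
    elif status == 403:
        return "Forbidden"
    elif status == 404:
        return "Not Found"
    elif status == 500:
        return "Internal Server Error"
    # ...a real version would continue in exactly the same shape...
    else:
        return "Unknown"
```

Once you have read two or three branches you know what the rest look like, so there is nothing you need to hold in memory while scrolling.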
Can this be a more general idea?
Recently I’ve been wondering if a “maximum size of one screen” rule should be applied more generally. The idea is that it’s easier to understand something if you can see the whole thing at once. That could apply to: classes, interfaces, or system architectures. For a class to be easy to use then, perhaps, the public interface should be at most one screen. For a class to be easy to understand then, perhaps, the whole interface should be at most one screen. For an entire system to be easy to understand then you have to be able to describe or diagram it on one screen.
I don’t think the argument for, say, a class is as strong as it is for a function. It can be possible to use a class while ignoring some of its capabilities. However, that does leave you open to the class behaving unexpectedly because of the capabilities you were ignoring. I’ll have to experiment a bit to see if limiting class sizes pays off with better code or whether breaking the class apart just ends up with more complexity.