I’ve been thinking a lot about good code lately, if only because I’ve been stuck in the unfortunate situation of having to deal with bad code. Without going into the gory details, the people who wrote the bad code were convinced it was good so I had to spend a lot of time and energy explaining why it was not good, and, therefore, what good code is. (With indifferent success, it has to be said; there are none so blind as those who will not see).
A digression: my plan when I started this blog was to keep the articles frequent, regular, game-focussed and specific. Obviously things have been neither frequent nor regular for some time now, and this article continues the ‘recent’ trend of being neither game-focussed nor particularly specific. What can I say? Plans change, life circumstances change, and work on Pandemonium (the game) is on hiatus; at least this isn’t another “how to get into the games industry” article (which is for a blogger what crates are for games designers). I hope you still find it interesting, and as always feedback is very welcome and gratefully received.
So: what is good code? Can we even place a value judgement on code as “good” or “bad”? After all, there are very often many different ways to achieve the result you’re looking for; code is part art and part science, and there’s not a single non-trivial program in all of coding history that is entirely unable to be improved in any way. (Even “Hello, world!” could be localised…). Well, to consider it from the converse point of view, if code crashes or gives the wrong results, it must be bad code; therefore, it is reasonable to conclude that if it ‘works’ (however that is defined), it may be good code. There is however much more to it than that, as we shall see, and there may often be debate, discussion, and extensive philosophising as to exactly what is “good”.
The following list may not be exhaustive, though I think it’s a pretty good start. In general, each item is given in a rough order of priority, those things that I consider most important listed first – but it is critical to understand that what is “important” can vary greatly depending on context. For example you will see that I normally prize readability ahead of performance, but if your profiling reveals that one function is dropping your frame rate from 60Hz to 10Hz, you have no choice but to optimise it, even at the expense of readability.
Good code is…
…correct. It is amazing how often programmers will miss this one when discussing good code. Games programmers, in particular, have an unnerving tendency to mention “fast” as the first thing they think of when talking on this subject. This is a nonsense. The code must do what it is intended to do, correctly, or else it cannot possibly be good.
There is an important implication here. Firstly, the phrase what it is intended to do implies that good code begins with requirements, specification and design – all before a single line of code is written! Exactly how you generate the code design is itself a subject worthy of lengthy discussion, but outside the scope of this article.
Consider: correct isn’t really enough. Provably correct is much, much better. More on this later.
…readable. How long does it take to type enough text to fill, say, a page-long function? Probably only a few minutes. Obviously there’s a significant amount of time invested in working out what to type, but it doesn’t take long. However you are certain to need to read it later – to review it before you commit it into source control, when QA find a bug, when you need to explain it to a colleague, when a colleague needs to understand what it does without you there to explain. Code must therefore be written to WORM – Write Once, Read Many, and I think this is the second most important requirement after correctness.
What contributes toward code being readable? There are many factors – judicious use of whitespace, sensible function and variable names, short functions, use of good patterns and avoidance of antipatterns – but probably the most critical factor is consistency. I don’t really care if you use three spaces or four to indent, but in the name of all that is holy don’t use three sometimes, four other times and five other times still! Style guides, automatic formatting, and tools like StyleCop are helpful in ensuring consistency across a team – but it’s probably more important to stay consistent within a file. So if you’re editing someone else’s code, be sure to match their style. If things are really bad, set aside time to refactor later; but otherwise, match what is already there.
…testable. This requirement is one that has been bubbling up my list of priorities over the years, such that I now consider it one of the most critical factors in “good code”. I have always believed in the adage that “if it’s not tested, it’s broken” – in other words, unless you know and have proved that the code is correct (responds correctly to good input and fails elegantly with bad input) there’s probably some corner case you’ve missed and it will come back to bite you. I used to assume that sufficient QA coverage would be enough. Not any more – code must (wherever humanly possible) be covered by automated unit and soak tests, which must be added to with every bug found and fixed. This is a lesson I have learned through bitter experience. The codebase I am currently working on is untested and bordering on untestable, and it is almost impossible to make changes (even fixes) without breaking something subtle – and there’s no way of catching that subtle breakage until it is too late (ie. the customer has been affected by it).
Interestingly enough, as unit tests are still relatively new to me, I’m still exploring methods of doing it with which I am comfortable. I definitely have more to learn here. I would also add that I do not strictly practice Test-Driven Design myself, though I can see that it might a good idea in principle; however, even though I don’t currently write the tests before the code, I now always consider how the tests can be written while writing the code (which isn’t TDD but it’s close enough to see it on a sunny day) and I think every programmer should do at least the same.
…well-documented. By this I mean not only separate documents listing the requirements, and user guides, and such like; but also comments within the code itself. In general, code should be self-commenting: CalculateDamage() is a better function name than f(), and RemainingHitPoints is a better variable name than hp. In both cases, it is easy to infer what the code does. Comments should not normally describe what the code does, or even how (both of which should normally be understandable from the code itself unless it is highly optimised), but they should be used to explain why the code does what it does in the way that it does it. I have heard it suggested that every comment should contain the word “because”. That’s not a bad guideline.
…robust and reliable. I had a disagreement with someone (a non-coding manager) a while back when he contended that a “genius coder” was someone who had a blazingly brilliant idea, implemented it rapidly, then moved on to the next blazingly brilliant idea, even if the first idea wasn’t completely finished yet – as other coders could then take up the slack. I reject this suggestion utterly. A real coding genius, a guru, a free electron – call them what you will, we all know the kind of programmer I’m talking about, and I don’t claim for one second to be one myself – such a person does indeed implement blazingly brilliant ideas but they make sure it works.
Good code doesn’t crash (for starters). Neither does it scribble over random areas of memory, trash the stack, or give unpredictably different results for the same input. Indeed, restrictions on input are clearly specified, and assertions and/or error handling are used to anticipate bad input and deal with it appropriately (good code is usually tolerant of bad input unless there is a reason for it not to be, such as performance or security). When good code is used by other people, they can be confident that it will work as advertised, and not throw any unexpected spanners in their works.
…maintainable. The simple truth of the matter is this: all code has bugs. At some point, you (or someone else) will need to come back to your code and fix those bugs. Or, you may need to extend or change it to reflect changes in specifications. Do yourself (and others) a favour, and write your code in such a way that it can be easily maintained.
Happily, if your code is readable, well-documented and covered by unit tests it will probably be easy to maintain, but by explicitly remembering that you will probably have to come back to this code while you design and write it, you can make decisions that will make maintenance even easier.
…extensible, flexible and reusable. Once you’ve gone to all the trouble of writing code, do you really want to put yourself through the hassle of writing it again next time you need to do the same thing? Or writing something almost the same but that varies in some minor particular? I would hope not. With good code, it is possible (perhaps using templates or generics) for other people to use the code for purposes that the author may never have expected, except insofar as he deliberately made it extensible. With good code, it is possible for other people to write code that this code then uses, even though it was written first. A study of good software libraries (perhaps most obviously the STL) is enlightening as to how this can be achieved, and how effective it can be.
There is, of course, a risk of over-engineering – remember YAGNI (You Ain’t Gonna Need It): don’t waste time writing code you’ll never need. As always, this is contextual. An IDE like Visual Studio benefits greatly from a plug-in system that allows other people to write software to extend the software itself. Is it worth writing a plugin system for your own game editor? I suspect not.
…efficient, performant and scalable, in terms of memory, CPU, system resources, processor/thread count, internet bandwidth… as much as it’s nice to be able to pretend such things are infinite sometimes, in truth of course they are not. Good code is not wasteful with such things, and makes promises about how much of each it requires (for example sometimes, you can use more memory to make things easier for the CPU; but on an embedded system that memory may not be available, so an understanding of the target platform is required).
In my experience, performance concerns are most often addressed during the design stage, but when it comes to writing the code itself, using appropriate algorithms and data structures (alongside basic guidelines like avoiding allocations/garbage, etc) is usually all that is required. In my experience and by my estimation, 90% of code is not performance critical and so long as you use a sensible algorithm, that will be sufficient. If you do not understand algorithmic complexity, you are not a programmer – whereas an understanding of the effects of branch misprediction, cache misses, false sharing etc. is something that most coders don’t need to spend too much brainpower on, most of the time. (There is still that 10% where those kinds of things do matter, of course; and games programmers find themselves in that 10% more often than most so they tend to have a skewed view of this).
I’m tempted to write considerably more under this heading, but I’m going to stand by my claim that most of the time, “reasonably performant” is enough to qualify for good code. Don’t throw away performance needlessly and you’ll usually be fine.
…secure. I have been lucky enough in my professional career that I have usually been working on products that do not have to worry overmuch about being hacked. Console games and proprietary custom software for a specific company are not usually prime targets for hackers, who are more likely to aim for websites, operating systems or “serious” software where they can either obtain valuable data or cause havoc for lulz.
Or so most people suppose, anyway. I remember breathing a sigh of relief shortly after the release of Kameo because I’d written (among other things) the savegame code for that game, and it was the savegame code in some other game (I forget which, now) that allowed one of the first successful hacks of the Xbox 360: someone managed to edit a savegame file in such a way that the code read off the end, and hey presto, they found a way to launch pirated games. Although I have read a bit about writing secure code I’m sure this is something I could improve on, and I suspect I’m not alone – good code is secure, even if the consequences of being insecure don’t involve giving away people’s credit card numbers.
…discoverable, by which I mean that when someone else comes across your code for the first time, they can grok it quickly and easily. This is something I’d not consciously considered until recently, which is odd given that so many of the problems I’ve been having with the current codebase have been a result of finding that I need to investigate some code, and immediately thinking “WTF is this doing!?”. Had the code been discoverable, the last few months of my career would have been considerably easier. Of course, again, if your code is readable, tested, well-documented and maintainable, it should probably be straightforward for your team-mates to pick it up. But it is always a good idea to think to yourself, “if a reasonably intelligent, reasonably experienced programmer (who was new to this project) had to debug this code, would it take them long to figure out what is going on?”
…simple. Sorry, I’ve had to go back and edit this post because somehow I managed to miss this the first time around! Good code is simple. Even complex good code is comprised of simple building blocks. Good code hides or cuts through the complexity in the problem, to provide a simple solution – the sign of a true coding genius is that he makes hard problems look easy, and solves them in such a way that anyone can understand how it was done (after the fact). Simplicity is not really a goal in its own right, though; it’s just that by means of being simple, code is more readable, discoverable, testable, and maintainable, as well as being more likely to be robust, secure and correct! So if you keep your code simple (as simple as possible, but no simpler), it is more likely to be good code – but that is by no means sufficient in and of itself.
Well, I reckon that’s probably enough to be going on with. Many of the above considerations are worthy of an article in their own right (not that I intend to write such articles any time soon, or indeed at all) but I think I have written enough for tonight! Have I missed anything? Have I got anything in the wrong order? (I guarantee that some people will argue performance considerations should be higher up the list, and in some scenarios they would be correct, but as a rule of thumb I think the above order is generally best). Is there anything I’ve listed that is not actually a requirement for “good code”? Does anything need further clarification? Let me know, in the comments.