September 10, 2010

What is good code?

Filed under: Personal,Tools and Software Development — bittermanandy @ 12:11 am

I’ve been thinking a lot about good code lately, if only because I’ve been stuck in the unfortunate situation of having to deal with bad code. Without going into the gory details, the people who wrote the bad code were convinced it was good so I had to spend a lot of time and energy explaining why it was not good, and, therefore, what good code is. (With indifferent success, it has to be said; there are none so blind as those who will not see).

A digression: my plan when I started this blog was to keep the articles frequent, regular, game-focussed and specific. Obviously things have been neither frequent nor regular for some time now, and this article continues the ‘recent’ trend of being neither game-focussed nor particularly specific. What can I say? Plans change, life circumstances change, and work on Pandemonium (the game) is on hiatus; at least this isn’t another “how to get into the games industry” article (which is for a blogger what crates are for games designers). I hope you still find it interesting, and as always feedback is very welcome and gratefully received.

So: what is good code? Can we even place a value judgement on code as “good” or “bad”? After all, there are very often many different ways to achieve the result you’re looking for; code is part art and part science, and there’s not a single non-trivial program in all of coding history that is entirely unable to be improved in any way. (Even “Hello, world!” could be localised…). Well, to consider it from the converse point of view, if code crashes or gives the wrong results, it must be bad code; therefore, it is reasonable to conclude that if it ‘works’ (however that is defined), it may be good code. There is however much more to it than that, as we shall see, and there may often be debate, discussion, and extensive philosophising as to exactly what is “good”.

The following list may not be exhaustive, though I think it’s a pretty good start. In general, each item is given in a rough order of priority, those things that I consider most important listed first – but it is critical to understand that what is “important” can vary greatly depending on context. For example you will see that I normally prize readability ahead of performance, but if your profiling reveals that one function is dropping your frame rate from 60Hz to 10Hz, you have no choice but to optimise it, even at the expense of readability.

Good code is…

…correct. It is amazing how often programmers will miss this one when discussing good code. Games programmers, in particular, have an unnerving tendency to mention “fast” as the first thing they think of when talking on this subject. This is a nonsense. The code must do what it is intended to do, correctly, or else it cannot possibly be good.

There is an important implication here: the phrase “what it is intended to do” implies that good code begins with requirements, specification and design – all before a single line of code is written! Exactly how you generate the code design is itself a subject worthy of lengthy discussion, but outside the scope of this article.

Consider: correct isn’t really enough. Provably correct is much, much better. More on this later.

…readable. How long does it take to type enough text to fill, say, a page-long function? Probably only a few minutes. Obviously there’s a significant amount of time invested in working out what to type, but it doesn’t take long. However you are certain to need to read it later – to review it before you commit it into source control, when QA find a bug, when you need to explain it to a colleague, when a colleague needs to understand what it does without you there to explain. Code must therefore be written to WORM – Write Once, Read Many, and I think this is the second most important requirement after correctness.

What contributes toward code being readable? There are many factors – judicious use of whitespace, sensible function and variable names, short functions, use of good patterns and avoidance of antipatterns – but probably the most critical factor is consistency. I don’t really care if you use three spaces or four to indent, but in the name of all that is holy don’t use three sometimes, four other times and five other times still! Style guides, automatic formatting, and tools like StyleCop are helpful in ensuring consistency across a team – but it’s probably more important to stay consistent within a file. So if you’re editing someone else’s code, be sure to match their style. If things are really bad, set aside time to refactor later; but otherwise, match what is already there.

…testable. This requirement is one that has been bubbling up my list of priorities over the years, such that I now consider it one of the most critical factors in “good code”. I have always believed in the adage that “if it’s not tested, it’s broken” – in other words, unless you know and have proved that the code is correct (responds correctly to good input and fails elegantly with bad input) there’s probably some corner case you’ve missed and it will come back to bite you. I used to assume that sufficient QA coverage would be enough. Not any more – code must (wherever humanly possible) be covered by automated unit and soak tests, which must be added to with every bug found and fixed. This is a lesson I have learned through bitter experience. The codebase I am currently working on is untested and bordering on untestable, and it is almost impossible to make changes (even fixes) without breaking something subtle – and there’s no way of catching that subtle breakage until it is too late (ie. the customer has been affected by it).

Interestingly enough, as unit tests are still relatively new to me, I’m still exploring ways of writing them with which I am comfortable. I definitely have more to learn here. I would also add that I do not strictly practice Test-Driven Design myself, though I can see that it might be a good idea in principle; however, even though I don’t currently write the tests before the code, I now always consider how the tests can be written while writing the code (which isn’t TDD but it’s close enough to see it on a sunny day) and I think every programmer should do at least the same.
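As a rough sketch of what "considering how the tests can be written" might look like in practice: the damage rule below is invented purely for illustration, and the hand-rolled Check helper only stands in for a real unit test framework. The point is that a small function with clearly specified inputs is easy to cover, including the bad-input case.

```csharp
using System;

// Hypothetical game rule, invented for illustration: damage is attack
// minus armour, and can never be negative.
public static class Combat
{
    public static int CalculateDamage( int attack, int armour )
    {
        if ( attack < 0 ) throw new ArgumentOutOfRangeException( "attack" );
        if ( armour < 0 ) throw new ArgumentOutOfRangeException( "armour" );
        return Math.Max( 0, attack - armour );
    }
}

// Hand-rolled checks standing in for a real unit test framework.
public static class CombatTests
{
    public static void RunAll()
    {
        Check( Combat.CalculateDamage( 10, 3 ) == 7, "typical hit" );
        Check( Combat.CalculateDamage( 3, 10 ) == 0, "armour absorbs everything" );
        Check( Throws( () => Combat.CalculateDamage( -1, 0 ) ), "bad input is rejected" );
    }

    static void Check( bool condition, string name )
    {
        if ( !condition ) throw new Exception( "Test failed: " + name );
    }

    // Returns true only if the action fails in the expected way.
    static bool Throws( Action action )
    {
        try { action(); return false; }
        catch ( ArgumentOutOfRangeException ) { return true; }
    }
}
```

Every bug found later would then add one more Check line, so the fix stays fixed.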

…well-documented. By this I mean not only separate documents listing the requirements, and user guides, and such like; but also comments within the code itself. In general, code should be self-commenting: CalculateDamage() is a better function name than f(), and RemainingHitPoints is a better variable name than hp. In both cases, it is easy to infer what the code does. Comments should not normally describe what the code does, or even how (both of which should normally be understandable from the code itself unless it is highly optimised), but they should be used to explain why the code does what it does in the way that it does it. I have heard it suggested that every comment should contain the word “because”. That’s not a bad guideline.
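To illustrate the "because" guideline, here is a small sketch. The function and the HUD it mentions are hypothetical; only the contrast between the two styles of comment matters.

```csharp
public static class Health
{
    // Unhelpful comment, which only repeats the code:
    //   "Subtract damage from hit points."
    //
    // Helpful comment, which explains why:
    //   Clamp at zero because the (hypothetical) HUD draws one heart per
    //   remaining hit point and cannot draw a negative number of hearts.
    public static int ApplyDamage( int remainingHitPoints, int damage )
    {
        int result = remainingHitPoints - damage;
        return result < 0 ? 0 : result;
    }
}
```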

…robust and reliable. I had a disagreement with someone (a non-coding manager) a while back when he contended that a “genius coder” was someone who had a blazingly brilliant idea, implemented it rapidly, then moved on to the next blazingly brilliant idea, even if the first idea wasn’t completely finished yet – as other coders could then take up the slack. I reject this suggestion utterly. A real coding genius, a guru, a free electron – call them what you will, we all know the kind of programmer I’m talking about, and I don’t claim for one second to be one myself – such a person does indeed implement blazingly brilliant ideas but they make sure it works.

Good code doesn’t crash (for starters). Neither does it scribble over random areas of memory, trash the stack, or give unpredictably different results for the same input. Indeed, restrictions on input are clearly specified, and assertions and/or error handling are used to anticipate bad input and deal with it appropriately (good code is usually tolerant of bad input unless there is a reason for it not to be, such as performance or security). When good code is used by other people, they can be confident that it will work as advertised, and not throw any unexpected spanners in their works.

…maintainable. The simple truth of the matter is this: all code has bugs. At some point, you (or someone else) will need to come back to your code and fix those bugs. Or, you may need to extend or change it to reflect changes in specifications. Do yourself (and others) a favour, and write your code in such a way that it can be easily maintained.

Happily, if your code is readable, well-documented and covered by unit tests it will probably be easy to maintain, but by explicitly remembering that you will probably have to come back to this code while you design and write it, you can make decisions that will make maintenance even easier.

…extensible, flexible and reusable. Once you’ve gone to all the trouble of writing code, do you really want to put yourself through the hassle of writing it again next time you need to do the same thing? Or writing something almost the same but that varies in some minor particular? I would hope not. With good code, it is possible (perhaps using templates or generics) for other people to use the code for purposes that the author may never have expected, except insofar as he deliberately made it extensible. With good code, it is possible for other people to write code that this code then uses, even though it was written first. A study of good software libraries (perhaps most obviously the STL) is enlightening as to how this can be achieved, and how effective it can be.

There is, of course, a risk of over-engineering – remember YAGNI (You Ain’t Gonna Need It): don’t waste time writing code you’ll never need. As always, this is contextual. An IDE like Visual Studio benefits greatly from a plug-in system that allows other people to write software to extend the software itself. Is it worth writing a plugin system for your own game editor? I suspect not.

…efficient, performant and scalable, in terms of memory, CPU, system resources, processor/thread count, internet bandwidth… as much as it’s nice to be able to pretend such things are infinite sometimes, in truth of course they are not. Good code is not wasteful with such things, and makes promises about how much of each it requires (for example sometimes, you can use more memory to make things easier for the CPU; but on an embedded system that memory may not be available, so an understanding of the target platform is required).

In my experience, performance concerns are most often addressed during the design stage, but when it comes to writing the code itself, using appropriate algorithms and data structures (alongside basic guidelines like avoiding allocations/garbage, etc) is usually all that is required. In my experience and by my estimation, 90% of code is not performance critical and so long as you use a sensible algorithm, that will be sufficient. If you do not understand algorithmic complexity, you are not a programmer – whereas an understanding of the effects of branch misprediction, cache misses, false sharing etc. is something that most coders don’t need to spend too much brainpower on, most of the time. (There is still that 10% where those kinds of things do matter, of course; and games programmers find themselves in that 10% more often than most so they tend to have a skewed view of this).

I’m tempted to write considerably more under this heading, but I’m going to stand by my claim that most of the time, “reasonably performant” is enough to qualify for good code. Don’t throw away performance needlessly and you’ll usually be fine.
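A quick sketch of what "use a sensible algorithm" can mean. Both functions below (names are mine, chosen for illustration) answer the same question; the first compares every pair of elements, the second remembers what it has already seen.

```csharp
using System.Collections.Generic;

public static class Duplicates
{
    // O(n^2): compares every element with every later element.
    public static bool HasDuplicateQuadratic( IList<int> values )
    {
        for ( int i = 0; i < values.Count; ++i )
        {
            for ( int j = i + 1; j < values.Count; ++j )
            {
                if ( values[i] == values[j] ) return true;
            }
        }
        return false;
    }

    // O(n) on average: HashSet<int>.Add returns false if the value
    // is already in the set.
    public static bool HasDuplicateLinear( IList<int> values )
    {
        HashSet<int> seen = new HashSet<int>();
        for ( int i = 0; i < values.Count; ++i )
        {
            if ( !seen.Add( values[i] ) ) return true;
        }
        return false;
    }
}
```

For a handful of elements the difference is irrelevant, and the second version allocates a HashSet, so which one is "sensible" still depends on context.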

…secure. I have been lucky enough in my professional career that I have usually been working on products that do not have to worry overmuch about being hacked. Console games and proprietary custom software for a specific company are not usually prime targets for hackers, who are more likely to aim for websites, operating systems or “serious” software where they can either obtain valuable data or cause havoc for lulz.

Or so most people suppose, anyway. I remember breathing a sigh of relief shortly after the release of Kameo because I’d written (among other things) the savegame code for that game, and it was the savegame code in some other game (I forget which, now) that allowed one of the first successful hacks of the Xbox 360: someone managed to edit a savegame file in such a way that the code read off the end, and hey presto, they found a way to launch pirated games. Although I have read a bit about writing secure code I’m sure this is something I could improve on, and I suspect I’m not alone – good code is secure, even if the consequences of being insecure don’t involve giving away people’s credit card numbers.
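As a minimal sketch of the defensive mindset, here is a reader for a made-up savegame layout (one length byte followed by that many ASCII bytes; this is an assumption for illustration, not the format of any real game). The length comes from the file, so it is treated as hostile until checked. In managed code an overrun would surface as an exception rather than a read off the end of the buffer, but validating up front is still the safer habit.

```csharp
using System.Text;

public static class SaveGame
{
    const int MaxNameLength = 32;   // hypothetical limit

    // Hypothetical layout: data[0] is the name length, followed by that
    // many ASCII bytes. Returns false instead of trusting a bad length.
    public static bool TryReadPlayerName( byte[] data, out string name )
    {
        name = null;
        if ( data == null || data.Length < 1 ) return false;

        int length = data[0];
        if ( length > MaxNameLength ) return false;      // implausible value
        if ( length > data.Length - 1 ) return false;    // would read off the end

        name = Encoding.ASCII.GetString( data, 1, length );
        return true;
    }
}
```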

…discoverable, by which I mean that when someone else comes across your code for the first time, they can grok it quickly and easily. This is something I’d not consciously considered until recently, which is odd given that so many of the problems I’ve been having with the current codebase have been a result of finding that I need to investigate some code, and immediately thinking “WTF is this doing!?”. Had the code been discoverable, the last few months of my career would have been considerably easier. Of course, again, if your code is readable, tested, well-documented and maintainable, it should probably be straightforward for your team-mates to pick it up. But it is always a good idea to think to yourself, “if a reasonably intelligent, reasonably experienced programmer (who was new to this project) had to debug this code, would it take them long to figure out what is going on?”

…simple. Sorry, I’ve had to go back and edit this post because somehow I managed to miss this the first time around! Good code is simple. Even complex good code is comprised of simple building blocks. Good code hides or cuts through the complexity in the problem, to provide a simple solution – the sign of a true coding genius is that he makes hard problems look easy, and solves them in such a way that anyone can understand how it was done (after the fact). Simplicity is not really a goal in its own right, though; it’s just that by means of being simple, code is more readable, discoverable, testable, and maintainable, as well as being more likely to be robust, secure and correct! So if you keep your code simple (as simple as possible, but no simpler), it is more likely to be good code – but that is by no means sufficient in and of itself.

Well, I reckon that’s probably enough to be going on with. Many of the above considerations are worthy of an article in their own right (not that I intend to write such articles any time soon, or indeed at all) but I think I have written enough for tonight! Have I missed anything? Have I got anything in the wrong order? (I guarantee that some people will argue performance considerations should be higher up the list, and in some scenarios they would be correct, but as a rule of thumb I think the above order is generally best). Is there anything I’ve listed that is not actually a requirement for “good code”? Does anything need further clarification? Let me know, in the comments.


November 12, 2008

Garbage, Part Two (Director’s Cut): oh, alright then

Filed under: Tools and Software Development — bittermanandy @ 12:27 am

You twisted my arm.

Let’s very quickly go over what you should know already (remember, this isn’t necessarily a beginner’s guide to garbage, rather an attempt to explain why what you know about garbage is like it is, and what it means for your code):

1. Value types (ints, floats, and other in-built types, as well as any user-defined struct type, eg. Microsoft.Xna.Framework.Vector3 or a struct you define yourself) are created on the stack, so you don’t need to worry about garbage.

2. Reference types (any class) are created on the heap, so should be used carefully as they will be garbage collected, which can be bad for the frame rate of your game if it happens at an inconvenient time. Part one explains why, in more detail than you really need to know.

As you may be able to guess, it isn’t quite that simple.

First, it’s not completely accurate to say that value types are always created on the stack. They will be, if they are created within a function, but not if they are themselves a member of a class, in which case they will exist (as part of the class object) on the heap. It’s a minor and fairly obvious point but it’s important to be correct.

Second, it’s not completely accurate to say that value types never generate garbage. They can, of course, contain reference members, in which case creating a new struct object can indirectly create a new class object, which generates garbage exactly as though you’d created the class object yourself. It’s fair to say that this is a slightly unusual thing to do (I’m not sure I can think of an example?) but it’s entirely legal and something to watch out for.
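Here is one possible example, sketched with a made-up PatrolRoute type: a struct whose only field is an array. The struct is a value type, but constructing it allocates the array on the heap.

```csharp
// Hypothetical type, for illustration only.
public struct PatrolRoute
{
    // The struct itself is a value type, but this field refers to an
    // array, and arrays always live on the heap.
    public int[] Waypoints;

    public PatrolRoute( int count )
    {
        Waypoints = new int[count];   // heap allocation: future garbage
    }
}
```

Note also that copying a PatrolRoute copies only the reference, so both copies share the same array.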

Third, remember how I said value types are created on the stack unless they’re members of a class, and how that was a minor point? Actually it’s not that minor. Value types contained within reference types are allocated on the heap. For example, arrays always catch out new C#/XNA programmers:

int a = 0;               // System.Int32 is a value type. Allocated on the stack

int[] b = new int[5];    // Array is reference type! Allocated on the heap (garbage!)

Slightly confusingly, passing value types by reference (using ‘ref’ or ‘out’) does not “turn them into” reference types. It simply means that the object’s value can (or must, with ‘out’) be changed within the function and those changes are reflected in the original object that exists outside the function. It also means that no temporary object is created to act as the function parameter. For example, v1 below is a copy of another Vector3; creating a new object and copying onto it can be marginally slower than just passing by reference, as with v2, and the difference becomes more pronounced the larger the object. (Remember a reference is four bytes in size in 32-bit code like XNA, so any struct four bytes or smaller won’t benefit from passing by reference at all). For this reason, functions like Vector3.CatmullRom, that take many struct parameters, are provided in two flavours: one, which is more convenient to use, that takes (copies) the inputs by value and returns the result; and another, which is more performant, that takes the inputs by reference and the result is an out parameter. It’s a pattern you may want to use in your own code though 90% of the time you’ll just call the more convenient version.

void F( ref int a, out int b, Vector3 v1, ref Vector3 v2 )
{
    a = 1;        // No garbage here
    b = 2;        // Or here either
    m_v1 = v1;    // Or here… but v1 is a temporary
    m_v2 = v2;    // Or here… but v2 is passed by reference
}

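Of course, an object of reference type is never copied when you pass it to a function: only the reference is copied, so the function can always change the object it refers to (that’s one thing I miss about C++: const!). Using the ‘ref’ keyword with a reference type is therefore pointless, though legal, unless the function needs to make the caller’s variable refer to a different object. (FxCop rightly points out the mistake, though).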
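Going back to the "two flavours" pattern for struct parameters, here is a minimal sketch using a stand-in Vec3 struct of my own (so the example does not depend on XNA). The convenient overload simply forwards to the by-reference one.

```csharp
// Stand-in for a type like Vector3; hypothetical, for illustration only.
public struct Vec3
{
    public float X, Y, Z;

    public Vec3( float x, float y, float z )
    {
        X = x; Y = y; Z = z;
    }

    // Convenient flavour: copies both inputs and returns the result.
    public static Vec3 Add( Vec3 a, Vec3 b )
    {
        Vec3 result;
        Add( ref a, ref b, out result );
        return result;
    }

    // Performant flavour: inputs are not copied, and the result is
    // written straight into the caller's variable.
    public static void Add( ref Vec3 a, ref Vec3 b, out Vec3 result )
    {
        result = new Vec3( a.X + b.X, a.Y + b.Y, a.Z + b.Z );
    }
}
```

Neither flavour generates garbage; the difference is only in how much copying is done.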

There’s another difference between value and reference types – more specifically, between structs and classes – than simply where they live. Reference types can inherit from other reference types (and you therefore get inheritance, polymorphism, and all the other clever object-oriented stuff) while value types cannot inherit at all. And yet – you are probably aware that any and every type in C# is considered to ultimately derive from System.Object. You will undoubtedly have seen functions like this (particularly in .NET 1.x, before generics came along):

void G( object obj )
{
    // …
}

If obj can be anything, including a value type, like an int, a float, or a Vector3 – but value types can’t derive from anything – how come you’re allowed to call G( 3 )? Well, when such a call is made, a temporary object which does derive from System.Object and is known as a “box” is created on the heap, and a copy of the value type object placed inside it. Putting the value type in the box is called “boxing”, and taking it back out (via a cast, or the ‘as’ operator with a nullable type) is known as “unboxing”. Critically, the temporary box object is garbage. This means that using value types in functions or containers that rely on type System.Object generates garbage. As a result, boxing is probably the second most common cause of unexpected garbage. To avoid it, avoid using or writing functions or classes that use System.Object – prefer generics instead – though you can still get caught out if you’re not careful, as my previous post on Reflector showed.
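A small sketch of the difference (class and method names are mine). The object-based function forces a box for every int passed in; the generic one does not box its parameter. The last method shows that each boxing operation creates a brand new heap object.

```csharp
public static class BoxingDemo
{
    // Takes System.Object, so passing an int boxes it (heap allocation).
    public static string DescribeBoxed( object value )
    {
        return value.ToString();
    }

    // Generic version: the parameter is passed as its own type,
    // so no box is created for it.
    public static string Describe<T>( T value )
    {
        return value.ToString();
    }

    // Boxing the same int twice produces two distinct objects.
    public static bool EachBoxIsANewObject()
    {
        int n = 3;
        object first = n;         // box
        object second = n;        // another box
        int back = (int)first;    // unbox via a cast
        return back == 3 && !object.ReferenceEquals( first, second );
    }
}
```

Both Describe functions still allocate the string they return, of course; the saving is only the box.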

So, usually, you will want to use reference types (classes) for your data that stays in memory for a significant amount of time. You’ll want to be careful with when you allocate it (remember, a garbage collection can only ever happen on an allocation). Value types (structs) are most useful for small, lightweight data. Think about Vector3 – it only contains three floats and has a handful of methods. The XNA team could have written an abstract base class Vector and specialised it with Vector2, Vector3, Vector4, and VectorN, but what a piece of over-engineering that would have been! More to the point, using vectors (very common objects in 3D games) would have generated masses of garbage. As a struct instead of a class, Vector3 is much more elegant – you can write things like SetPosition( new Vector3( 1.0f, 2.0f, 3.0f ) ), a thousand times a frame if you like, and know that there’s no chance of garbage. On the other hand, your main character object is bound to be of class type. It’s likely to derive from things (interfaces if not classes) and needs to be kept in memory, which means on the heap.

This also implies that if you have an object of class type, you should keep it around if you’re likely to be able to reuse it. For example, this kind of trick is useful all over the place:

void F1()
{
    MyClass myObject = new MyClass( 5 );    // Bad! Generates garbage every call!
}

static MyClass myStaticObject = new MyClass( 5 );

void F2()
{
    myStaticObject.DoSomething();            // Good! Does not generate garbage!
}

Just a couple more things to think about, though I’m a bit short on space. Some types can be considered as “atomic”. Basically what that means is, that if they change, they become a different object. So a Vector3 is non-atomic – you can change v.X and you still have the same Vector3, just in a slightly different place. But if you have a user defined type PlayerDetails, and you change any of the fields, you’ve got a completely different “thing” to deal with:

PlayerDetails player = new PlayerDetails( "Andy", "Patrick" );

player.FirstName = "Fred";    // Look out! Now "Fred Patrick", that’s not right!

player.LastName = "Bloggs";

The player details for me are fundamentally a different “thing” to the player details for Fred Bloggs. (Language fails me slightly, here, and I’m also not sure I’ve picked a great example). Furthermore, if anything unexpected happens in the middle – the object gets viewed on another thread, or setting LastName throws an exception – the object can be seen in an invalid state. To prevent this, such atomic types should be immutable (I’m just not going to stop linking to those books!), which means you can’t change any single field, and once an object has been created, it never changes. This is really good software engineering (it allows you to write more correct and more secure code) but can lead to surprising results:

string message = "Player ";

message += playerNum.ToString();

message += " wins by ";

message += points.ToString();

message += " points!";

That short code snippet generates not one, but seven objects: the final message object contains “Player 1 wins by 6 points!” for example, while the strings “Player “, “1”, “Player 1”, “Player 1 wins by “, “6”, and “Player 1 wins by 6” are all garbage! System.String is an immutable, atomic, reference type. This was absolutely the right choice by the .NET architects (strings in .NET are wondrous things of great beauty compared to strings in C++) but leads careless game programmers to watch helplessly as their frame rate plummets. Code like the above is probably the number one cause of unwanted garbage – watch out for it. If you must build strings piece by piece, use a StringBuilder object. (And, if you find yourself creating a type that is atomic and immutable, consider creating a non-atomic mutable MyTypeBuilder class to work with it).
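For comparison, a sketch of the same message built with a StringBuilder. The class name and the decision to keep one builder in a static field are my own assumptions; the idea is that the intermediate "Player 1", "Player 1 wins by " and so on are never created as separate strings. Depending on the runtime, Append( int ) may still create a small temporary internally, so treat this as a reduction in garbage rather than a guarantee of none.

```csharp
using System.Text;

public static class Messages
{
    // Reused between calls so the builder is allocated only once.
    // Not thread-safe: a sketch for a single-threaded game loop.
    static readonly StringBuilder s_builder = new StringBuilder( 64 );

    public static string BuildWinMessage( int playerNum, int points )
    {
        s_builder.Length = 0;             // clear previous contents
        s_builder.Append( "Player " );
        s_builder.Append( playerNum );    // typed overload: no boxing
        s_builder.Append( " wins by " );
        s_builder.Append( points );
        s_builder.Append( " points!" );
        return s_builder.ToString();      // the one string we actually want
    }
}
```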

I know what you’re thinking. “It would generate less garbage if you didn’t call playerNum.ToString() and points.ToString(), and just passed in playerNum and points!” Nice try, but no. String concatenation with += compiles to a call to System.String.Concat, which works with System.Object parameters and calls System.Object.ToString() on each. That means, if you passed playerNum into it, it would box playerNum and still generate the extra string – boxing and strings, the two worst garbage generators happening at once – you’ve made your garbage problem worse, not better!

There’s a lot to understand about garbage but I hope the last couple of articles have opened your eyes somewhat. I’ve approached it in a slightly unorthodox way – most writers will start with an article like this one, explaining the dos and don’ts, and probably never even cover the topics in the previous article which showed the whys and wherefores. Personally, I like to know what’s going on “under the hood”. You can memorise the rules relating to garbage and write good code, but until you understand the reasons behind the rules you might not write brilliant code. There is one important thing related to garbage I’ve not covered – that’s the Dispose pattern. This book (there it is again!) explains what it is and how to implement it, much better than I can; the only thing it doesn’t really cover is the detail of how objects waiting for disposal relates back to the mechanics of garbage collection I covered in Part One, so I might go over that at some point but I’m not promising anything.

I hope this appeases those of you who wanted a follow-up to Part One; I misjudged what people wanted and apologise for that. There is one way to make sure it doesn’t happen again – your feedback is always welcome, so let me know what’s useful, if anything wasn’t clear, and what you’d like me to write about in future (though I reserve the right to choose not to :-).

“There’s four and twenty million doors, on life’s endless corridor” – Oasis

November 11, 2008

Garbage, Part Two: anticlimax

Filed under: Tools and Software Development — bittermanandy @ 12:12 am

Oooooookaaaaaaaaaay, it seems that Part One wasn’t very interesting as nobody commented on it and not many people even read it. I personally think that the way C# manages garbage is fascinating and immensely clever. Perhaps I’m the only one!

So I was going to go into a similar level of detail for much more garbage-related stuff, and spread it out across three or four posts; but that seems a bit pointless given the previous response, a lot of effort for not much gain. I could instead write about it at a very high level, but you can find better resources elsewhere. Sorry… bit of a let-down this post. I’ll try to make up for it with the next one.

I’m planning to get Pandemonium (such as it is so far) running on XNA 3.0 over the next few days; hopefully I’ll be able to write a bit about that. I’m looking forward to getting some actual gameplay up and running, not just the basic systems, which I intend to show as I go. I have a few other topic ideas too. Of course, if there’s anything you’d like me to cover, say the word. Check out the kind of things I’ve written about so far to see the kind of level I’m writing at (and for).

Back to normal good service soon…

October 19, 2008

Garbage, Part One: Stack and Heap

Filed under: Tools and Software Development — bittermanandy @ 1:06 pm

I’d like to get back to writing about things that will, hopefully, improve your understanding of XNA and C#, so I thought I’d discuss something fundamental to the performance and resource usage of your application: garbage. This is a topic which many C# programmers can probably get away without fully understanding – and many probably do – but to be able to write efficient and performant code, you need to know what’s going on under the hood.

Those of you who really want the nitty gritty can probably skip these articles and read this one, this one, and (for the Compact Framework, used on Xbox 360) this one, but they can be pretty heavy going. I aim to provide a more abstract explanation, and will always try to emphasise the implications for your game code.

We should probably start at the beginning. Computer programs are all about manipulating data, and data is stored in memory. A huge variety of data structures have been developed to control that memory, but at the highest level (and with particular importance for C#) some memory is set aside for the stack, and the rest is used by the heap. (There is also constant memory and static memory, where your constants and statics are stored).

The stack is strictly limited in size, and it’s the area of memory set aside for your code to run in. Every time you call a function, the stack grows and information necessary for the running of the program is stored in it, including the return address (so the function knows where to go back to when it completes), function parameters, and local variables. It is possible to write software that only uses the stack, and in fact safety-critical software usually does exactly that – the last thing you need in your nuclear reactor control code is a memory allocation failure. On the other hand, the fact that the stack is of limited size makes some algorithms a bad idea in software – in particular, recursion (where a function repeatedly calls itself) can make the stack grow very rapidly, potentially until it runs out of memory. It is usually therefore preferred to use iteration (a single function containing a loop) rather than recursion.
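As a simple sketch of the difference, here are two ways of counting the nodes in a linked list (the Node type is a throwaway of my own). The recursive version uses one stack frame per node, so a long enough list can exhaust the stack; the iterative version uses a fixed amount of stack however long the list is.

```csharp
// Minimal singly-linked node, for illustration only.
public sealed class Node
{
    public Node Next;
}

public static class ListLength
{
    // One stack frame per node: stack usage grows with the list.
    public static int CountRecursive( Node head )
    {
        return head == null ? 0 : 1 + CountRecursive( head.Next );
    }

    // A single frame containing a loop: constant stack usage.
    public static int CountIterative( Node head )
    {
        int count = 0;
        for ( Node current = head; current != null; current = current.Next )
        {
            ++count;
        }
        return count;
    }
}
```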

The heap is limited in size only by the total amount of memory available to the process, which may be close to the total amount of physical memory in the machine. Where data on the stack is temporary in nature, data on the heap can be more permanent. The flipside of the extra longevity is that the rules for controlling data in heap memory are more complex. While stack memory can be reclaimed as soon as the function using that part of the stack exits, heap memory must be allocated and deallocated at less predictable points in the code. In C++, for example, memory for a data object is explicitly requested using operator new, and released using operator delete; it is all too easy to forget to delete something, leading to a memory leak, or attempt to delete it twice, leading to (often) a crash. In addition, allocating and deallocating memory can be slow.

C# and other managed languages attempt to solve these problems by using a garbage collector to control the lifetime of data on the heap. This strategy makes it somewhat easier to write correct programs that do not leak and removes some classes of bugs entirely, but changes how code needs to be written to achieve good performance. In particular, allocating memory in a managed environment is much quicker – but while deallocating it is done automatically and is more likely to be done correctly, it can cause severe performance problems if you’re not careful. Hence why garbage collection is mentioned so often in discussions about C# in general, and XNA in particular (games being more performance-sensitive than a typical software application).

Basically (the articles above go into excruciating detail) data on the heap is stored in a single block. Every time an allocation is requested, the garbage collector expands the size of the block by the necessary amount and returns a reference to that data. Note that this is very fast – it’s literally just incrementing a number (the size of the heap), and that objects allocated at the same time exist next to one another in memory (which can bring performance benefits due to data locality). Also note that, by the nature of software, the reference that is returned will be stored somewhere – either on the stack (in a local variable perhaps) or somewhere else on the heap (as a member variable of an object that exists on the heap).

You can see from the diagram how this works. The stack will grow and shrink in size as functions are called and exited, while the heap will continually grow in size. References, on the stack or the heap, point at data on the heap (you never get a reference to data on the stack – there’s more to say about that later). If you look closely you’ll notice that if you start with a reference on the stack (a “root”), follow it to the object it references, and then follow each reference in that object to the others in turn, you can build a list of which objects are still (directly or indirectly) referenced by a root, and which are not.

(Note that in the real world, static data is also a root, and heap objects can themselves act as roots under certain circumstances as well; but we’ll keep things simple in this discussion). 

For example, you can see that stack object S2 references heap object H4, which in turn references H2 and H5. However, although H3 references H7, neither H3 nor H7 themselves are referenced by anything on the stack (the function that allocated them must have exited). They are therefore said to be “unrooted” and have been coloured slightly differently to show that. Think about what this means. The stack represents our running program. So if a heap object is unrooted, it must mean that nothing in our running program has any way of affecting, or being affected by, that heap object. It’s no use to us any more.
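The walk from roots to rooted objects can be sketched in a few lines of C#. This is a toy model under stated assumptions, not how the CLR actually does it: HeapObject, its References list and Reachability.FindRooted are hypothetical names invented for illustration, and a "root" here simply means a heap object that something on the stack points at.

```csharp
using System.Collections.Generic;

// Toy model of a heap object: a name plus the references it holds.
// (Hypothetical type for illustration; the real CLR works on raw memory.)
public sealed class HeapObject
{
    public readonly string Name;
    public readonly List<HeapObject> References = new List<HeapObject>();

    public HeapObject(string name) { Name = name; }
}

public static class Reachability
{
    // Returns every object reachable, directly or indirectly, from the roots.
    // Any heap object that is not in the result is "unrooted".
    public static HashSet<HeapObject> FindRooted(IEnumerable<HeapObject> roots)
    {
        HashSet<HeapObject> rooted = new HashSet<HeapObject>();
        Stack<HeapObject> pending = new Stack<HeapObject>(roots);

        while (pending.Count > 0)
        {
            HeapObject current = pending.Pop();

            // Add returns false if this object has already been visited,
            // so each object is processed at most once and cycles are harmless.
            if (!rooted.Add(current))
                continue;

            foreach (HeapObject child in current.References)
                pending.Push(child);
        }

        return rooted;
    }
}
```

Fed the objects from the example (H4 as the root, referencing H2 and H5, with H3 referencing H7 off to one side), the result contains H4, H2 and H5 but neither H3 nor H7.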

I mentioned that the heap will continually grow in size. This is clearly something that cannot be allowed to continue forever! At some point we must reclaim some memory. That point will come when we request another memory allocation, and the garbage collector decides that it can’t allocate any more without making room. It might do this when it has literally run out of spare memory, but will usually do it earlier, after a certain amount of memory has been allocated. (This threshold differs between Windows and Xbox).

This point is important because it has several implications. Firstly, garbage collections cannot happen at any arbitrary moment. They only occur when you request more memory – if you don’t create any more heap objects using new, you will not get a garbage collection happening. This may mean garbage collections are unpredictable, but they are fully deterministic. Secondly, garbage collections happen on the thread that is allocating the memory, inside the new operator, not on a separate “garbage collecting” thread. This means that new is almost always very fast, until a garbage collection is necessary at which point it is very slow (then goes back to being very fast again).
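One way to watch this from inside your own code is to compare collection counts before and after a piece of work. The sketch below assumes the desktop CLR, where GC.CollectionCount is available; GcProbe and CollectionsDuring are hypothetical names made up for this example.

```csharp
using System;

public static class GcProbe
{
    // Runs the given action and returns how many generation 0 collections
    // were counted while it ran. Collections of higher generations also
    // collect generation 0, so they are included in this number.
    public static int CollectionsDuring(Action action)
    {
        int before = GC.CollectionCount(0);
        action();
        return GC.CollectionCount(0) - before;
    }
}
```

Wrapping a single Update or Draw call in CollectionsDuring is a cheap way to confirm a suspicion before reaching for a profiler. Note that collections triggered by allocations on other threads during the call would be counted too.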

When the garbage collection occurs, it builds a list of rooted and unrooted objects, very much as I described above – starting with roots on the stack, and following references through objects on the heap. It won’t visit any object more than once, so is efficient and can cope very happily with circular references (which, for example, smart pointers in C++ often do not). On Windows, it uses a technique called “generational garbage collection” to speed things up still further. This is incredibly clever, but not available on Xbox so I won’t go into detail here; again see the links above if you’re interested. Once it has worked out which objects are still rooted, it shuffles them together in memory, copying over the unrooted ones. This frees up more space so program execution can continue. Here’s what it will look like after the garbage collector runs on the memory in the diagram above.

Again there are some important consequences of this. We’ve already seen that allocating a new object is (almost always) very fast. However, we can begin to understand that having lots of references in our data means the garbage collector has to follow more links when working out what is rooted and unrooted, which is a small performance cost. It’s also more likely we’ll forget to set a reference to null, so objects may still be rooted when we actually don’t care about them any more. (Note though that a C++-style “true” memory leak is impossible). On the plus side, we never suffer from fragmentation, and we can see that objects allocated together (for good memory locality) will stay together due to the way they are shuffled. Less happily, objects that aren’t actually needed any more could stay in memory for a long time, until the next garbage collection, so our software is more memory hungry than may strictly be required.

Most of all, though, we can begin to understand why a garbage collection is slow. When the collector decides that a collection is necessary (when new is called and certain conditions have been met), it must: (1) Wait for other threads to get to a safe state and halt them, so they don’t interfere. (2) Iterate through all roots, following all references, building a list of rooted and unrooted objects. (3) Shuffle the rooted objects together on the heap, writing over unrooted objects. (4) Iterate through all references in all objects on the stack and heap, updating the pointers they contain to the new locations in memory. (In fact there are a few more complications relating to finalisers and pinning that I won’t cover here). Then, and only then, can it allocate the new object and return it to your code. That’s a lot of work – and that is why it’s so easy for garbage collections to cause your game to run unplayably slowly.
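Steps (3) and (4) can be illustrated with a toy compactor. Again this is only a sketch under simplifying assumptions: Block and Compactor are hypothetical, the heap is modelled as a list of blocks in address order, and "updating the pointers" is represented by rewriting each surviving block's Address field in place.

```csharp
using System.Collections.Generic;

// Toy description of one allocation on the heap (hypothetical type).
public sealed class Block
{
    public string Name;
    public int Address;   // offset from the start of the heap
    public int Size;      // size in bytes
    public bool Rooted;   // result of the marking pass
}

public static class Compactor
{
    // Slides rooted blocks down over unrooted ones, preserving their order.
    // The heap list must be in address order. Returns the new heap size,
    // which is where the next allocation would be placed.
    public static int Compact(List<Block> heap)
    {
        int nextFree = 0;
        List<Block> survivors = new List<Block>();

        foreach (Block block in heap)
        {
            if (!block.Rooted)
                continue;             // its space is simply overwritten

            block.Address = nextFree; // the "pointer update" in this model
            nextFree += block.Size;
            survivors.Add(block);
        }

        heap.Clear();
        heap.AddRange(survivors);
        return nextFree;
    }
}
```

Even in this tiny model the cost is visible: every surviving block is touched, so the work grows with the amount of live data rather than with the amount of garbage.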

I hope you will now have a better grasp of exactly why the garbage collector exists, what it does during a garbage collection, and why it does it when it does it. If you’re coming from a C++ background, you should probably feel relieved that you don’t have to worry about having to delete data yourself, though it would be only human to be a bit concerned about the fact you have so little control over when the garbage collection runs.

You do have some control, though. As I mentioned above – if you don’t call new to create an object on the heap, you won’t suffer a garbage collection. That’s probably the most important lesson to get from all this. Create new heap objects as infrequently as possible, and you will see great benefits in the performance of your game. One common and very sensible approach is to create all your heap objects at level start, keep them in memory throughout the level gameplay, then forcibly induce a garbage collection during the next level load, when no-one cares about performance. You may be able to think of other approaches for non-level-based games. But we’re getting ahead of ourselves – how do you control what is created on the stack, and what is created on the heap? I’ll talk about that in part two.
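As a rough illustration of the "create everything at level start" approach, here is a minimal object pool. Pool<T> and Bullet are hypothetical names for this sketch, and the choice to return null when the pool is empty (rather than allocating) is an assumption made so that gameplay code can never trigger an allocation through the pool.

```csharp
using System.Collections.Generic;

// Hypothetical game object used to demonstrate the pool.
public sealed class Bullet
{
    public float X;
    public float Y;
}

// Allocates a fixed number of objects up front and recycles them.
public sealed class Pool<T> where T : class, new()
{
    private readonly Stack<T> m_free;

    public Pool(int capacity)
    {
        // All allocation happens here, e.g. during level load.
        m_free = new Stack<T>(capacity);
        for (int i = 0; i < capacity; ++i)
            m_free.Push(new T());
    }

    public int FreeCount
    {
        get { return m_free.Count; }
    }

    // Hands out a pooled object, or null if none are left.
    public T Acquire()
    {
        return m_free.Count > 0 ? m_free.Pop() : null;
    }

    // Returns an object obtained from Acquire so it can be reused.
    public void Release(T item)
    {
        m_free.Push(item);
    }
}
```

During gameplay you would call Acquire and Release instead of new; at the next level load, calling GC.Collect() forces the collection at a moment when a pause does not matter.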

“640KB of memory is more than anyone will ever need.” – Bill Gates (attributed – but not true!)

PS. If you think garbage may be causing you problems, the CLRProfiler is your friend.

September 2, 2008

Tools of the Trade – Part Five: Pot Pourri

Filed under: Games Development,Tools and Software Development,XNA — bittermanandy @ 11:02 pm

I think that FxCop, Reflector, CLRProfiler, and PIX – all of which I’ve previously discussed in some detail – are the most useful, bread-and-butter tools you’ll use in your XNA development. (In fact, all but perhaps PIX are useful regardless of whether you’re writing games or applications). There are of course many, many more tools, each with their uses, and I’d like to summarise some of them here. The sign of a good craftsman is using the right tool for the job, so I’d encourage you to explore all the different options available to you. Put bluntly, if you’re spending all your time performing repetitive tasks, or going through endless tweak-test-tweak-test cycles to try and hunt down problems in your code, you’re wasting your time. Your time may not be worth a lot to you, but mine is worth a lot to me, so I for one am always looking out for new tools and I think you should too.

Your primary development tool is Visual Studio itself (the Express version of which is available as a free download – possibly the most amazing free thing ever). I’m going to assume you already have it or you’d not be coding in XNA! However, I can guarantee that you are not using it to its full potential. I know I’m not. The reason I can make this guarantee is that the full potential of Visual Studio is huuuuuuuuuge. Almost every day, certainly every week, I discover new things it can do and think “that’s amazing!” Start by keeping up with the Visual Studio Tip of the Day, learn how to write macros, take some time to explore for yourself (especially consider keyboard shortcuts), and look out for plug-ins and extensions too. Project Line Counter is a personal favourite.

We’ve already looked at a couple of profiling tools but Perfmon (free with Windows) is the daddy of them all. As an example, hit Start then Run, and type “perfmon”. When the tool loads, select “Performance Monitor”, right-click on the counter list at the bottom and select “Add Counters…”. Select, say, the “.NET CLR Memory” category, then, for example, “% Time in GC”. Choose your game as the selected object and click “Add >>” then “OK”. Hey presto! A line displaying the exact processor cost you are paying for garbage collection will be added to the graph. There are hundreds of counters like this one, and much more that Perfmon can do.

I’m a bit loath to mention this next one because, frankly, I hate it, but there are some bugs that can only be identified using WinDbg (or “windbag”, free with the Debugging Tools for Windows). Running out of memory and not sure why? Take a memory dump of your game, load sos.dll, call !DumpHeap -stat to see what’s live on the heap, call !DumpHeap -type <type> on the most memory-expensive type it lists to see all the items of that type, and call !GCRoot with the address of one of those objects to see exactly what is keeping it in memory and why. Sometimes there’s just no other way to work out what’s happening to your memory. WinDbg is an advanced tool and it’s an absolute swine to work with, but if the debugger in Visual Studio can’t solve your problem, WinDbg will.

I previously wrote about Reflector and described how it can reveal any assembly’s code to you. How does the code you write get translated from C# to the CLR and IL, and finally JIT-compiled into machine code? Well, I can’t help you with the JITter but IldAsm (free with Visual Studio) can provide a fascinating insight into the Intermediate Language stage of your code’s existence. Much in the same way that you don’t need to understand assembly language to write or use C++, but knowledge of assembly can help you fine-tune your C++ and fix the really tricky problems, knowledge of IL and an understanding of the translation process – while not essential – will make you a better C# programmer.

There’s a whole bunch of tools that are a bit more specialised or esoteric:

– Perforce is absolutely the best choice for source control. I can’t live without Perforce now, it’s as though it has become a part of me. It’s free for up to two users, though very expensive for larger teams than that (absolutely worth it if you’re a professional company, perhaps less so if you’re a group of hobbyists, in which case try Subversion).

– If you’re a pro developer and aren’t using continuous integration, you face months of torment in an endless death-march of crunch at the end of the project. Do yourself a favour and use CruiseControl .NET (free).

– Continuous integration becomes even more useful when a build is run against a set of unit tests, and in fact they’re useful for finding mistakes early which is good for anyone, pro or hobbyist alike. I’ve heard good things about NUnit (free)… do as I say, not as I do, and use it… not doing unit testing is my worst programming habit that one day I will get out of. Don’t fall into that trap.

– Perfmon can tell you when you’re slow on the CPU and CLRProfiler can tell you if it’s garbage at fault, but if not and you want to know which specific functions are slow (and you very often do!) NProf is the tool for you, and it’s free.

– Finally, I’ve not used it yet but RPM (Remote Performance Monitor for Xbox, free with XNA) looks to be pretty damn useful for working out why you’re running fine on PC but slow on 360.

The best thing about all of these tools? Like XNA itself, they’re all free. That’s the kind of money I’m OK with spending! It means you have no excuse for not becoming familiar with them and, hopefully, rather than staying up bleary-eyed until 5am trying to find the bug in your code, you can fire up the appropriate tool, find and fix the bug and be home in time to see your family and get a good night’s sleep. Everyone’s a winner!

There’s only one major category of tool I’m missing, and that’s a decent bug database. I’ve tried Bugzilla and OnTime, and at work we have to use Sunrise, and I hate all of them as well as some others. By far the best defect tracking system I used was Product Studio, when I was at Microsoft, but despite being brilliant it is only available with the Team System version of Visual Studio which is very expensive. If anyone can recommend a good, usable, simple bug database that is not web-based and has a good UI, please let me know.

In fact, undoubtedly many of you out there will have your own favourite tools. Why not share the love, leave a comment and let me and everyone else know which tools make your life easier?

“If the foreman knows and deploys his men well the finished work will be good.” – Miyamoto Musashi

September 1, 2008

Tools of the Trade – Part Four: PIX

Filed under: Games Development,Tools and Software Development,XNA — bittermanandy @ 9:44 pm

There is one more tool that I want to cover in a little bit more detail before presenting a round-up of the best of the rest (there really are so many good ones out there that this mini-series could last for months or years if I wrote a post for each one). The last article presented the CLRProfiler, a tool to help you manage your garbage and ensure it is being collected properly. Careless garbage collection is often the cause of poor performance on the CPU – but the CPU is only half the story, and to find out what’s causing poor performance on the GPU, you will need to use PIX (available free as part of the DirectX SDK).

Those of you who downloaded and used my Kensei Dev library might have noticed this comment in the Dev Shapes code:

// TODO I have noted some performance issues with this code when drawing very large
// numbers of shapes, but have not had time to profile it and fix it up yet, sorry!

Recently I had a bit of spare time so decided to go back and revisit this section. I set up a very simple test within Pandemonium, to draw lots and lots of spheres at random positions. I discovered that drawing 1000 spheres, or about 860,000 triangles (which doesn’t seem that many to me), caused the frame rate to plummet to only about 6Hz:

Lots of spheres!


Using tricks that I’ve covered and linked to previously it didn’t take long to determine that the GPU was the bottleneck. (For example, returning early out of the game’s Update method, therefore dropping CPU usage as close to zero as possible, had zero effect on the frame rate). So my next port of call was PIX itself.

PIX (originally an acronym for Performance Investigator for Xbox) is an immensely powerful tool and we’re only going to scratch the surface of it here. At the most basic level, you can think of it as a recorder for absolutely everything that happens on the GPU. You can see exactly when every single function that used the GPU was called, and exactly how long it took. You can even rebuild a frame of your game method call by method call, seeing the results rendered step by step, instead of within a sixtieth of a second.

In this case, I want to see which functions are taking so long within a frame. I therefore chose to sample a single frame, as all frames are likely to be pretty much equal in this case. (If, for example, I was seeing a generally solid frame rate with occasional stutters, I’d have had to choose a different option).



After starting the experiment and getting to a point where the frame rate was low, I hit F12 to capture a frame (this can take a second or two). After I’d shut down my game, PIX generated a report:

A PIX report


There’s quite a lot going on in this image so let’s take a look at each section in turn.

The top window shows a graphical timeline. It’s not obvious from this picture, but you’ll see it very clearly when you run PIX for yourself, that the bars on the timeline indicate time when the GPU and CPU are busy doing things. As you click along the timeline, the arrows indicate where the GPU and CPU synchronise to the same call. With some classes of performance problem, you’ll see big gaps in one or other processor, and these indicate whether you are CPU or GPU bound. For example, if you are GPU bound, you’ll see gaps in the timeline for the CPU where it was waiting for the GPU to catch up. The red circle in the top right of the picture shows the range of calls within our sampled frame (which occurred about 48 seconds in) – it looks mostly empty in the screenshot, but zooming reveals more details.

The middle window shows the DirectX resources in use (remember, XNA is just a layer on top of DirectX) including pixel and vertex shaders, vertex buffers, surfaces and such like. Not of much interest to us at this point.

In the bottom right I’ve selected the Render window. This shows us a preview of the frame as it was constructed. As you advance the cursor along the timeline, this preview is updated – initially getting cleared to Cornflower Blue, then having more and more things drawn onto it. This can be invaluable for detecting overdraw, and is really interesting in its own right. One of my favourite features is the ability to “Debug This Pixel”, which shows every call that affected the colour of any given pixel in the frame. This kind of thing is very useful when investigating transparencies, occluders, quadtrees etc.

Finally, in the bottom left is a list of GPU events, in sequence. Here you can see every call made to the GPU during the sample (note how they are all Direct3D calls, as mentioned above). Using the timeline view, I was able to visually identify which function call was the most expensive. Clicking on that call in the timeline synchronised it in the events window. I’ve circled the call in question. You can see from the StartTime of each event that the call to IDirect3DDevice9::DrawPrimitiveUP took 107349677 nanoseconds, or 107 milliseconds. When you consider that a whole frame normally completes in just 17 or 33 milliseconds, this one function call taking 107ms is a massive limiting factor on my frame rate.
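The arithmetic behind those numbers is worth making explicit. The helper below is a hypothetical sketch (FrameBudget is not part of XNA or PIX) that converts between frame rates and per-frame time budgets, and turns PIX’s nanosecond timings into milliseconds.

```csharp
public static class FrameBudget
{
    // Time available per frame at a given frame rate, in milliseconds.
    public static double MillisecondsPerFrame(double framesPerSecond)
    {
        return 1000.0 / framesPerSecond;
    }

    // The best frame rate achievable if each frame takes this long.
    public static double MaxFramesPerSecond(double millisecondsPerFrame)
    {
        return 1000.0 / millisecondsPerFrame;
    }

    // Converts a duration in nanoseconds to milliseconds.
    public static double NanosecondsToMilliseconds(long nanoseconds)
    {
        return nanoseconds / 1000000.0;
    }
}
```

At 60Hz the budget is about 16.7ms and at 30Hz about 33.3ms, which is where the “17 or 33 milliseconds” comes from. A single call costing 107349677ns (roughly 107.3ms) caps the frame rate at a little over 9Hz on its own, before anything else in the frame is counted.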

Using a combination of intuition, logic, common sense, and the Render window (clicking on the call previous to DrawPrimitiveUP removed all the spheres from the preview, so it’s obvious what it was drawing!) I identified the corresponding code in my XNA program:


        PrimitiveType.TriangleList, s_triangle3DVerticesArray,
        0, s_triangle3DVertices.Count / 3 );

You may not think this tells me very much. I already knew that the Kensei.Dev rendering code was slow, that’s why I fired up PIX in the first place! In fact, this is hugely valuable information. I know exactly which line of code is causing my GPU to run like a dog with no legs.

As this is a call to DrawUserPrimitives, it seems likely that the reason for this call being so slow lies in the User part of the method name. That is to say, the Kensei.Dev code builds up an array of vertices (s_triangle3DVerticesArray) each frame, and passes that into the function. This involves copying all those 860,000 triangles from main memory into the GPU memory, and is in contrast to using a vertex buffer, which lives on the GPU. If I can find a way to use native GPU resources and avoid the User methods, I may get a substantial speed boost; on the other hand, the User methods exist for the very usage scenario I’m using here, which is of vertices that can arbitrarily change position from frame to frame and which are controlled by the CPU.

Alternatively, it was suggested on the XNA Creators forums that I may be expecting the GPU to do too much in one go, and that splitting up the calls into smaller batches may improve performance. This is somewhat contrary to my understanding of modern GPUs, which, I was led to believe, vastly prefer to perform fewer operations on larger datasets than more operations on smaller datasets; nevertheless I am far from a GPU expert so will be taking that advice, and experimenting with splitting the vertex array/buffer into smaller pieces to see if this improves matters.

There are a few more possibilities as well. I’d like to say this story has a happy ending, but it doesn’t, at least not yet – I am hopeful for the future. I am still trying to solve this problem and find how to avoid this bottleneck. However, whenever investigating performance it is absolutely essential to base your observations and lines of inquiry on hard evidence. At the beginning of this article, I knew that “something in Kensei.Dev is slow”. PIX has since revealed that “DrawUserPrimitives is taking over 100ms to draw 860,000 triangles”. This will allow me to precisely focus my efforts, and hopefully find a correct, performant fix for the problem.

PIX has an awful lot more to offer than just single-frame samples and as your game nears completion you will probably find a lot of value in it. There are a lot of bugs that simply can’t be solved any other way, and if you are doing anything remotely clever with your graphics I strongly encourage you to learn about what PIX can do for you.

“My mistress’ eyes are nothing like the sun…” – William Shakespeare

August 20, 2008

Tools of the Trade: Part Three – CLRProfiler

Filed under: Tools and Software Development — bittermanandy @ 12:11 am

Over the last few days I’ve been encountering a growing realisation of just how many excellent C#/XNA/programming tools there are out there, that I use regularly; and therefore how large this mini-series will become, if I continue with this pattern of one article per tool. Still, there remain a few that I think are important enough to talk about in detail, before I summarise the rest more concisely (though leaving open the possibility of returning to them later).

So by now, you will hopefully be keeping the compiler warning level as high as possible, checking your code for style-based or semantic errors with FxCop, and when necessary, checking how the assemblies you’re interacting with do the things they do with Reflector. Now, perhaps, you’re starting to wonder about what the code you’re writing is doing at run time.

Probably the most notable difference between your XNA/C# game and one written in, say, C++ (which is, or at least has been for many years, the de facto language of the games industry) is that it deals with memory in a completely different way. C# is a managed language. This means that where in C++ you use “new” to allocate memory and “delete” to deallocate it (with the possibility of leaks, scribbles and access violations), in C# you use “new” to get a reference to an object, and the garbage collector tidies it up for you when you’re finished with it.

There is a lot that I could write about the garbage collector. It’s probably fair to say that dealing with garbage correctly is key to programming in C#. In fact, until two minutes ago this article was about 1000 words longer, until I realised that I’d written a whole bunch of stuff more relevant to GC itself than the tool the article is about, and even then had either gone into too much or not enough detail depending on your point of view. (Perhaps something for a future article, or even series, if there is demand). In summary:

– GC on Windows is not a good thing and you should try to avoid generating garbage where reasonably convenient.
– GC on Xbox is a very bad thing, and if your game is intended for Xbox 360 you should go out of your way to avoid it everywhere possible.

The other day I was working on Pandemonium and noticed that my frame rate had plummeted from 1800 Hz in the morning to 200 Hz by the evening. Now, 200Hz is more than enough – for a finished game! But I only had one character running around a simple background so it seemed a bit low. Losing 1600 Hz in a day is not a good day. Luckily, using some of the Kensei.Dev.Options I’ve shown before, it took just a couple of mouse clicks to realise that I was suffering lots of garbage collections. This was confirmed when I used another Kensei.Dev.Option to show more details:

    if ( Options.GetOption( "Profile.ShowGarbage" ) )
    {
        Kensei.Dev.DevText.Print( "Max Garbage Generation: " + GC.MaxGeneration.ToString(), Color.LimeGreen );

#if !XBOX
        // Per-generation collection counts (excluded from the Xbox build).
        for ( int i = 0; i <= GC.MaxGeneration; ++i )
        {
            Kensei.Dev.DevText.Print( "Generation " + i.ToString() + ": " + GC.CollectionCount( i ).ToString() + " collections", Color.LimeGreen );
        }
#endif

        Kensei.Dev.DevText.Print( "Total Memory: " + GC.GetTotalMemory( false ).ToString( "n0" ), Color.LimeGreen );
    }

It was clear that something was generating lots of garbage. Suspicion soon fell on my background hits implementation, as that’s what I’d worked on that day. It was deliberately simple, just an array of triangles, each with a BoundingBox, against which I would test lines and swept spheres. I know an octree or kd-tree would be better but that’s something I can address later. It might not be a very quick method, but it shouldn’t be generating garbage.

Visual inspection of the code revealed nothing so to find the cause, I fired up the CLRProfiler (top tip: make sure to use version 2.0 and run as Administrator) and asked it to run my game. The first thing you should note is that it is invasive: my frame rate dropped from about 200 Hz without it, to less than 10 Hz with it running.

The CLR Profiler


So, clear the “Profiling Active” checkbox, only then “Start Application”, get yourself to the part of your game that’s slow, and only then tick “Profiling Active” for a few seconds before clearing it again. That will generate a report.

A CLR Profiler report


109 Generation 0 garbage collections is quite a lot in the few seconds I ran the game for. Ideally, a game would have zero garbage collections during gameplay. (That’s not possible if you’re using XACT for sound, but it’s a good ideal to get as close to as possible). The Time Line button in the centre gives the best view of when GCs happen and why.

The CLR Profiler Timeline


The coloured peaks show garbage, which you can see was increasing very sharply before dropping as each collection occurred. The vertical black line can be moved to show what the garbage consists of at each time slice; here I placed it just before a GC, and it shows me that 2MB (99% of my garbage) was from System.SZArrayHelper.SZGenericArrayEnumerator<T>. The rest was basically strings generated by the Kensei.Dev prints, listed above, and can be ignored. But – where did this SZGenericArrayEnumerator come from?

Back on the report page, I was interested in Heap Statistics, and wanted to see what had been allocated, so clicked Allocation Graph at the top. This shows a diagram indicating where all the garbage comes from. The highest level is on the left, and you go progressively further right to get more detail. So, Program::Main is the first significant box on the left and will have created 100% of the garbage – so all the garbage in your program was created by your program, go figure – but what’s lower down (i.e. to the right)?

The Allocation Graph


As expected, 99% of the garbage is SZGenericArrayEnumerator – which comes from SZArrayHelper::GetEnumerator – which comes in turn from BoundingBox::CreateFromPoints. There was only one place this appeared in my code, and yes, it was bad. In order to provide a (small) speedup to my collision detection, which I’ve already mentioned I knew to be non-optimal but wasn’t ready to spend time on yet, I’d done a Line-BoundingBox check to reject misses before doing a Line-Triangle check. And, being lazy and naughty, I’d created a new BoundingBox every test:

    foreach ( Triangle triangle in data.Triangles )
    {
        BoundingBox box = BoundingBox.CreateFromPoints( triangle.Vertices );
        // ... Line-BoundingBox check to reject misses, then Line-Triangle check ...
    }

Just to be clear, here. foreach has gotten a really bad reputation for garbage because in CLR 1.0 the enumerator that foreach uses to traverse the container was of reference type. Some people, wrongly, claim that you should avoid it for that reason. That’s nonsense – in CLR 2.0 onwards, foreach is safer, easier to read, and at least as performant as a for loop. You should definitely prefer foreach in almost all cases. All my garbage was coming not from foreach, but from BoundingBox.CreateFromPoints.

The fix was easy – instead of creating a new BoundingBox for each Triangle for each hit test, I’d store the BoundingBox with the Triangle when they were created in the Content Pipeline. No more garbage, or at least, none at runtime; and the Content Pipeline doesn’t care about garbage. Really, I ought to have done it that way in the first place, so a definite slap on the wrist for me.

Only one thing still niggled. BoundingBox is a value type (created on the stack) so why was BoundingBox.CreateFromPoints creating garbage? The answer comes when you look back at the screenshot of Reflector in the previous article. BoundingBox.CreateFromPoints has a foreach loop in the middle of it, and foreach creates garbage.

“But Andy, wait!” I hear you cry. “You just said foreach doesn’t create garbage, in fact, you said that was nonsense!” Well, yes, though I did insert the word “almost” in a key position. The truth is that foreach does not create garbage for arrays, Lists, LinkedLists and any container other than Collection<T>. However, BoundingBox.CreateFromPoints has been designed to handle any container via the IEnumerable interface, which means the enumerator has to be boxed. Boxing means it is moved onto the heap, and is therefore garbage.
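The boxing is easy to see for yourself. The sketch below uses List<int> rather than an array (an assumption made for the sake of a small example, since List<T> exposes its enumerator as a public struct); EnumeratorDemo and its method names are hypothetical.

```csharp
using System.Collections.Generic;

public static class EnumeratorDemo
{
    // foreach over the concrete type binds to List<int>.GetEnumerator(),
    // which returns the struct List<int>.Enumerator: no heap allocation.
    public static int SumList(List<int> values)
    {
        int total = 0;
        foreach (int value in values)
            total += value;
        return total;
    }

    // The same loop through the interface calls IEnumerable<int>.GetEnumerator(),
    // which must return an IEnumerator<int> reference, so the struct is boxed.
    public static int SumEnumerable(IEnumerable<int> values)
    {
        int total = 0;
        foreach (int value in values)
            total += value;
        return total;
    }

    // Returns true if the enumerator obtained through the interface is a
    // value type that has been boxed into a heap object.
    public static bool InterfaceEnumeratorIsBoxedStruct(List<int> values)
    {
        IEnumerable<int> asInterface = values;
        using (IEnumerator<int> enumerator = asInterface.GetEnumerator())
        {
            return enumerator.GetType().IsValueType;
        }
    }
}
```

Both sums give the same answer; the only difference is that the second one leaves a small heap object behind each time it is called, which is exactly the kind of garbage that adds up inside a per-frame loop.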

I’m still a little surprised that an array-specialised version of BoundingBox.CreateFromPoints isn’t provided but then, I guess it’s not exactly difficult to write if you desperately need it – especially given that Reflector shows how! In any case, I hope that this demonstration of how CLRProfiler helped me with a garbage problem, has shown how it can help you, too. Leave a comment and let me know how helpful you’re finding these articles.
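For what it’s worth, an array-specialised version might look something like the sketch below. To keep it self-contained it uses hypothetical Vec3 and Bounds structs as stand-ins for XNA’s Vector3 and BoundingBox, and it is not the framework’s implementation; the point is only that a plain for loop over an array needs no enumerator at all.

```csharp
using System;

// Stand-in for XNA's Vector3 (hypothetical, for illustration only).
public struct Vec3
{
    public float X, Y, Z;

    public Vec3(float x, float y, float z)
    {
        X = x; Y = y; Z = z;
    }
}

// Stand-in for XNA's BoundingBox (hypothetical, for illustration only).
public struct Bounds
{
    public Vec3 Min;
    public Vec3 Max;
}

public static class BoundsHelper
{
    // Computes the axis-aligned box containing all the points.
    // Indexing the array directly means no enumerator is created or boxed.
    public static Bounds CreateFromPoints(Vec3[] points)
    {
        if (points == null)
            throw new ArgumentNullException("points");
        if (points.Length == 0)
            throw new ArgumentException("At least one point is required.", "points");

        Vec3 min = points[0];
        Vec3 max = points[0];

        for (int i = 1; i < points.Length; ++i)
        {
            Vec3 p = points[i];
            if (p.X < min.X) min.X = p.X;
            if (p.Y < min.Y) min.Y = p.Y;
            if (p.Z < min.Z) min.Z = p.Z;
            if (p.X > max.X) max.X = p.X;
            if (p.Y > max.Y) max.Y = p.Y;
            if (p.Z > max.Z) max.Z = p.Z;
        }

        Bounds result = new Bounds();
        result.Min = min;
        result.Max = max;
        return result;
    }
}
```

Everything here is a value type living on the stack, so calling it every frame would generate no garbage, although precomputing the box once is still the better fix.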

“First, solve the problem. Then, write the code.” – John Johnson

August 16, 2008

Tools of the Trade – Part Two: Reflector

Filed under: Tools and Software Development — bittermanandy @ 11:51 pm

I had originally been intending to write about something completely different in this second part of the Tools of the Trade mini-series, but I encountered a problem today that I simply could not have solved without Reflector. It reminded me of just how cool this tool is, and I just had to make sure everyone else knew how cool it is too.

Following on from my article about FxCop, I decided to have a crack at writing some custom rules to use alongside the default Microsoft ones. The first sign that this might not be as easy as I had assumed it to be was when I realised there was a total absence of documentation about it. There weren’t even many articles – the best ones I did find are here, here and here. I will cut a long story short by saying that the task of setting up my custom rules was more arduous than I expected. (That said, it’s working now and I’m very happy with the results).

I eventually reached the stage where I believed my custom rules library was ready to add into FxCop, but unfortunately when I attempted to do so, it complained that the XML file I had embedded as a resource in my Kensei.FxCopRules assembly was incorrectly formatted. And that’s about all it said – the error messages were not exactly verbose in their description of the problem. I spent a significant amount of time trying to work out what was wrong.

After some head-scratching, I realised that the default rules libraries that ship alongside FxCop must (obviously) contain a correctly-formatted XML file. If only there were some way for me to examine those XML files in the assemblies themselves! In the Bad Old Days, this might have been possible by loading the assembly into a binary viewer and scouring it by eye, looking for readable phrases; otherwise it would be a lengthy and painful process of reverse engineering. Happily this is Space Year 2008 and .NET has made everything so much easier.

Enter Reflector.

This wonderful little tool uses reflection (surprisingly enough) to analyze any .NET assembly you care to throw at it, and expose its innards in glorious detail. All I had to do to solve my problem was open up one of the FxCop assemblies, browse through it until I found the Resources section, find the XML file and click “View Resource”:

An FxCop rules library XML file viewed in Reflector.

And there it was. A correctly-formatted FxCop custom rules XML file laid out in all its glory. From there it was mere seconds to see where I had been going wrong, and just a couple of minutes to fix it. I now have custom FxCop rules up and running and catching errors in my code – and I genuinely believe it would have been, if not impossible, at least very difficult and time-consuming without Reflector.

Of course, Reflector’s awesomeness doesn’t stop there. As well as showing you resources that have been embedded into assemblies, it can show you the code they contain as well. And I don’t mean “code” as in machine code, assembler, or even MSIL – I mean it can show you what is for all intents and purposes the same C# the assembly was originally written in. And it will do it for any assembly you care to throw at it. That’s pretty damn cool. To spell it out – if, say, you’re writing an XNA game and are curious as to how the actual XNA Framework code was written, just fire up Reflector and fill your boots:

The Microsoft.XNA.Framework assembly in Reflector.

If that’s not the single coolest tool anyone has written in the history of computing, I don’t know what is. And, again – just like FxCop, and indeed XNA and (should you not have the full version) Visual C# 2005 Express, it’s completely free. (Incidentally, if you’re wondering why I was inspecting BoundingBox.CreateFromPoints – all will be revealed in Part Three of this mini-series).

I feel as though I should spend longer explaining why Reflector is so amazing but honestly, if what you’ve seen already doesn’t excite you I’m not sure what will. (Perhaps the add-ins will convince you?) In any case, it is unquestionably a tool that any XNA (or indeed C# or .NET) programmer should be familiar with, and you will probably find that you use it often. I certainly do.

“Like the carpenter, the samurai looks after his own tools.” – Miyamoto Musashi

August 13, 2008

Tools of the Trade – Part One: FxCop

Filed under: Games Development,Tools and Software Development — bittermanandy @ 9:04 pm

The established wisdom – and there’s undoubtedly a lot of truth in it – is that a bug caught late in the development cycle can cost as much as 1000 times more to fix than the same bug would have cost if you’d caught it earlier. Really bad bugs can even (depending on what you’re making) make you miss your Christmas launch date, fail to reach launch altogether, force a product recall after launch, or even open you up to legal action. Some bugs have cost their authors literally millions of pounds, and the hardware fault behind the Xbox 360 “Red Ring of Death” cost Microsoft more than a billion dollars. Of course, “cost” could equally mean time instead of money – but in either case, I’d rather spend ten minutes fixing a bug just after I wrote it, than wait till just before releasing my game and realise I have to spend hours, days or weeks finding and fixing the same bug.

The first weapon in your armoury to find and fix bugs early is to turn the compiler warnings up to the most practical maximum, which in the case of Visual Studio means Warning Level 4, or /W4. (If you’re stuck using Visual C++ you might be surprised to learn that some amazingly useful, in fact, critical warnings are actually disabled by default at level 4). You should always ensure your code compiles without warnings at /W4. You may think some warnings are unnecessary; fix them anyway. It won’t hurt, and might even help without you realising.

Once you’ve got your warnings turned up and fixed, the next step is to use static analysis tools like FxCop. Static analysis tools find bugs in your code without you even needing to run it. In fact, the next version of FxCop isn’t even a separate tool, it’s a compiler option (/analyze). These tools parse your code (or in FxCop’s case, examine the MSIL assemblies) and compare what you’ve written against a set of rules that help to identify dangerous patterns in your code.

With FxCop installed, you simply need to open it up, start a new project, and point it at where to find the assemblies (.dll or .exe) for your code. It works equally well with either XNA game code or standard .NET non-game code. Hit “Analyze”, and a few seconds later it will come up with a list of problems that it has detected, along with an explanation about why they are problems and suggestions on how to fix them.

You will probably be shocked and surprised by how many problems it finds in your code. The good news is, it’s probably not as bad as it looks; the bad news is, that’s because FxCop is slightly on the noisy side, though you have complete custom control over which rules you enable (and you can even define your own). For example, FxCop regularly complains at me about the following:

IdentifiersShouldBeSpelledCorrectly: Correct the spelling of ‘Kensei’ in namespace name ‘Kensei’.

Well, first of all, Kensei is the correct spelling, it’s the codename of my engine. Secondly, I’m not completely sure that I care if the spelling is correct – nobody is going to play my game and think, “it’s a good game, but he misspelled one of his namespace names”. Thirdly, I’ve followed the instructions to create my own user dictionary for FxCop to add “Kensei” to it, and couldn’t get it to work; and I’m certainly not going to try and think up a whole new codename for my project just to keep FxCop quiet. So, I usually disable this rule, or at least exclude the specific occurrence of the warning.

There are some warnings that are a little bit trickier and need more thought. Like this one:

MarkAssembliesWithClsCompliant: Mark ‘Kensei.dll’ with CLSCompliant(true) because it exposes externally visible types.

Really…? ‘CLS compliant’ means, basically, that the assembly can be used from other .NET languages that follow the Common Language Specification, not just C#; Managed C++ or Visual Basic, for example. I guess, on one level, that seems like a pretty noble thing to aim for, though I wouldn’t expect to ever use those other languages given that I love C# so much. But it’s possible I might make parts of my code available to other people, and they might want to use those other languages. So should I follow this advice or not?

In this case the answer comes when I do what it says to do. Marking the assembly as CLSCompliant results in a number of compiler warnings, because I’ve been using types that are not CLS compliant. That’s fair enough, except that these types include things like GameTime, which is a core XNA type. The XNA team chose not to make their assemblies CLS compliant; so there is absolutely no point in me trying to do so.
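To make that concrete, here is a small sketch of what happens once the attribute is applied. The Counter class is hypothetical and has nothing to do with Kensei or XNA; it only shows the general shape of the problem, using uint, which is not a CLS-compliant type.

```csharp
using System;

// Opt the whole assembly in to CLS compliance checking.
[assembly: CLSCompliant(true)]

public class Counter
{
    private uint count;

    // With the assembly attribute above, the compiler warns that this public
    // member exposes a type that is not CLS compliant.
    public uint RawCount()
    {
        return count;
    }

    // One option: explicitly mark the member as non-compliant, which silences
    // the warning for this member only.
    [CLSCompliant(false)]
    public uint RawCountOptedOut()
    {
        return count;
    }

    // Another option: expose a compliant type instead.
    public long Count()
    {
        return count;
    }

    public void Increment()
    {
        ++count;
    }
}
```

Neither option helps much when the non-compliant type comes from a library you depend on everywhere, which is the situation described above.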

Anyway, by now you’re probably wondering why I’m talking about FxCop when so far I’ve just described some things it does less than amazingly well. As it happens, I think it’s a wonderful little tool, especially since it’s completely free. I run it on all my code – the Kensei engine, and my games, including Pandemonium – regularly; at minimum, after each “milestone” I’ve set myself. I’ll then fix the problems it highlights, either by changing the code as recommended or by marking a warning as Excluded, to say that the infraction was deliberate. So… why? Here’s why (a real-life example, though the names have been changed to protect the innocent):

    public class BaseClass
    {
        public BaseClass()
        {
            InitValues();
        }

        protected virtual void InitValues()
        {
            // …
        }

        // Rest of class implementation
    }

    public class DerivedClass : BaseClass
    {
        public DerivedClass()
        {
            // …
        }

        protected override void InitValues()
        {
            base.InitValues();

            // …
        }

        // Rest of class implementation
    }

Spotted the error yet? You probably should have done – I’ve certainly stripped it down far enough! When I wrote this, the BaseClass was in BaseClass.cs in a project in my Kensei solution, and the DerivedClass was in DerivedClass.cs in my Pandemonium solution, and both files were pushing 1000 lines of code. So the problem was a lot less obvious… but no less real for all that. FxCop spotted it straight away:

DoNotCallOverridableMethodsInConstructors: ‘BaseClass.BaseClass()’ contains a call chain that results in a call to a virtual method defined by the class. … Virtual methods defined on the class should not be called from constructors. If a derived class has overridden the method, the derived class version will be called (before the derived class constructor is called).

In other words, when I did something like this:

    DerivedClass derived = new DerivedClass();

… it would (conceptually; this isn’t completely accurate) first allocate memory on the heap, then call the BaseClass constructor, which would call the DerivedClass override of InitValues, which would in turn call into the BaseClass InitValues method, and only then would it run the DerivedClass constructor before giving me back the reference to the object. That is to say – it would be calling a member function intended for a fully constructed DerivedClass object, on an object whose DerivedClass constructor had not yet run.
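If you would like to see that ordering for yourself, the following sketch records each step in a list. Shape, Circle, the label field and the Log list are all made up for illustration; they are not taken from Kensei or Pandemonium.

```csharp
using System.Collections.Generic;

public class Shape
{
    // Records the order in which things happen during construction.
    public static readonly List<string> Log = new List<string>();

    public Shape()
    {
        Log.Add("Shape ctor start");
        InitValues();               // virtual call from a constructor
        Log.Add("Shape ctor end");
    }

    protected virtual void InitValues()
    {
        Log.Add("Shape.InitValues");
    }
}

public class Circle : Shape
{
    private readonly string label;

    public Circle()
    {
        label = "circle";
        Log.Add("Circle ctor, label = " + label);
    }

    protected override void InitValues()
    {
        base.InitValues();
        // Runs before the Circle constructor body, so label has not been set yet.
        Log.Add("Circle.InitValues, label is null: " + (label == null));
    }
}
```

After `new Circle()`, the log reads: “Shape ctor start”, “Shape.InitValues”, “Circle.InitValues, label is null: True”, “Shape ctor end”, “Circle ctor, label = circle”. The override has run against a field that the derived constructor had not yet assigned.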

Most of you will be well aware that this is a Bad Thing. I certainly was already. Such behaviour is not polymorphism (a good and noble ideal), it’s just broken and wrong. It could rely on members being in a certain state that weren’t in that state yet. It could call out to other functions in other objects, and end up interacting with them despite being only partially constructed. It could, frankly, do just about anything, and almost none of the things it could do would be good. (For more on what is allowed in C#, see the C# Standard). However, despite the fact I’ve been programming for [mumblemumblemumble] years in total, and for six of those years I’ve been paid to do it, I still made this mistake. Well, I am only human after all – and we humans make mistakes.

It’s a mistake that FxCop spotted and I’ve since fixed it. It spotted it almost as soon as I’d written it, and it took five minutes to fix. Late in the dev cycle, I might only have spotted it after hours trying to work out why my DerivedClass objects were behaving strangely, and so much other code might have become dependent on that behaviour that it might have taken hours longer to fix. I could have lost days or weeks to one silly little bug – and believe me, I’ve seen exactly that happen before now.

There are many other genuinely code-breaking bugs that the compiler can’t spot but FxCop will pick up immediately. (Have you implemented the Dispose pattern correctly throughout your program? Are you really sure about that? How about private members and functions that are never used?) If you’re willing to spend money, there are other static analysers out there that check correctness and performance at an even lower level, therefore finding even more subtle (or critical!) bugs, and all without actually running a single cycle of your code. (Indeed, if you’re using C++ for some reason and not using Lint, BoundsChecker, PREFast or something similar, I bet real money your code is broken). But for C# and/or XNA, FxCop is free, easy to use, easy to customise, and if you’ve not used it yet I strongly recommend that you start right now.

“If debugging is the process of removing bugs, then programming must be the process of putting them in.” – Edsger Dijkstra

August 5, 2008

At Your Command

Filed under: Games Development,Tools and Software Development — bittermanandy @ 11:11 pm

Previously I wrote at some length about tips and tricks you can use to ease your game development. The focus for these articles was code-facing techniques to help you identify and diagnose bugs, and easily provide data about the current state of whichever facet of the game world you’re currently interested in. Code is only half of what makes a game – less than half, really, and a smaller proportion the bigger the game gets; most of what makes a modern game is art, design, and music content – and when it comes to generating content, you need good tools. Today I would like to talk about a feature of your game editor tools that you really shouldn’t try to live without: infinite undo and redo. Once again, I’ll provide an implementation from the Kensei library, at the end.

I won’t lie to you; there’s rather more work involved in setting up undo and redo than in, say, using Kensei.Dev.GetOption(), but I can guarantee it’s worth the effort, even if you’re in a team of one (can you imagine using Visual Studio without being able to hit Ctrl-Z to undo?) but especially if you’re in a team of many (particularly arty, creative, temperamental types, who are liable to hit you over the head with a shovel shouting “UNDO THIS, ASSHOLE!” if their tools are awkward to use – and understandably so). In fact, it’s fair to say that “tools programmer” is now a full time position at most modern games companies; you may not have time to work full-time on tools if you’re doing this for a hobby, but the ability to undo and redo is a feature that is sure to save you time in the long run.

I first read about how to implement infinite undo/redo in the so-called Gang of Four book, which came as something of a relief as until that point there had been nothing in it of any use whatsoever beyond stating the obvious. Gamma et al. designate this the “Command Pattern”, because at its core lies the concept of encapsulating user commands as objects; to me, that’s not a great name as you can encapsulate user commands as objects and still not have infinite undo and redo. Nevertheless, to avoid confusion I will be using the established terminology in this article and accompanying code.

Let’s imagine your game has an Editor mode (or perhaps you have a separate Editor tool; it’s not important). Let’s further imagine that one of the key variables in your game is the Health of your Player. Taking the line of least resistance, you provide a mechanism (menu option? Dialog box? Command prompt? Property Grid or Reflection? Keyboard shortcut? All of the above, each calling the same function?) for the user to edit the starting value:

    LevelData.Player.Health = userInput;

Pretty straightforward. Until, that is, one day while using your Editor, you decide the player’s health is too low, so you use your Editor to increase it to 250, but then you have second thoughts, and can’t remember if it used to be 210 or 220. The first step in the solution, as noted above, is to encapsulate user input as an object:

    abstract public class ICommand
    {
        abstract public void Execute();
    }

    class SetPlayerHealth : ICommand
    {
        public int newHealth;

        public SetPlayerHealth( int health )
        {
            newHealth = health;
        }

        public override void Execute()
        {
            LevelData.Player.Health = newHealth;
        }
    }

    // … Loads of other code …

    ICommand cmd = new SetPlayerHealth( newHealth );
    cmd.Execute();


A few points to note here: firstly, throughout, I’m using public variables only to keep things simple. Secondly, you can probably guess already that you have to be very careful with the accessibility of your interface; if the code elsewhere can also set LevelData.Player.Health directly, it will be too easy for the code to change it without using a SetPlayerHealth object. Thirdly, you may still wish to wrap the command object behind a function so the external call remains the same. Fourthly, “Execute” is the term used by the GoF, so I use the same term here; I guess they thought “Do” wasn’t pretentious enough.

We’ve not really changed very much yet though. The trick is that, instead of calling Execute ourselves, we push it onto a stack owned by the Editor, and it’s the stack that is responsible for calling Execute:

    public class CommandStack
    {
        public Stack<ICommand> commands = new Stack<ICommand>();

        public void AddCommand( ICommand command )
        {
            // The stack, not the caller, is responsible for executing the command
            command.Execute();

            commands.Push( command );
        }
    }

Now that we have a stack, any time the user hits Ctrl-Z we can take the last command off the top of the stack, reverse its effects (i.e. undo it), and do that as many times as we like until we get all the way back to where we started, if we want to. (The stack can grow as large as the memory in our PC, hence, “infinite” undo/redo). Of course, we’ve not yet defined how to reverse a command, which we will do now:

    abstract public class ICommand
    {
        abstract public void Execute();

        abstract public void Unexecute();
    }

    class SetPlayerHealth : ICommand
    {
        public int newHealth;

        public int oldHealth;

        public SetPlayerHealth( int health )
        {
            // Store old state before changing it to the requested new state
            oldHealth = LevelData.Player.Health;

            newHealth = health;
        }

        public override void Execute()
        {
            // Set the new state
            LevelData.Player.Health = newHealth;
        }

        public override void Unexecute()
        {
            // Restore the old state
            LevelData.Player.Health = oldHealth;
        }
    }

    public class CommandStack
    {
        public Stack<ICommand> commands = new Stack<ICommand>();

        public void AddCommand( ICommand command )
        {
            command.Execute();

            commands.Push( command );
        }

        public void Undo()
        {
            // Take the most recent command off the stack and reverse it
            ICommand command = commands.Pop();

            command.Unexecute();
        }
    }

Stripped of all irrelevant detail that’s all there is to it. But we don’t need to stop there; there is some very low-hanging fruit that we can pluck to improve this feature yet further.

The most obvious next step is to support redo as well as undo. Very simply, this requires adding a second stack (redoCommands) to our CommandStack. Every time we pop a command off the main stack, after calling Unexecute, we push it onto the redo stack; and to redo, we simply pop off the redo stack, Execute it, and push it onto the main stack. If a new command is added, we empty the redo stack, as you can’t redo something when you’ve done something different since undoing it. If that doesn’t make sense, draw yourself a diagram, think about how Undo/Redo works in Visual Studio, and it should all become clear.
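The two-stack arrangement just described can be sketched as follows. This is a minimal illustration and not the code from Kensei.CommandStack; IntBox and SetValue are hypothetical helpers included only so the example stands alone, and the CanUndo/CanRedo guards are my own addition.

```csharp
using System.Collections.Generic;

public abstract class ICommand
{
    public abstract void Execute();
    public abstract void Unexecute();
}

public class CommandStack
{
    private readonly Stack<ICommand> commands = new Stack<ICommand>();
    private readonly Stack<ICommand> redoCommands = new Stack<ICommand>();

    public bool CanUndo { get { return commands.Count > 0; } }
    public bool CanRedo { get { return redoCommands.Count > 0; } }

    public void AddCommand(ICommand command)
    {
        command.Execute();
        commands.Push(command);
        // Doing something new invalidates whatever was undone before it.
        redoCommands.Clear();
    }

    public void Undo()
    {
        if (!CanUndo) return;
        ICommand command = commands.Pop();
        command.Unexecute();
        redoCommands.Push(command);
    }

    public void Redo()
    {
        if (!CanRedo) return;
        ICommand command = redoCommands.Pop();
        command.Execute();
        commands.Push(command);
    }
}

// Hypothetical piece of editable state.
public class IntBox
{
    public int Value;
}

// Hypothetical command that sets the value held in an IntBox.
public class SetValue : ICommand
{
    private readonly IntBox target;
    private readonly int oldValue;
    private readonly int newValue;

    public SetValue(IntBox target, int value)
    {
        this.target = target;
        oldValue = target.Value;   // remember the old state for Unexecute
        newValue = value;
    }

    public override void Execute() { target.Value = newValue; }
    public override void Unexecute() { target.Value = oldValue; }
}
```

Starting from a value of 210, adding `new SetValue(box, 250)` gives 250, Undo restores 210, and Redo gives 250 again. Undoing and then adding a different command empties the redo stack.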

Speaking of Visual Studio – if you type in “abcdef” then hit Ctrl-Z, it will delete the whole string rather than leave you with “abcde”. Evidently, the “add text” command behind the scenes in Visual Studio isn’t limited to single characters. Instead, with the first key press, “add text: a” is added to the command stack, then when the second key is pressed, rather than adding a new command, the top command is modified to represent “add text: ab”. I’m not sure if there is an official term for this; I call it “merging” commands. It’s not too hard to implement, but the code I provide in a moment shows how anyway.
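One possible shape for merging, assuming each command gets a chance to absorb the one that follows it. The names MergeableCommand, TryMerge, AddText and MergingCommandStack are invented for this sketch; I have no idea how Visual Studio does it internally, and the downloadable code may differ.

```csharp
using System.Collections.Generic;
using System.Text;

public abstract class MergeableCommand
{
    public abstract void Execute();
    public abstract void Unexecute();

    // Returns true if 'next' (already executed) was absorbed into this command.
    // By default commands never merge.
    public virtual bool TryMerge(MergeableCommand next)
    {
        return false;
    }
}

// Appends text to the end of a document held in a StringBuilder.
public class AddText : MergeableCommand
{
    private readonly StringBuilder document;
    private string text;

    public AddText(StringBuilder document, string text)
    {
        this.document = document;
        this.text = text;
    }

    public override void Execute() { document.Append(text); }

    // Assumes the text is still at the end of the document.
    public override void Unexecute() { document.Length -= text.Length; }

    public override bool TryMerge(MergeableCommand next)
    {
        AddText other = next as AddText;
        if (other == null || other.document != document) return false;
        text += other.text;   // "add text: a" becomes "add text: ab"
        return true;
    }
}

public class MergingCommandStack
{
    private readonly Stack<MergeableCommand> commands = new Stack<MergeableCommand>();

    public int Count { get { return commands.Count; } }

    public void AddCommand(MergeableCommand command)
    {
        command.Execute();
        // If the command on top can absorb the new one, do not push it.
        if (commands.Count > 0 && commands.Peek().TryMerge(command)) return;
        commands.Push(command);
    }

    public void Undo()
    {
        if (commands.Count == 0) return;
        commands.Pop().Unexecute();
    }
}
```

Typing “a”, “b” and “c” as three separate AddText commands leaves a single entry on the stack, and one Undo removes all three characters. A real editor would also want a rule for when to stop merging, such as a pause in typing or a cursor move.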

I’ll not go into detail on some of these other features, though again the supplied code implements them; but it would be nice to tie the Command Stack to the Editor’s Save system, so we can provide a “Save changes before closing?” dialog box when the user clicks “Exit” or similar. This is simply a case of recording the size of the stack every time the Editor saves the data file, and if we Do, Undo or Redo away from that size, mark the command stack as dirty. Additionally, some commands are too complex or destructive for us to be able to Undo them. In that case, we’ll need to warn the user (this part is left up to the calling code) then, once the command is executed, clear the undo and redo stacks – they will no longer be valid.
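The dirty flag can be sketched on its own, separately from the stacks. SaveTracker and its method names are hypothetical, and the handling of a new command issued after undoing past the save point is my own refinement of the “record the size” idea, since in that case the size can match again even though the data no longer does.

```csharp
public class SaveTracker
{
    private int undoDepth;    // current size of the undo stack
    private int savedDepth;   // size of the undo stack at the last save, or -1 if
                              // the saved state can no longer be reached

    public bool IsDirty
    {
        get { return undoDepth != savedDepth; }
    }

    // Call when a new command is added to the stack.
    public void OnDo()
    {
        // A new command issued from below the save point throws away the redo
        // history that led to the saved state, so that state is unreachable.
        if (undoDepth < savedDepth) savedDepth = -1;
        ++undoDepth;
    }

    public void OnUndo()
    {
        if (undoDepth > 0) --undoDepth;
    }

    public void OnRedo()
    {
        ++undoDepth;
    }

    // Call when the Editor saves the data file.
    public void OnSave()
    {
        savedDepth = undoDepth;
    }

    // Call after an irreversible command has run and both stacks were cleared.
    public void OnStacksCleared()
    {
        undoDepth = 0;
        savedDepth = -1;
    }
}
```

The Editor would then show its “Save changes before closing?” dialog only when IsDirty is true.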

You can download Kensei.CommandStack here. You are, as always, free to do anything you like with it, including use it or not use it, as long as you don’t blame me if something goes wrong. Or, you can use the techniques I’ve discussed to come up with something different – perhaps even something better. Please tell me about it if you do.

I would just like to mention some final caveats:

  • This is an all-or-nothing system. If some parts of your Editor implement the Command Pattern and other parts do not, bad things will happen. At best, you will confuse your user. At worst, you may leave your data files in an invalid state with no way to recover from the damage.
  • Execute() and Unexecute() must be exact opposites of one another with no side effects. An Undo system that does something other than actually Undoing is no use at all. This is an ideal situation for unit tests; I recommend that you do as I say, not as I do, as Kensei.CommandStack doesn’t come with a unit test suite. Sorry.
  • It may be tempting to say, “aha! To Unexecute, I simply need to restore the game world state to what it was before I Executed. So if I just store the whole world state in my command objects, that will be all I need to be able to Unexecute!” Don’t be tempted. It’s not unreasonable for game world data to approach 5MB in a simple game; in a more complex game, it can be 100MB or more. (Compare that to the 20 bytes or so required for SetPlayerHealth, above). Just ten “Executes” later, with each storing the whole game state, and you’ve used 1GB of memory in your CommandStack. Even if you don’t run out completely, Windows will eventually start swapping memory with the hard disk and your tool will be unusably slow. Not an improvement. Store the changes (deltas) only, and change only what you need to, to Execute and Unexecute.
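For the second caveat, the kind of unit test I have in mind boils down to a round-trip check: snapshot the relevant state, Execute, Unexecute, and compare. The sketch below is one way to write that; CommandChecks and RoundTrips are hypothetical names, and Player here is a stand-in for whatever LevelData.Player really is.

```csharp
using System;

public abstract class ICommand
{
    public abstract void Execute();
    public abstract void Unexecute();
}

// Stand-in for the real player data.
public class Player
{
    public int Health;
}

public class SetPlayerHealth : ICommand
{
    private readonly Player player;
    private readonly int oldHealth;
    private readonly int newHealth;

    public SetPlayerHealth(Player player, int health)
    {
        this.player = player;
        oldHealth = player.Health;   // store only the delta, not the whole world
        newHealth = health;
    }

    public override void Execute() { player.Health = newHealth; }
    public override void Unexecute() { player.Health = oldHealth; }
}

public static class CommandChecks
{
    // Executes and then unexecutes the command, and reports whether the state
    // returned by 'snapshot' is the same afterwards as it was before.
    public static bool RoundTrips<TState>(ICommand command, Func<TState> snapshot)
        where TState : IEquatable<TState>
    {
        TState before = snapshot();
        command.Execute();
        command.Unexecute();
        return before.Equals(snapshot());
    }
}
```

For example, with a Player whose Health is 210, `CommandChecks.RoundTrips(new SetPlayerHealth(player, 250), () => player.Health)` returns true. A check like this only covers the state you choose to snapshot, so side effects elsewhere still need their own tests.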

I hope this article has helped. I’d love to hear your comments if it did, or, even more so if it didn’t. I wrote at the beginning that there is more work in this than the “naive” way of writing an Editor, and that is true. But I hope you can see that it is not that much more work, and given how simple it is to understand, implementing infinite Undo/Redo in your tools is well worth the effort.

“In the midst of this our mortal life, I found me in a gloomy wood, astray.” – Dante
