You twisted my arm.
Let’s very quickly go over what you should know already (remember, this isn’t necessarily a beginner’s guide to garbage, rather an attempt to explain why what you know about garbage is like it is, and what it means for your code):
1. Value types (ints, floats, and other built-in types, as well as any user-defined struct type, e.g. Microsoft.Xna.Framework.Vector3 or a struct you define yourself) are created on the stack, so you don’t need to worry about garbage.
2. Reference types (any class) are created on the heap, so should be used carefully as they will be garbage collected, which can be bad for the frame rate of your game if it happens at an inconvenient time. Part one explains why, in more detail than you really need to know.
As you may be able to guess, it isn’t quite that simple.
First, it’s not completely accurate to say that value types are always created on the stack. They will be, if they are created within a function, but not if they are themselves a member of a class, in which case they will exist (as part of the class object) on the heap. It’s a minor and fairly obvious point but it’s important to be correct.
Second, it’s not completely accurate to say that value types never generate garbage. They can, of course, contain reference members, in which case creating a new struct object can indirectly create a new class object, which generates garbage exactly as though you’d created the class object yourself. It’s fair to say that this is a slightly unusual thing to do (I’m not sure I can think of an example?) but it’s entirely legal and something to watch out for.
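Since I couldn’t think of a real example, here is a made-up one. This is only a sketch: Inventory and its fields are hypothetical names, not anything from XNA. The struct itself is a value type, but its constructor allocates a List on the heap, and copying the struct copies the reference, not the list.

```csharp
using System.Collections.Generic;

// Hypothetical struct, for illustration only.
struct Inventory
{
    public int Gold;            // stored inline, as part of the struct
    public List<string> Items;  // only the reference is inline; the list lives on the heap

    public Inventory( int gold )
    {
        Gold = gold;
        Items = new List<string>(); // heap allocation, even though Inventory is a struct
    }
}
```

Copying an Inventory gives you an independent Gold but a shared Items list, which is one more reason to be wary of reference members inside structs.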
Third, remember how I said value types are created on the stack unless they’re members of a class, and how that was a minor point? Actually it’s not that minor. Value types contained within reference types are allocated on the heap. For example, arrays always catch out new C#/XNA programmers:
int a = 0; // System.Int32 is a value type. Allocated on the stack
int[] b = new int[10]; // Array is a reference type! Allocated on the heap (garbage!)
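To make the “value type inside a reference type” point concrete, here is a small sketch. Point2, Particle and HeapDemo are hypothetical names I’ve made up for the purpose. It shows a struct as a local, as a member of a class, and as the element type of an array; in the last two cases the struct’s data lives inside a heap object.

```csharp
// Hypothetical types, for illustration only.
struct Point2
{
    public float X;
    public float Y;
}

class Particle
{
    public Point2 Position; // a value type, but stored inside the Particle object on the heap
}

static class HeapDemo
{
    public static float Run()
    {
        Point2 local = new Point2();    // local value type: no heap allocation
        local.X = 1.0f;

        Particle p = new Particle();    // one heap allocation
        p.Position = local;             // copies the value into the heap object

        Point2[] points = new Point2[100]; // one heap allocation holding 100 Point2 values inline
        points[0] = local;              // another copy

        local.X = 5.0f;                 // changes only the local; the copies are unaffected
        return p.Position.X + points[0].X;
    }
}
```

Note that the array is a single allocation, not a hundred, so allocating it once up front and reusing it is usually fine.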
Slightly confusingly, passing value types by reference (using ‘ref’ or ‘out’) does not “turn them into” reference types. It simply means that the object’s value can (or must, with ‘out’) be changed within the function and those changes are reflected in the original object that exists outside the function. It also means that no temporary object is created to act as the function parameter. For example, v1 below is a copy of another Vector3; creating a new object and copying into it can be marginally slower than just passing by reference, as with v2, and the difference becomes more pronounced the larger the object. (Remember a reference is four bytes in size in 32-bit code like XNA, so any struct four bytes or smaller won’t benefit from passing by reference at all). For this reason, functions like Vector3.CatmullRom, that take many struct parameters, are provided in two flavours: one, which is more convenient to use, that takes (copies) the inputs by value and returns the result; and another, which is more performant, that takes the inputs by reference and returns the result through an out parameter. It’s a pattern you may want to use in your own code, though 90% of the time you’ll just call the more convenient version.
void F( ref int a, out int b, Vector3 v1, ref Vector3 v2 )
{
    a = 1; // No garbage here
    b = 2; // Or here either
    m_v1 = v1; // Or here… but v1 is a temporary copy
    m_v2 = v2; // Or here… but v2 is passed by reference
}
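If you want to offer the same two flavours in your own code, it might look something like this. It’s a minimal sketch using a hypothetical Vec3 struct of my own rather than the real Vector3, with the convenient overload simply forwarding to the by-reference one.

```csharp
// Hypothetical stand-in for a vector struct, for illustration only.
struct Vec3
{
    public float X, Y, Z;

    public Vec3( float x, float y, float z )
    {
        X = x;
        Y = y;
        Z = z;
    }

    // Convenient flavour: inputs are copied in, the result is copied out.
    public static Vec3 Add( Vec3 a, Vec3 b )
    {
        Vec3 result;
        Add( ref a, ref b, out result );
        return result;
    }

    // By-reference flavour: no copies of the inputs, result written through 'out'.
    public static void Add( ref Vec3 a, ref Vec3 b, out Vec3 result )
    {
        result = new Vec3( a.X + b.X, a.Y + b.Y, a.Z + b.Z );
    }
}
```

Neither version generates garbage; the only difference is how much copying goes on, so I’d reach for the by-reference version only in code that a profiler says is hot.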
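Of course, reference types are always accessed through a reference anyway (which means a function can always change the object it is handed; that’s one thing I miss about C++: const!) so using the ‘ref’ keyword for reference types is usually pointless, though legal. It only makes a difference if the function needs to make the caller’s variable refer to a different object. (FxCop rightly points out your mistake though).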
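One subtlety is worth a quick sketch. With a class parameter, the function can always modify the object it was given; the only thing ‘ref’ adds is the ability to repoint the caller’s variable at a different object, which is rarely what you want. Counter and RefDemo are hypothetical names used purely for illustration.

```csharp
// Hypothetical types, for illustration only.
class Counter
{
    public int Value;
}

static class RefDemo
{
    // Changes the caller's object: no 'ref' needed.
    public static void Mutate( Counter c )
    {
        c.Value = 42;
    }

    // Repoints only the local copy of the reference; the caller never sees the new object
    // (which is therefore instant garbage).
    public static void Reassign( Counter c )
    {
        c = new Counter();
        c.Value = 99;
    }

    // With 'ref', the caller's variable itself now refers to the new object.
    public static void ReassignRef( ref Counter c )
    {
        c = new Counter();
        c.Value = 99;
    }
}
```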
There’s another difference between value and reference types – more specifically, between structs and classes – than simply where they live. Reference types can inherit from other reference types (and you therefore get inheritance, polymorphism, and all the other clever object-oriented stuff) while value types cannot inherit at all. And yet – you are probably aware that any and every type in C# is considered to ultimately derive from System.Object. You will undoubtedly have seen functions like this (particularly in .NET 1.x, before generics came along):
void G( object obj )
If obj can be anything, including a value type, like an int, a float, or a Vector3 – but value types can’t derive from anything – how come you’re allowed to call G( 3 )? Well, when such a call is made, a temporary object which does derive from System.Object and is known as a “box” is created on the heap, and a copy of the value type object placed inside it. Putting the value type in the box is called “boxing”, and taking it back out (via a cast) is known as “unboxing”. Critically, the temporary box object is garbage. This means that using value types in functions or containers that rely on type System.Object generates garbage. As a result, boxing is probably the second most common cause of unexpected garbage. To avoid it, avoid using or writing functions or classes that use System.Object – prefer generics instead – though you can still get caught out if you’re not careful, as my previous post on Reflector showed.
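Here is a rough sketch of boxing in action, and of the generic alternative. BoxingDemo is a hypothetical name; ArrayList and List<T> are the real .NET collections. The first method shows that every conversion to object creates a brand new box; the other two sum the same numbers with and without boxing.

```csharp
using System.Collections;
using System.Collections.Generic;

// Hypothetical helper class, for illustration only.
static class BoxingDemo
{
    public static bool BoxesAreDistinct()
    {
        int value = 3;
        object first = value;   // boxes: allocates a new object on the heap
        object second = value;  // boxes again: a second, separate object
        return !object.ReferenceEquals( first, second );
    }

    public static int SumBoxed()
    {
        ArrayList list = new ArrayList();   // stores System.Object
        for( int i = 0; i < 4; ++i )
        {
            list.Add( i );                  // each Add boxes the int
        }

        int sum = 0;
        foreach( object o in list )
        {
            sum += (int)o;                  // unboxing via a cast
        }
        return sum;
    }

    public static int SumGeneric()
    {
        List<int> list = new List<int>();   // stores ints directly: no boxes
        for( int i = 0; i < 4; ++i )
        {
            list.Add( i );
        }

        int sum = 0;
        foreach( int i in list )
        {
            sum += i;
        }
        return sum;
    }
}
```

Both sums give the same answer; the difference is that the ArrayList version leaves four boxes behind for the garbage collector.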
So, usually, you will want to use reference types (classes) for your data that stays in memory for a significant amount of time. You’ll want to be careful with when you allocate it (remember, a garbage collection can only ever happen on an allocation). Value types (structs) are most useful for small, lightweight data. Think about Vector3 – it only contains three floats and has a handful of methods. The XNA team could have written an abstract base class Vector and specialised it with Vector2, Vector3, Vector4, and VectorN, but what a piece of over-engineering that would have been! More to the point, using vectors (very common objects in 3D games) would have generated masses of garbage. As a struct instead of a class, Vector3 is much more elegant – you can write things like SetPosition( new Vector3( 1.0f, 2.0f, 3.0f ) ), a thousand times a frame if you like, and know that there’s no chance of garbage. On the other hand, your main character object is bound to be of class type. It’s likely to derive from things (interfaces if not classes) and needs to be kept in memory, which means on the heap.
This also implies that if you have an object of class type, you should keep it around if you’re likely to be able to reuse it. For example, this kind of trick is useful all over the place:
MyClass myObject = new MyClass( 5 ); // Bad! Generates garbage every call!
static MyClass myStaticObject = new MyClass( 5 );
myStaticObject.DoSomething(); // Good! Does not generate garbage!
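The same trick works for collections you would otherwise rebuild every frame. This is a sketch only: EnemyTracker, FindVisible and the “even IDs are visible” rule are all made up for the example. The list is allocated once and cleared on each call, rather than newed up each time.

```csharp
using System.Collections.Generic;

// Hypothetical class, for illustration only.
class EnemyTracker
{
    // Allocated once, when the tracker is created, and reused on every call.
    private readonly List<int> m_visible = new List<int>( 64 );

    public List<int> FindVisible( int[] enemyIds )
    {
        m_visible.Clear(); // empties the list but keeps its capacity: no allocation

        for( int i = 0; i < enemyIds.Length; ++i )
        {
            if( enemyIds[i] % 2 == 0 ) // stand-in for a real visibility test
            {
                m_visible.Add( enemyIds[i] );
            }
        }
        return m_visible;
    }
}
```

The catch is that callers share the same list from one call to the next, so they mustn’t hang on to it, and the list will still allocate if it ever has to grow beyond its initial capacity.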
Just a couple more things to think about, though I’m a bit short on space. Some types can be considered as “atomic”. Basically what that means is, that if they change, they become a different object. So a Vector3 is non-atomic – you can change v.X and you still have the same Vector3, just in a slightly different place. But if you have a user defined type PlayerDetails, and you change any of the fields, you’ve got a completely different “thing” to deal with:
PlayerDetails player = new PlayerDetails( "Andy", "Patrick" );
player.FirstName = "Fred"; // Look out! Now "Fred Patrick", that's not right!
player.LastName = "Bloggs";
The player details for me are fundamentally a different “thing” to the player details for Fred Bloggs. (Language fails me slightly, here, and I’m also not sure I’ve picked a great example). Furthermore, if anything unexpected happens in the middle – the object gets viewed on another thread, or setting LastName throws an exception – the object can be seen in an invalid state. To prevent this, such atomic types should be immutable (I’m just not going to stop linking to those books!), which means you can’t change any single field, and once an object has been created, it never changes. This is really good software engineering (it allows you to write more correct and more secure code) but can lead to surprising results:
string message = "Player ";
message += playerNum.ToString();
message += " wins by ";
message += points.ToString();
message += " points!";
That short code snippet generates not one, but seven objects: the final message object contains “Player 1 wins by 6 points!” for example, while the strings “Player “, “1”, “Player 1”, “Player 1 wins by “, “6”, and “Player 1 wins by 6” are all garbage! System.String is an immutable, atomic, reference type. This was absolutely the right choice by the .NET architects (strings in .NET are wondrous things of great beauty compared to strings in C++) but leads careless game programmers to watch helplessly as their frame rate plummets. Code like the above is probably the number one cause of unwanted garbage – watch out for it. If you must build strings piece by piece, use a StringBuilder object. (And, if you find yourself creating a type that is atomic and immutable, consider creating a non-atomic mutable MyTypeBuilder class to work with it).
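To make that last suggestion a little more concrete, here is one way an immutable PlayerDetails and its builder might look. It’s only a sketch under my own assumptions: PlayerDetailsBuilder and ToPlayerDetails are hypothetical names, and a real type would want validation too.

```csharp
// Immutable: once constructed, a PlayerDetails never changes.
sealed class PlayerDetails
{
    private readonly string m_firstName;
    private readonly string m_lastName;

    public PlayerDetails( string firstName, string lastName )
    {
        m_firstName = firstName;
        m_lastName = lastName;
    }

    public string FirstName { get { return m_firstName; } }
    public string LastName { get { return m_lastName; } }
}

// Hypothetical mutable companion: change fields freely, then create the finished object in one go.
sealed class PlayerDetailsBuilder
{
    public string FirstName;
    public string LastName;

    public PlayerDetails ToPlayerDetails()
    {
        return new PlayerDetails( FirstName, LastName );
    }
}
```

Nobody can ever observe a half-changed “Fred Patrick” through a PlayerDetails, because the only way to get different details is to build a whole new object.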
I know what you’re thinking. “It would generate less garbage if you didn’t call playerNum.ToString() and points.ToString(), and just passed in playerNum and points!” Nice try, but no. Adding a non-string to a string is compiled into a call to System.String.Concat, which works with System.Object types and calls System.Object.ToString() on them. That means, if you passed playerNum into it, it would box playerNum and still generate the extra string – boxing and strings, the two worst garbage generators happening at once – you’ve made your garbage problem worse, not better!
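For comparison, here is roughly how the same message could be built with a StringBuilder. This is a sketch, not a drop-in replacement: ScoreMessage and Build are hypothetical names, and the capacity of 64 is a guess. The builder is allocated once and reused, so the intermediate “Player 1”, “Player 1 wins by ” and so on never exist as separate strings.

```csharp
using System.Text;

// Hypothetical helper class, for illustration only.
static class ScoreMessage
{
    // One builder, allocated once and reused for every message.
    private static readonly StringBuilder s_builder = new StringBuilder( 64 );

    public static string Build( int playerNum, int points )
    {
        s_builder.Length = 0;           // reset the contents without reallocating the buffer
        s_builder.Append( "Player " );
        s_builder.Append( playerNum );  // Append( int ) overload: no boxing
        s_builder.Append( " wins by " );
        s_builder.Append( points );
        s_builder.Append( " points!" );
        return s_builder.ToString();    // allocates the final string
    }
}
```

It isn’t completely free: the final ToString() still allocates, and depending on the runtime, appending a number may create a small temporary string internally. It is still far better than the chain of += above.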
There’s a lot to understand about garbage but I hope the last couple of articles have opened your eyes somewhat. I’ve approached it in a slightly unorthodox way – most writers will start with an article like this one, explaining the dos and don’ts, and probably never even cover the topics in the previous article which showed the whys and wherefores. Personally, I like to know what’s going on “under the hood”. You can memorise the rules relating to garbage and write good code, but until you understand the reasons behind the rules you might not write brilliant code. There is one important thing related to garbage I’ve not covered – that’s the Dispose pattern. This book (there it is again!) explains what it is and how to implement it, much better than I can; the only thing it doesn’t really cover is the detail of how objects waiting for disposal relate back to the mechanics of garbage collection I covered in Part One, so I might go over that at some point but I’m not promising anything.
I hope this appeases those of you who wanted a follow-up to Part One; I misjudged what people wanted and apologise for that. There is one way to make sure it doesn’t happen again: your feedback is always welcome, so let me know what’s useful, if anything wasn’t clear, and what you’d like me to write about in future (though I reserve the right to choose not to :-).
“There’s four and twenty million doors, on life’s endless corridor” – Oasis