Pandemonium

November 12, 2008

Garbage, Part Two (Director’s Cut): oh, alright then

Filed under: Tools and Software Development — bittermanandy @ 12:27 am

You twisted my arm.

Let’s very quickly go over what you should know already (remember, this isn’t necessarily a beginner’s guide to garbage, rather an attempt to explain why what you know about garbage is like it is, and what it means for your code):

1. Value types (ints, floats, and other in-built types, as well as any user-defined struct type, eg. Microsoft.Xna.Framework.Vector3 or a struct you define yourself) are created on the stack, so you don’t need to worry about garbage.

2. Reference types (any class) are created on the heap, so should be used carefully as they will be garbage collected, which can be bad for the frame rate of your game if it happens at an inconvenient time. Part one explains why, in more detail than you really need to know.

As you may be able to guess, it isn’t quite that simple.

First, it’s not completely accurate to say that value types are always created on the stack. They will be, if they are created within a function, but not if they are themselves a member of a class, in which case they will exist (as part of the class object) on the heap. It’s a minor and fairly obvious point but it’s important to be correct.

Second, it’s not completely accurate to say that value types never generate garbage. They can, of course, contain reference members, in which case creating a new struct object can indirectly create a new class object, which generates garbage exactly as though you’d created the class object yourself. It’s fair to say that this is a slightly unusual thing to do (I’m not sure I can think of an example?) but it’s entirely legal and something to watch out for.
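One way it could happen, as a minimal sketch: the struct below (Trail is a name I've made up purely for illustration) has a non-default constructor that allocates an array. The default constructor allocates nothing, so you only pay when you call the one that does.

```csharp
using System;

// Hypothetical struct, invented for this example: a value type with a reference member.
struct Trail
{
    public int[] Points;   // reference member: the array itself lives on the heap

    public Trail( int capacity )
    {
        Points = new int[capacity];   // heap allocation hidden inside a struct constructor
    }
}

static class TrailDemo
{
    static void Main()
    {
        Trail empty = new Trail();      // default constructor: Points is null, nothing allocated
        Trail sized = new Trail( 16 );  // allocates a 16-element array on the heap

        Console.WriteLine( empty.Points == null );  // True
        Console.WriteLine( sized.Points.Length );   // 16
    }
}
```

When the array is no longer referenced it becomes garbage, exactly as if you had written new int[16] yourself.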

Third, remember how I said value types are created on the stack unless they’re members of a class, and how that was a minor point? Actually it’s not that minor. Value types contained within reference types are allocated on the heap. For example, arrays always catch out new C#/XNA programmers:

int a = 0;               // System.Int32 is a value type. Allocated on the stack
int[] b = new int[5];    // Array is a reference type! Allocated on the heap (garbage!)
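To make the "value types inside reference types" point a bit more concrete, here is a rough sketch. Point2 and Enemy are hypothetical types invented for the example; the comments describe where each piece of data would typically end up.

```csharp
using System;

// Hypothetical types, invented for this example.
struct Point2
{
    public float X;
    public float Y;
}

class Enemy
{
    public Point2 Position;   // stored inside the Enemy object, so on the heap
    public int Health;        // likewise
}

static class HeapDemo
{
    static void Main()
    {
        Point2 local = new Point2();       // local value type: no heap allocation
        local.X = 1.0f;
        local.Y = 2.0f;

        Enemy enemy = new Enemy();         // one heap allocation; Position and Health live inside it
        enemy.Health = 100;
        enemy.Position = local;            // copies the struct's fields into the heap object

        Point2[] path = new Point2[100];   // one heap allocation holding 100 structs inline
        path[0] = local;                   // another copy; no extra allocation

        local.X = 5.0f;                    // changes only the local copy
        Console.WriteLine( enemy.Position.X );   // 1
        Console.WriteLine( path[0].X );          // 1
    }
}
```

Note that the array of 100 structs is a single allocation, not 100 of them, which is one reason arrays of structs are popular in game code.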

Slightly confusingly, passing value types by reference (using ‘ref’ or ‘out’) does not “turn them into” reference types. It simply means that the function can (or, with ‘out’, must) change the object’s value, and that those changes are reflected in the original object that exists outside the function. It also means that no temporary copy is created to act as the function parameter. For example, v1 below is a copy of another Vector3; creating a new object and copying into it can be marginally slower than just passing by reference, as with v2, and the difference becomes more pronounced the larger the object. (Remember that a reference is four bytes in size in 32-bit code like XNA, so any struct of four bytes or fewer won’t benefit from passing by reference at all.)

For this reason, functions like Vector3.CatmullRom that take many struct parameters are provided in two flavours: a more convenient one that takes (copies) the inputs by value and returns the result, and a more performant one that takes the inputs by reference and writes the result to an out parameter. It’s a pattern you may want to use in your own code, though 90% of the time you’ll just call the more convenient version.

void F( ref int a, out int b, Vector3 v1, ref Vector3 v2 )
{
    a = 1;        // No garbage here
    b = 2;        // Or here either
    m_v1 = v1;    // Or here… but v1 is a temporary copy
    m_v2 = v2;    // Or here… but v2 is passed by reference
}
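If you want to try the two-flavour pattern in your own code, it might look something like the sketch below. Vec3 is a hypothetical stand-in for XNA's Vector3 so that the example compiles on its own, and Add is just an illustrative operation, not a claim about how XNA implements anything.

```csharp
using System;

// Hypothetical three-float struct standing in for Vector3.
struct Vec3
{
    public float X, Y, Z;

    public Vec3( float x, float y, float z ) { X = x; Y = y; Z = z; }

    // Convenient flavour: inputs are copied in, the result is copied out.
    public static Vec3 Add( Vec3 a, Vec3 b )
    {
        Vec3 result;
        Add( ref a, ref b, out result );
        return result;
    }

    // Faster flavour: no copies of the inputs, result written through 'out'.
    public static void Add( ref Vec3 a, ref Vec3 b, out Vec3 result )
    {
        result.X = a.X + b.X;
        result.Y = a.Y + b.Y;
        result.Z = a.Z + b.Z;
    }
}

static class Vec3Demo
{
    static void Main()
    {
        Vec3 a = new Vec3( 1.0f, 2.0f, 3.0f );
        Vec3 b = new Vec3( 10.0f, 20.0f, 30.0f );

        Vec3 sum1 = Vec3.Add( a, b );       // convenient version

        Vec3 sum2;
        Vec3.Add( ref a, ref b, out sum2 ); // by-reference version

        Console.WriteLine( sum1.X == sum2.X && sum1.Y == sum2.Y && sum1.Z == sum2.Z );  // True
    }
}
```

Neither version generates garbage; the only difference is how much copying goes on. Writing the convenient overload in terms of the by-reference one keeps the arithmetic in one place.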

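Of course, with reference types a function always receives a reference to the original object rather than a copy of it (which means a function can always change the object; that’s one thing I miss about C++: const!). Using the ‘ref’ keyword on a reference type parameter is therefore rarely needed, though legal: all it adds is the ability to make the caller’s variable refer to a different object. (FxCop rightly points it out, though.)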
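A quick sketch of what a function can and can't do to a class object it has been handed without any keywords. Player here is a hypothetical class invented for the example.

```csharp
using System;

class Player   // hypothetical class, for illustration only
{
    public int Score;
}

static class RefDemo
{
    // Changes the caller's object: both sides are looking at the same Player.
    public static void AddPoints( Player p )
    {
        p.Score += 10;
    }

    // Only repoints the local parameter; the caller's variable is untouched.
    // (The new Player becomes garbage as soon as the function returns.)
    public static void Replace( Player p )
    {
        p = new Player();
        p.Score = 99;
    }

    static void Main()
    {
        Player player = new Player();

        AddPoints( player );
        Console.WriteLine( player.Score );   // 10

        Replace( player );
        Console.WriteLine( player.Score );   // still 10
    }
}
```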

There’s another difference between value and reference types (more specifically, between structs and classes) than simply where they live. Reference types can inherit from other reference types (and you therefore get inheritance, polymorphism, and all the other clever object-oriented stuff), while value types cannot inherit from other types at all; the most a struct can do is implement interfaces. And yet you are probably aware that any and every type in C# is considered to ultimately derive from System.Object. You will undoubtedly have seen functions like this (particularly in .NET 1.x, before generics came along):

void G( object obj )
{
    // …
}

If obj can be anything, including a value type like an int, a float, or a Vector3, but value types can’t derive from anything, how come you’re allowed to call G( 3 )? Well, when such a call is made, a temporary object known as a “box”, which does derive from System.Object, is created on the heap, and a copy of the value type object is placed inside it. Putting the value type in the box is called “boxing”, and taking it back out (via a cast) is known as “unboxing”. Critically, the temporary box object is garbage. This means that using value types with functions or containers that rely on type System.Object generates garbage. As a result, boxing is probably the second most common cause of unexpected garbage. To avoid it, avoid using or writing functions or classes that use System.Object, and prefer generics instead, though you can still get caught out if you’re not careful, as my previous post on Reflector showed.
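As a small illustration of the difference, the sketch below stores the same int in an old-style ArrayList (which holds System.Object) and in a generic List<int>. The variable names are mine; the comments mark where the box would be created.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

static class BoxingDemo
{
    static void Main()
    {
        ArrayList boxedList = new ArrayList();   // stores System.Object
        boxedList.Add( 3 );                      // boxing: the int is copied into a new heap object
        int a = (int)boxedList[0];               // unboxing: the value is copied back out by a cast

        List<int> plainList = new List<int>();   // stores ints directly
        plainList.Add( 3 );                      // no box created
        int b = plainList[0];                    // no unboxing needed

        object box = 3;                          // boxing can also happen on a plain assignment
        Console.WriteLine( box is int );         // True
        Console.WriteLine( a == b );             // True
    }
}
```

The generic list isn't completely free (its internal array is still a heap object, and grows as you add to it), but it doesn't create a new box for every element.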

So, usually, you will want to use reference types (classes) for your data that stays in memory for a significant amount of time. You’ll want to be careful with when you allocate it (remember, a garbage collection can only ever happen on an allocation). Value types (structs) are most useful for small, lightweight data. Think about Vector3 – it only contains three floats and has a handful of methods. The XNA team could have written an abstract base class Vector and specialised it with Vector2, Vector3, Vector4, and VectorN, but what a piece of over-engineering that would have been! More to the point, using vectors (very common objects in 3D games) would have generated masses of garbage. As a struct instead of a class, Vector3 is much more elegant – you can write things like SetPosition( new Vector3( 1.0f, 2.0f, 3.0f ) ), a thousand times a frame if you like, and know that there’s no chance of garbage. On the other hand, your main character object is bound to be of class type. It’s likely to derive from things (interfaces if not classes) and needs to be kept in memory, which means on the heap.

This also implies that if you have an object of class type, you should keep it around if you’re likely to be able to reuse it. For example, this kind of trick is useful all over the place:

void F1()
{
    MyClass myObject = new MyClass( 5 );    // Bad! Generates garbage every call!
    myObject.DoSomething();
}

static MyClass myStaticObject = new MyClass( 5 );

void F2()
{
    myStaticObject.DoSomething();            // Good! Does not generate garbage!
}
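The same trick works for collections. Here is a sketch, assuming a hypothetical Spawner class that needs a scratch list on every call: allocate it once with a sensible capacity and call Clear() each time, rather than creating a new list.

```csharp
using System;
using System.Collections.Generic;

class Spawner   // hypothetical class, for illustration only
{
    // Allocated once. 64 is an arbitrary capacity chosen so Add should not need to grow the list.
    private readonly List<int> m_scratch = new List<int>( 64 );

    public int CountEven( int[] values )
    {
        m_scratch.Clear();   // resets Count but keeps the internal array for reuse

        for ( int i = 0; i < values.Length; ++i )
        {
            if ( values[i] % 2 == 0 )
            {
                m_scratch.Add( values[i] );
            }
        }

        return m_scratch.Count;
    }
}

static class SpawnerDemo
{
    static void Main()
    {
        Spawner spawner = new Spawner();
        int[] values = new int[] { 1, 2, 3, 4 };

        Console.WriteLine( spawner.CountEven( values ) );   // 2
        Console.WriteLine( spawner.CountEven( values ) );   // 2 again, with the same list reused
    }
}
```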

Just a couple more things to think about, though I’m a bit short on space. Some types can be considered as “atomic”. Basically what that means is, that if they change, they become a different object. So a Vector3 is non-atomic – you can change v.X and you still have the same Vector3, just in a slightly different place. But if you have a user defined type PlayerDetails, and you change any of the fields, you’ve got a completely different “thing” to deal with:

PlayerDetails player = new PlayerDetails( "Andy", "Patrick" );
player.FirstName = "Fred";    // Look out! Now "Fred Patrick", that’s not right!
player.LastName = "Bloggs";

The player details for me are fundamentally a different “thing” to the player details for Fred Bloggs. (Language fails me slightly, here, and I’m also not sure I’ve picked a great example). Furthermore, if anything unexpected happens in the middle – the object gets viewed on another thread, or setting LastName throws an exception – the object can be seen in an invalid state. To prevent this, such atomic types should be immutable (I’m just not going to stop linking to those books!), which means you can’t change any single field, and once an object has been created, it never changes. This is really good software engineering (it allows you to write more correct and more secure code) but can lead to surprising results:

string message = "Player ";
message += playerNum.ToString();
message += " wins by ";
message += points.ToString();
message += " points!";

That short code snippet allocates not one but six strings at run time: the final message object contains “Player 1 wins by 6 points!”, for example, while the strings “1”, “Player 1”, “Player 1 wins by ”, “6”, and “Player 1 wins by 6” are all garbage! (The literals “Player ”, “ wins by ” and “ points!” are interned when the code is loaded, so they are not allocated again on every call.) System.String is an immutable, atomic, reference type. This was absolutely the right choice by the .NET architects (strings in .NET are wondrous things of great beauty compared to strings in C++) but leads careless game programmers to watch helplessly as their frame rate plummets. Code like the above is probably the number one cause of unwanted garbage, so watch out for it. If you must build strings piece by piece, use a StringBuilder object. (And, if you find yourself creating a type that is atomic and immutable, consider creating a non-atomic mutable MyTypeBuilder class to work with it.)

I know what you’re thinking. “It would generate less garbage if you didn’t call playerNum.ToString() and points.ToString(), and just passed in playerNum and points!” Nice try, but no. Concatenating a string with something that isn’t a string goes through an overload that takes System.Object and calls System.Object.ToString() on it. That means, if you passed playerNum into it, it would box playerNum and still generate the extra string. Boxing and strings, the two worst garbage generators, happening at once: you’ve made your garbage problem worse, not better!
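Going back to the StringBuilder suggestion, a minimal sketch might look like this. ScoreText and Build are names I've invented; the builder is created once and reused, so the intermediate strings are never allocated. One caveat I should be upfront about: depending on the framework version, Append( int ) may still create a small temporary string internally, and ToString() always allocates the final string.

```csharp
using System;
using System.Text;

static class ScoreText   // hypothetical helper, for illustration only
{
    // Created once and reused. 64 is an arbitrary starting capacity.
    private static readonly StringBuilder s_builder = new StringBuilder( 64 );

    public static string Build( int playerNum, int points )
    {
        s_builder.Length = 0;               // empty the builder without allocating a new one
        s_builder.Append( "Player " );
        s_builder.Append( playerNum );      // Append( int ) overload: no boxing
        s_builder.Append( " wins by " );
        s_builder.Append( points );
        s_builder.Append( " points!" );
        return s_builder.ToString();        // the one string we actually want
    }

    static void Main()
    {
        Console.WriteLine( Build( 1, 6 ) );   // Player 1 wins by 6 points!
    }
}
```

If the text only changes occasionally (when the score changes, say), it is also worth caching the resulting string rather than rebuilding it every frame.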

There’s a lot to understand about garbage but I hope the last couple of articles have opened your eyes somewhat. I’ve approached it in a slightly unorthodox way – most writers will start with an article like this one, explaining the dos and don’ts, and probably never even cover the topics in the previous article which showed the whys and wherefores. Personally, I like to know what’s going on “under the hood”. You can memorise the rules relating to garbage and write good code, but until you understand the reasons behind the rules you might not write brilliant code. There is one important thing related to garbage I’ve not covered – that’s the Dispose pattern. This book (there it is again!) explains what it is and how to implement it, much better than I can; the only thing it doesn’t really cover is the detail of how objects waiting for disposal relate back to the mechanics of garbage collection I covered in Part One, so I might go over that at some point but I’m not promising anything.

I hope this appeases those of you who wanted a follow-up to Part One. I misjudged what people wanted, and apologise for that. There is one way to make sure it doesn’t happen again: your feedback is always welcome, so let me know what’s useful, what wasn’t clear, and what you’d like me to write about in future (though I reserve the right to choose not to :-).

“There’s four and twenty million doors, on life’s endless corridor” – Oasis


15 Comments »

  1. Thanks for a great followup – I hadn’t really realized that carelessness with strings would create that much garbage (although I had the related knowledge – just didn’t put them all together in one place…)
    This will help me out a lot later on – I was planning on some silly tricks with strings and object names to get in-game events working easily, but that sounds like it might not be the best idea now.
    And you certainly sold at least one copy of Effective C# – I hope you’re getting a good commission!

    ~Alex

    Comment by Alex — November 12, 2008 @ 1:59 am | Reply

  2. Well I am from a PC (non-game) development background using C#, and the GC there does its job well, so it makes me a lazy developer I guess, with no real need to worry about the GC; so it’s nice to see what is actually going on under the hood, especially as the compact framework has such a bad (wrong words I know..) GC. This will help my XNA 360 development no end.

    So glad you posted it.
    :)

    The XNA community is an odd place when it comes to blogs, goes very quiet one moment, then everyone is commenting…maybe it’s just my blog that happens on lol

    Comment by Charles Humphrey — November 12, 2008 @ 2:26 pm | Reply

  3. Thanks for posting it, great article. Nice to see you listen to your readers! Great blog :)

    Comment by Venatu — November 12, 2008 @ 4:58 pm | Reply

  4. Good work, cheers. Nice description of the reference/value type differences, and the boxing/unboxing overhead. Keep ‘em coming :)

    Comment by JB — November 12, 2008 @ 5:40 pm | Reply

  5. Andy, good follow up, very interesting. I wasn’t aware of the GC havoc that careless use of strings could wreak. Keep up the excellent posts.

    Comment by Kevin — November 13, 2008 @ 5:13 pm | Reply

  6. Good work Andy! I’ve read a lot of C# texts, but you have a nice, easy to understand, way of explaining important issues.

    Kostas

    Comment by thinkinggamer — November 17, 2008 @ 11:30 am | Reply

  7. Just to chime in — I stumbled onto this article, which wasn’t really what I was searching for, but a pleasant surprise instead. I’m familiar with these issues from doing a lot of Java work, but it’s nice to see some detailed treatment of value types (which we sadly lack). I also appreciated the link to CLR Profiler; the free tools are so much better than last time I was playing with .Net! (And anything’s better than running your code in a loop and checking GC stats before and after… it works, but it’s a hassle after a few passes.)

    Comment by Seth — November 19, 2008 @ 12:06 am | Reply

  8. [...] Bittermanandy served up a second course on dealing with garbage in C# as it relates to the XNA Framework. [...]

    Pingback by XNA Team Blog : Creators Club Communiqué 09 — November 19, 2008 @ 3:03 am | Reply

  9. Thanks for another great post!

    I have a couple of nit-picking comments:

    “Second, it’s not completely accurate to say that value types never generate garbage. They can, of course, contain reference members, in which case creating a new struct object can indirectly create a new class object, which generates garbage exactly as though you’d created the class object yourself. It’s fair to say that this is a slightly unusual thing to do (I’m not sure I can think of an example?) but it’s entirely legal and something to watch out for.”

    Well, you will never indirectly create a new class object by creating a struct that contains reference members, unless you use a specific non-default struct constructor that does that, and since you can’t override the default struct constructor it will probably not happen unless you did want it to happen!

    Also, it’s not entirely correct to say that value types cannot inherit at all – structs can implement interfaces :)

    Waiting for a new post – keep it up! :)

    Comment by Filip — November 29, 2008 @ 12:27 am | Reply

  10. Great article, I finally understand garbage collection :). I’ve never been able to learn something without knowing the “why and how”; memorising rules puts too much trust in what someone else thinks is the right thing to do. So thank you for explaining it *Goes off to clean code*

    Comment by BlackSpiderWolf — December 10, 2008 @ 2:02 pm | Reply

  11. Great article !
    BTW doesn’t a 32-bit system mean that the size of a reference is 4 bytes and not 8?

    Also I think the string concat example would only generate 3 garbage strings because of string interning.

    To Filip:
    structs can indeed implement interfaces, but if you use them through the IF then boxing (and possible garbage) will occur.

    Comment by P4 — February 2, 2009 @ 10:29 am | Reply

    • I believe – though I am fully prepared to be shown that I am wrong on this – that a reference in a 32-bit system is 8 bytes because, as well as a four-byte pointer, it also contains a four-byte type ID.

      It’s also possible I miscounted the number of strings that become garbage in my example – the important thing is that string operations generate lots of garbage in places that you may not expect…

      Glad you liked the article anyway!

      Comment by bittermanandy — February 2, 2009 @ 1:23 pm | Reply

  12. Once I read a good article about this, but unfortunately I cannot find it now.
    Here is another one:
    http://msdn.microsoft.com/en-us/magazine/cc163791.aspx

    Comment by P4 — February 4, 2009 @ 11:22 am | Reply

  13. Still trying to come to grips with GC; this was very helpful, I now have some idea of what it actually is (as opposed to yet another acronym).

    Comment by Nicholas — March 30, 2009 @ 8:36 pm | Reply

  14. [...] was quickly followed by this site, Pandemonium, which was another that simply asked us to reduce the amount of garbage being collected. It [...]

    Pingback by Don’t Call GC.Collect Every Frame | Chad Stewart: Game Programmer — December 9, 2009 @ 5:51 pm | Reply

