Pandemonium

August 20, 2008

Tools of the Trade: Part Three – CLRProfiler

Filed under: Tools and Software Development — bittermanandy @ 12:11 am

Over the last few days I’ve come to realise just how many excellent C#/XNA/programming tools there are out there that I use regularly, and therefore how large this mini-series will become if I continue with this pattern of one article per tool. Still, there remain a few that I think are important enough to talk about in detail, before I summarise the rest more concisely (though leaving open the possibility of returning to them later).

So by now, you will hopefully be keeping the compiler warning level as high as possible, checking your code for style-based or semantic errors with FxCop, and when necessary, checking how the assemblies you’re interacting with do the things they do with Reflector. Now, perhaps, you’re starting to wonder about what the code you’re writing is doing at run time.

Probably the most notable difference between your XNA/C# game and one written in C++ (which is, or at least has been for many years, the de facto language of the games industry) is the way it deals with memory. C# is a managed language. This means that where in C++ you use “new” to allocate memory and “delete” to deallocate it (with the possibility of leaks, scribbles and access violations), in C# you use “new” to get a reference to an object, and the garbage collector tidies it up for you when you’re finished with it.

There is a lot that I could write about the garbage collector. It’s probably fair to say that dealing with garbage correctly is key to programming in C#. In fact, until two minutes ago this article was about 1000 words longer, until I realised that I’d written a whole bunch of stuff more relevant to GC itself than the tool the article is about, and even then had either gone into too much or not enough detail depending on your point of view. (Perhaps something for a future article, or even series, if there is demand). In summary:

– GC on Windows is not a good thing and you should try to avoid generating garbage where reasonably convenient.
– GC on Xbox is a very bad thing, and if your game is intended for Xbox 360 you should go out of your way to avoid it everywhere possible.

The other day I was working on Pandemonium and noticed that my frame rate had plummeted from 1800 Hz in the morning to 200 Hz by the evening. Now, 200 Hz is more than enough – for a finished game! But I only had one character running around a simple background so it seemed a bit low. Losing 1600 Hz in a day is not a good day. Luckily, using some of the Kensei.Dev.Options I’ve shown before, it took just a couple of mouse clicks to realise that I was suffering lots of garbage collections. This was confirmed when I used another Kensei.Dev.Option to show more details:

    if ( Options.GetOption( "Profile.ShowGarbage" ) )
    {
        Kensei.Dev.DevText.Print( "Max Garbage Generation: " + GC.MaxGeneration.ToString(), Color.LimeGreen );
#if !XBOX
        for ( int i = 0; i <= GC.MaxGeneration; ++i )
        {
            Kensei.Dev.DevText.Print( "Generation " + i.ToString() + ": " + GC.CollectionCount( i ).ToString() + " collections", Color.LimeGreen );
        }
#endif
        Kensei.Dev.DevText.Print( "Total Memory: " + GC.GetTotalMemory( false ).ToString( "n0" ), Color.LimeGreen );
    }
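If you don’t have Kensei.Dev to hand, the same figures can be read straight from the GC class. What follows is only a sketch: GarbageReport and Build are names invented for this illustration, and it returns the text as a string rather than drawing it on screen. It also assumes a Windows build, since the snippet above compiles the GC.CollectionCount loop out for Xbox.

```csharp
using System;
using System.Text;

// Hypothetical stand-alone version of the overlay above. It builds the same
// lines of text but returns them instead of printing them with DevText.
public static class GarbageReport
{
    public static string Build()
    {
        StringBuilder report = new StringBuilder();
        report.AppendLine("Max Garbage Generation: " + GC.MaxGeneration);

        // One line per generation: how many times it has been collected so far.
        for (int i = 0; i <= GC.MaxGeneration; ++i)
        {
            report.AppendLine("Generation " + i + ": " + GC.CollectionCount(i) + " collections");
        }

        // Passing false means "do not force a collection just to tidy up the number".
        report.AppendLine("Total Memory: " + GC.GetTotalMemory(false).ToString("n0"));
        return report.ToString();
    }
}
```

Calling Console.WriteLine( GarbageReport.Build() ) every second or so is enough to see whether the generation 0 count is climbing. Note that building the report creates a few strings of its own, just as the DevText prints do.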

It was clear that something was generating lots of garbage. Suspicion soon fell on my background hits implementation, as that’s what I’d worked on that day. It was deliberately simple, just an array of triangles, each with a BoundingBox, against which I would test lines and swept spheres. I know an octree or kd-tree would be better but that’s something I can address later. It might not be a very quick method, but it shouldn’t be generating garbage.

Visual inspection of the code revealed nothing so to find the cause, I fired up the CLRProfiler (top tip: make sure to use version 2.0 and run as Administrator) and asked it to run my game. The first thing you should note is that it is invasive: my frame rate dropped from about 200 Hz without it, to less than 10 Hz with it running.

The CLR Profiler

So, clear the “Profiling Active” checkbox, only then “Start Application”, get yourself to the part of your game that’s slow, and only then tick “Profiling Active” for a few seconds before clearing it again. That will generate a report.

A CLR Profiler report

109 Generation 0 garbage collections is quite a lot in the few seconds I ran the game for. Ideally, a game would have zero garbage collections during gameplay. (That’s not possible if you’re using XACT for sound, but it’s a good ideal to get as close to as possible). The Time Line button in the centre gives the best view of when GCs happen and why.

The CLR Profiler Timeline

The coloured peaks show garbage, which you can see was increasing very sharply before dropping as each collection occurred. The vertical black line can be moved to show what the garbage consists of at each time slice; here I placed it just before a GC, and it shows me that 2MB (99% of my garbage) was from System.SZArrayHelper.SZGenericArrayEnumerator<T>. The rest was basically strings generated by the Kensei.Dev prints, listed above, and can be ignored. But – where did this SZGenericArrayEnumerator come from?

Back on the report page, I was interested in Heap Statistics, and wanted to see what had been allocated, so clicked Allocation Graph at the top. This shows a diagram indicating where all the garbage comes from. The highest level is on the left, and you go progressively further right to get more detail. So, Program::Main is the first significant box on the left and will have created 100% of the garbage – so all the garbage in your program was created by your program, go figure – but what’s lower down (i.e. to the right)?

The Allocation Graph

As expected, 99% of the garbage is SZGenericArrayEnumerator – which comes from SZArrayHelper::GetEnumerator – which comes in turn from BoundingBox::CreateFromPoints. There was only one place this appeared in my code, and yes, it was bad. In order to provide a (small) speedup to my collision detection, which I’ve already mentioned I knew to be non-optimal but wasn’t ready to spend time on yet, I’d done a Line-BoundingBox check to reject misses before doing a Line-Triangle check. And, being lazy and naughty, I’d created a new BoundingBox every test:

    foreach ( Triangle triangle in data.Triangles )
    {
        BoundingBox box = BoundingBox.CreateFromPoints( triangle.Vertices );

Just to be clear here: foreach has gotten a really bad reputation for garbage because in CLR 1.0 the enumerator that foreach uses to traverse the container was of reference type. Some people, wrongly, claim that you should avoid it for that reason. That’s nonsense – in CLR 2.0 onwards, foreach is safer, easier to read, and at least as performant as a for loop. You should definitely prefer foreach in almost all cases. All my garbage was coming not from foreach, but from BoundingBox.CreateFromPoints.

The fix was easy – instead of creating a new BoundingBox for each Triangle for each hit test, I’d store the BoundingBox with the Triangle when they were created in the Content Pipeline. No more garbage, or at least, none at runtime; and the Content Pipeline doesn’t care about garbage. Really, I ought to have done it that way in the first place, so a definite slap on the wrist for me.
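As a rough sketch of the “store the box with the triangle” idea: the Vector3 and BoundingBox structs below are minimal stand-ins for the XNA types so that the example compiles on its own, and Triangle, BackgroundHits and CountCandidates are hypothetical names. Two simplifications are assumed: the box is computed in the constructor rather than in the Content Pipeline, and the early-out is a box-against-box overlap rather than the Line-BoundingBox check used in the game.

```csharp
using System;

// Minimal stand-ins for XNA's Vector3 and BoundingBox (assumption: a real
// game would use the Microsoft.Xna.Framework types instead).
public struct Vector3
{
    public float X, Y, Z;
    public Vector3(float x, float y, float z) { X = x; Y = y; Z = z; }
}

public struct BoundingBox
{
    public Vector3 Min, Max;
    public BoundingBox(Vector3 min, Vector3 max) { Min = min; Max = max; }

    // True if the two boxes overlap (touching counts as overlapping).
    public bool Intersects(BoundingBox other)
    {
        return Min.X <= other.Max.X && Max.X >= other.Min.X
            && Min.Y <= other.Max.Y && Max.Y >= other.Min.Y
            && Min.Z <= other.Max.Z && Max.Z >= other.Min.Z;
    }
}

// Hypothetical triangle type: the box is worked out once, when the triangle
// is built, and only read back during hit tests.
public sealed class Triangle
{
    public readonly Vector3 A, B, C;
    public readonly BoundingBox Bounds;

    public Triangle(Vector3 a, Vector3 b, Vector3 c)
    {
        A = a; B = b; C = c;
        Bounds = new BoundingBox(
            new Vector3(Min3(a.X, b.X, c.X), Min3(a.Y, b.Y, c.Y), Min3(a.Z, b.Z, c.Z)),
            new Vector3(Max3(a.X, b.X, c.X), Max3(a.Y, b.Y, c.Y), Max3(a.Z, b.Z, c.Z)));
    }

    private static float Min3(float x, float y, float z) { return Math.Min(x, Math.Min(y, z)); }
    private static float Max3(float x, float y, float z) { return Math.Max(x, Math.Max(y, z)); }
}

public static class BackgroundHits
{
    // Counts the triangles whose stored box overlaps the query box. Nothing
    // is created per triangle here, so the loop itself makes no garbage.
    public static int CountCandidates(Triangle[] triangles, BoundingBox query)
    {
        int candidates = 0;
        foreach (Triangle triangle in triangles)
        {
            if (triangle.Bounds.Intersects(query))
            {
                ++candidates;
            }
        }
        return candidates;
    }
}
```

Only the triangles that survive this cheap rejection would go on to the more expensive Line-Triangle test.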

Only one thing still niggled. BoundingBox is a value type (created on the stack) so why was BoundingBox.CreateFromPoints creating garbage? The answer comes when you look back at the screenshot of Reflector in the previous article. BoundingBox.CreateFromPoints has a foreach loop in the middle of it, and foreach creates garbage.

“But Andy, wait!” I hear you cry. “You just said foreach doesn’t create garbage, in fact, you said that was nonsense!” Well, yes, though I did insert the word “almost” in a key position. The truth is that foreach does not create garbage when it loops directly over an array, a List, a LinkedList or most other containers, with Collection<T> the notable exception. However, BoundingBox.CreateFromPoints has been designed to handle any container via the IEnumerable interface, which means the enumerator has to be fetched through that interface as an object on the heap: a struct enumerator such as List’s gets boxed, and for an array the runtime hands back the SZGenericArrayEnumerator object we saw in the profiler. Anything on the heap becomes garbage once it is finished with.
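To see the difference for yourself, something along these lines should do. It is a sketch rather than a benchmark: ForeachGarbage and its methods are names made up for this illustration, and the measurement simply counts generation 0 collections in the same way as the overlay earlier in the article.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper class, named for this illustration only.
public static class ForeachGarbage
{
    // The compiler can see that 'values' is an array, so it turns this
    // foreach into an indexed loop and needs no enumerator object.
    public static int SumArray(int[] values)
    {
        int total = 0;
        foreach (int value in values)
        {
            total += value;
        }
        return total;
    }

    // Same loop, but the compiler only sees the interface. It has to call
    // GetEnumerator() and use whatever object comes back.
    public static int SumSequence(IEnumerable<int> values)
    {
        int total = 0;
        foreach (int value in values)
        {
            total += value;
        }
        return total;
    }

    // Runs 'work' the given number of times and reports how many
    // generation 0 collections happened while it ran.
    public static int Gen0CollectionsDuring(Action work, int repeats)
    {
        int before = GC.CollectionCount(0);
        for (int i = 0; i < repeats; ++i)
        {
            work();
        }
        return GC.CollectionCount(0) - before;
    }
}
```

Pass the same int[] to Gen0CollectionsDuring once wrapped in a call to SumArray and once in a call to SumSequence, with a few million repeats, and compare the two counts. On the runtimes discussed in this article I would expect the SumSequence count to be the larger one; newer runtimes may well optimise the difference away, so treat it as something to measure rather than assume.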

I’m still a little surprised that an array-specialised version of BoundingBox.CreateFromPoints isn’t provided but then, I guess it’s not exactly difficult to write if you desperately need it – especially given that Reflector shows how! In any case, I hope that this demonstration of how CLRProfiler helped me with a garbage problem has shown how it can help you, too. Leave a comment and let me know how helpful you’re finding these articles.
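For what it’s worth, an array-only version might look something like the sketch below. It is not the XNA implementation: the Vector3 and BoundingBox structs are minimal stand-ins so the example compiles on its own, BoundingBoxHelper is a hypothetical name, and throwing on an empty array is my assumption about sensible behaviour.

```csharp
using System;

// Minimal stand-ins for XNA's Vector3 and BoundingBox (assumption: a real
// game would use the Microsoft.Xna.Framework types instead).
public struct Vector3
{
    public float X, Y, Z;
    public Vector3(float x, float y, float z) { X = x; Y = y; Z = z; }
}

public struct BoundingBox
{
    public Vector3 Min, Max;
    public BoundingBox(Vector3 min, Vector3 max) { Min = min; Max = max; }
}

public static class BoundingBoxHelper
{
    // Array-only counterpart to BoundingBox.CreateFromPoints. Indexing the
    // array directly means no enumerator is ever requested.
    public static BoundingBox CreateFromPoints(Vector3[] points)
    {
        if (points == null)
        {
            throw new ArgumentNullException("points");
        }
        if (points.Length == 0)
        {
            throw new ArgumentException("At least one point is required.", "points");
        }

        // Start from the first point and grow the box to fit the rest.
        Vector3 min = points[0];
        Vector3 max = points[0];
        for (int i = 1; i < points.Length; ++i)
        {
            Vector3 p = points[i];
            if (p.X < min.X) min.X = p.X;
            if (p.Y < min.Y) min.Y = p.Y;
            if (p.Z < min.Z) min.Z = p.Z;
            if (p.X > max.X) max.X = p.X;
            if (p.Y > max.Y) max.Y = p.Y;
            if (p.Z > max.Z) max.Z = p.Z;
        }
        return new BoundingBox(min, max);
    }
}
```

Because everything involved is a value type and the loop uses an index, the only allocation in sight is the array that the caller already owns.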

“First, solve the problem. Then, write the code.” – John Johnson

18 Comments »

  1. I love this mini series. While I was already using the CLRProfiler for some time now, the information about FxCop was very welcome.
    I also like how you give an example of actually using the tools you talk about.

    Comment by Catalin Zima — August 20, 2008 @ 12:56 am | Reply

  2. Good stuff. I quite enjoy this series and have been trying out all these tools along the way. Will there be anything on NUnit or unit testing in general?

    Comment by smack0007 — August 20, 2008 @ 8:21 am | Reply

  3. I find these articles 0.9 helpful on the helpful scale, where 0 is useless and 1 is awesome 🙂

    Comment by Roel — August 20, 2008 @ 10:35 am | Reply

  4. Great articles. Looking forward to what you have for us next.
    More teasing video footage wouldn’t go amiss 😀

    Comment by Conkerjo — August 20, 2008 @ 10:58 am | Reply

  5. Thanks for the awesome feedback guys. 0.9H eh? Not bad!

    smack0007: I won’t be going into NUnit in detail as (to my shame) I don’t currently use it. Everyone’s got one bad habit, and mine is not doing unit tests – even though I know they are quick to write, prevent errors later, provide a living form of documentation and are invaluable during development to both the original author and anyone else who uses the code. I know all that, and it’s all true, and I’d recommend unit tests for anyone else… but somehow I still just haven’t got into the habit of doing it yet. I feel like I’m accumulating bad karma. This will probably keep me from nirvana.

    Comment by bittermanandy — August 20, 2008 @ 12:48 pm | Reply

  6. Wow, these articles are great. I have learned a lot in the past weeks. Yours and Shawn’s blogs are my favourites at the moment. Keep that good work going 🙂

    Comment by Ginie — August 20, 2008 @ 2:44 pm | Reply

  7. Don’t feel bad, I don’t do it either. I currently write PHP code on top of the Zend Framework for a living though and we are discussing how to best start our own library on top of Zend and one of the sticking points is that we must start doing unit tests. I know a little bit about them but never used them extensively.

    Comment by smack0007 — August 20, 2008 @ 5:55 pm | Reply

  8. Great articles, between you and Shawn I’ve picked up on loads of advice I would never have found any other way. In this article you mentioned how many tools you use. I would most certainly be interested in usage examples like the last few posts, so please keep them coming! If you have the time, please post those articles on the GC also. Great blog, keep it up

    Comment by Venatu — August 20, 2008 @ 6:38 pm | Reply

  9. Great article series so far! I’m still pretty new to production C# and XNA, so getting to see how to use real tools and solve real problems is a huge boon to me.

    Thanks very much! Keep up the good work!

    Comment by Alex — August 20, 2008 @ 11:44 pm | Reply

  10. Very helpful! Keep up the good work!

    Comment by Andrew — August 22, 2008 @ 9:41 am | Reply

  11. Good job. Enjoyed them all so far.

    Comment by nsthsn — August 28, 2008 @ 6:02 am | Reply

  12. […] If you think garbage may be causing you problems, the CLRProfiler is your […]

    Pingback by Garbage, Part One: Stack and Heap « Pandemonium — October 19, 2008 @ 1:06 pm | Reply

  13. […] Profiling tools (This blog is generally awesome) […]

    Pingback by Protoplay London 2008 and XNA 360 Garbage « Luke’s Devblog — November 5, 2008 @ 3:03 pm | Reply

  14. Hi Andy,

    would the CLRProfiler 2.0 work (correctly) with .NET 3.5? I’ve just upgraded to XNA 3.0 and I can’t seem to find a CLRProfiler 3.5…

    Thanks,

    Kostas

    Comment by thinkinggamer — December 17, 2008 @ 8:26 am | Reply

  15. I know of no reason why it wouldn’t, .NET 3.5 is mostly the same as .NET 2.0 at runtime anyway, most of the new stuff is compile time. I seem to recall using it the other day and it all seemed fine? It’s a free, small download… try it out and see :-).

    Comment by bittermanandy — December 17, 2008 @ 9:22 am | Reply

  16. I did and it has already indicated a silly bug in my code. Seems to work ok! 🙂

    Thanks for the tips in this blog, you make my work much easier (I am using XNA Game Studio to teach game development at the local Uni).

    Kostas

    Comment by thinkinggamer — December 19, 2008 @ 11:09 am | Reply

  17. […] delvings into XNA, and in one of these chronicles, he touched on the subject of the CLR Profiler again. That same year, jwatte gave away a free potion that could be used for performing the profiling […]

    Pingback by Scrolls from the past: Profiling « Sgt. Conker — November 4, 2009 @ 10:04 am | Reply

  18. […] this case, as the newer version of the CLR runs through foreach loops much better, as explained in this article about memory profiling. Nothing really should have to move to the heap […]

    Pingback by Cutting down on garbage collection « Electronic Meteor — December 4, 2011 @ 9:46 am | Reply

