Thursday, March 10, 2011

DECAFBAD

Makes you write code like this...
if ((ent = dynamic_cast(what)) && ent->GetGISClass() == gis_Composite) true;
At least C++ isn't judgmental.

Monday, March 07, 2011

Instancing Numbers


A quick stat on instancing performance. Plenty of OpenGL developers post their instancing performance numbers, and others ask for them, so here are X-Plane's.

On a 2.8 GHz Mac Pro (a few years old) with an ATI 4870 and OS X 10.6.6, we can push 87,000 meshes at just under 60 fps using instancing. On average, each instanced draw call pushes 32 instances.
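For a rough sense of scale, here is the back-of-the-envelope arithmetic behind those figures. This is only a sketch: the function names are made up for illustration (they are not X-Plane code), and 60 fps should be read as an upper bound, since the measured rate was just under that.

```cpp
// Back-of-the-envelope helpers; names are hypothetical, not X-Plane's.

// Draw calls needed per frame when each call carries a batch of instances.
constexpr double draw_calls_per_frame(double meshes_per_frame,
                                      double instances_per_call) {
    return meshes_per_frame / instances_per_call;
}

// Mesh throughput per second at a given frame rate.
constexpr double meshes_per_second(double meshes_per_frame, double fps) {
    return meshes_per_frame * fps;
}
```

Plugging in the numbers above, `draw_calls_per_frame(87000, 32)` is 2718.75, so roughly 2,700 draw calls per frame, and `meshes_per_second(87000, 60)` is 5.22 million meshes per second at most.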

Don't Go Anywhere!

I'm debugging X-Plane's autogen engine. In debug mode, with no inlining, no optimizations, and a pile of safety checks, the autogen engine is not very fast. Fortunately, my main development machine has 8 cores, and the autogen engine is completely thread-crazy. The work gets spooled out to a worker pool and goes...well, about 8 times as fast.

All is good and I'm sipping my coffee when I hit a breakpoint. Hrm...looks like we have a NaN. Well, we divided by a sum of some elements of a vector. What's in the vector?
print ag_block.spellings_s[0].widths[1]
Ah...8 tiles. At this point I am already dead. If you've debugged threaded apps you already know what went wrong:
  • The array access operator in vector is really a function call (particularly in debug mode - we jam bounds checks in there).
  • GDB has to let the application 'run' to run the array operator, and at that instant, the sim's thread can switch.
  • The new thread will run until it hits some kind of breakpoint.
  • If you have 8 threads running the same operation, you will hit the breakpoint you expect...but from the wrong thread.
To say this makes debugging a bit confusing is an understatement.
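The situation in the list above can be reproduced in miniature. The sketch below is an assumption-laden stand-in, not the autogen engine: `sum_widths` and `run_workers` are hypothetical names, it starts one thread per block rather than using a real worker pool, and `at()` plays the role of a bounds-checked debug-mode `operator[]`.

```cpp
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical per-block job. at() is bounds-checked, so each element
// access is a real function call, much like a debug-mode operator[].
double sum_widths(const std::vector<double>& widths) {
    double total = 0.0;
    for (std::size_t i = 0; i < widths.size(); ++i)
        total += widths.at(i);
    return total;  // a breakpoint here can be hit by any of the workers
}

// Run the same job on every block, one thread per block (a simplification
// of a worker pool). Each thread writes only its own results slot.
std::vector<double> run_workers(const std::vector<std::vector<double>>& blocks) {
    std::vector<double> results(blocks.size(), 0.0);
    std::vector<std::thread> workers;
    for (std::size_t i = 0; i < blocks.size(); ++i)
        workers.emplace_back([&blocks, &results, i] {
            results[i] = sum_widths(blocks[i]);
        });
    for (std::thread& t : workers)
        t.join();
    return results;
}
```

With a breakpoint on the `return` in `sum_widths`, every worker is a candidate to stop there, so after evaluating an expression that calls into the program it is worth checking `info threads` to see which thread you are actually looking at.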

A brute force solution is to turn off threading - in X-Plane you can simply tell the sim that your machine has one core using the command line. But that means slow load times.

Fortunately gdb has these clever commands:
set scheduler-locking on
set scheduler-locking off
With scheduler locking on, gdb resumes only the current thread, so no other thread can run (and steal your breakpoint) while you step or evaluate expressions. This is handy before an extended inspection session with STL classes. You can apparently also set scheduler-locking to 'step', which lets other threads run when you continue but not when you single-step, but I haven't needed that yet.

Sunday, March 06, 2011

CSM for Dummies

This quote from NVidia's GPU Programming Guide amused me:
There are many techniques available. However, the general recommendation is
that unless you know what you are doing you should just implement simple
multi-tap cascaded shadow maps.
Or put another way:
If you have no idea what the hell you're doing, try cascaded shadow maps -- what could go wrong?
Oh wait, X-Plane 10 uses CSM. Well, I guess that's for the best...

(The guide also suggests that "3 levels are sufficient to provide good shadow detail for any scene." Have they seen our scene graph?)