Wednesday, August 26, 2009

Atomics and Threaded OpenGL

My previous post describes how to load a VBO or texture on a thread, then "send it" to the main thread for use. This is almost exactly how X-Plane loads its VBOs on worker threads.

For textures we use a slightly different strategy. The problem is that we (theoretically) need our textures immediately - that is, if we hit a part of the scene graph that references an unloaded texture, we still need to draw something.

Rather than have the state setup go into some fallback "untextured" mode, we use a proxy-and-switch scheme.
  1. When we first hit the need for a texture, we create one of our C++ objects that manages the texture. This texture object is initialized to refer to a dummy gray texture that we use as our proxy for all unloaded textures.
  2. We queue the texture object to be loaded approximately whenever we get around to it. This might come up soon or it might take a while, depending on how much background work is being done.
  3. When a worker thread finally gets around to loading the texture, it loads the real image into a new texture "name" (GLuint texture ID).
  4. When the load is done and flushed, we do an atomic swap, swapping out the old gray proxy texture and swapping in the new texture.
The advantage of this is that the rendering code can run at full speed, using this texture object (with its OpenGL texture ID inside it) without any locking or checking whatsoever; it's a solution that has zero cost for the rendering engine.

And since the C++ texture object always has something in it, we can use the same shaders even before we've loaded, which simplifies the casing for shader setup quite a bit.
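
Here is a minimal sketch of the proxy-and-switch idea in modern C++. It is not X-Plane's actual code: `TextureObject`, `swap_in`, `is_shared_proxy` and `kGrayProxy` are hypothetical names, `TextureName` stands in for `GLuint` so the sketch compiles without the GL headers, and `std::atomic` is assumed as the atomic primitive. The real image upload and flush would happen on the worker thread before `swap_in` is called.

```cpp
#include <atomic>
#include <cassert>

// Stand-in for OpenGL's GLuint so this compiles without GL headers.
using TextureName = unsigned int;

// Hypothetical: name of the shared gray proxy texture, assumed to be
// created once at startup. The value 1 is only a placeholder.
constexpr TextureName kGrayProxy = 1;

class TextureObject {
public:
    // Rendering thread: one atomic load, no lock. The result is always
    // a usable texture name, either the gray proxy or the real texture.
    TextureName name() const { return name_.load(); }

    // Worker thread: call only after the new texture is fully loaded
    // and flushed. Publishes the new name and returns the one it replaced.
    TextureName swap_in(TextureName loaded) {
        assert(loaded != 0);  // 0 is GL's default texture, not a loaded one
        return name_.exchange(loaded);
    }

private:
    std::atomic<TextureName> name_{kGrayProxy};
};

// The gray proxy is shared by every unloaded texture object, so a
// loader must not delete it when it comes back from swap_in().
inline bool is_shared_proxy(TextureName old_name) {
    return old_name == kGrayProxy;
}
```

Because `swap_in` hands back the name it replaced, the same call covers the reload case: the loader gets the old texture name and can delete it, unless it is the shared proxy.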


  1. Why does it have to be an atomic swap? It gets loaded in a queue anyway, so there is always a chance that the gray texture will be rendered. At any point in time the texture ID will equal either the gray texture or the loaded texture; it won't be anything else "mid-swap". Can't we just do a normal swap and risk showing the gray texture for one more frame?

  2. The loading code is the same as the re-loading code, which needs to deallocate the old texture once it is swapped out for the new one. In this case, if we didn't have all of the goodness of an atomic op:

    we could have this calling sequence:

    Thread A.
    "Swap". Write of the swap to memory has not gone through to REAL memory yet.
    Delete old texture.

    Thread B.
    Read memory (not yet synchronized). Uh oh, texture is gone!

    (I realize that it is VERY unlikely that the caches could be out of sync through that much code - the GL texture management stack is going to do a lot. Still, I do believe the atomic op is necessary for _correctness_ of the reload and deallocate case.)