So first of all: told you so.
Now that I got that out of my system: it was interesting to read Apple's documentation on how queues (their name for message queues whose messages are "do this") solve threading problems. The docs spend a lot of time describing how much you won't have to lock, but I don't think this is entirely true. The rule is pretty simple:
Resources that are owned entirely and exclusively by the "task" that has been dispatched to the queue do not need to be locked, because a task cannot be run in two places at once.And that alone is a pretty nice improvement! (If I could have a nickel for every forum thread I read where people suggest two threads and a mutex as a way to do concurrent processing. The over-use of a mutex as something it is not is probably it's own blog topic.) At least if we can separate our data object to be processed from the world, now it is lock free.
But what do we do if that processing task needs to talk to the rest of our application? There are basically only a few options:
- Make the APIs that we have to call non-blocking and lock-free.
- Put fine-grained locking into the various APIs (e.g. use internal locks as needed). Now there is a risk that our tasks can block, but perhaps it won't be too bad. Use a performance analyzer to see if this is hosing us.
- Put some kind of course-grained lock on the entity the task will need - perhaps for the entire execution of the task, perhaps for longer. At this point you have to wonder if you're losing concurrency.
An example of strategy 1: by using atomics to swap textures into their container objects, we don't need to obtain a lock on a texture object to use it.
Strategy 1 and 2 combine: a central registry of objects is used only for async loader code. Once we have an object, we are lock free. (Objects are reference counted, and reference counting can be lock-free - see also the glibc++ string implementation.)
A final comment: Apple suggests replacing fine-grained resource locking (e.g. grab a lock, do a few things, release the lock) with queueing a "block" (a small piece of code) on a serial queue (a message queue with only one thread). This can be done synchronously or asynchronously.
- It's refreshing to see someone abusing threading primitives by suggesting that a semaphore be used instead of a mutex - normally on the net I see people doing just the opposite (and wondering why their design has 5000 edge cases and blows up as soon as they add a third thread).
- If the block is queued asynchronously this is a performance win. Except for the part where you don't know if your work has actually been done. It's a great refactoring if you can get away with it but I wouldn't expect to hit this case any time soon.
- If the block is queued synchronously this solution is exactly the same concurrency wise as a mutex.
By comparison, my understanding is that a message queue is at least a message queue - I would normally expect the serial thread with sync queue to have to "toss" the work to another thread, then wait (because if the main thread did the work, the worker thread would be available to pick up work tasks out of order). But perhaps Apple has optimized this, with a fast path to execute synchronously queued tasks on the calling thread when a serial queue is "idling".