Tuesday, November 28, 2006

Returning an Array in C - the Least of 3 Evils

In C++ you can write things like this:
void get_some_strings(vector& out_strings)
for (some loop)
Easy! How do we do this if our API has to be pure C? Well, there are three options.

First you could always allow the function to allocate memory that's deallocated by the caller. I don't like this option at all. To me, it's a walking memory-leak waiting to happen. You'll have to check for memory leaks every time you use the function and there are no compile-time mechanisms to encourage good programming technique.

A second option is to pass in some preallocated buffers. Hrm - I think I tried that once. Unfortunately as you can read here, the technique is very error prone and it took Sandy (the better half of the SDK) to clean up the mess I made.

The third option, and least evil IMO is to use a callback. Now we have something like this:
void get_some_strings(void (* cb)(const char * str, void * ref), void * ref)
for (some stuff)
cb(some string, ref);
If you can tollerate the C function callbacks (have a few Martinis - they'll look less strange) this actually provides a very simple implementation, which means less risk of memory-management bugs.

On the client side if you're using this library from C++ you can even turn this back into the vector code you wanted originally like this:
void vector_cb(const char * str, void * ref)
vector * v = (vector*)ref;
Now I can use this one callback everywhere like this:
vector my_results;
What I like about the callback solution is that both the client code and implementation code can be written in a simple and consise manner, which translates into solid code.

Tuesday, November 07, 2006

Surprise - pthread_cond_timedwait takes an absolute time!

Whoever reads the man pages anyway? Turns out pthread_cond_timedwait takes an absolute time! In otherwords, if you want it to sleep for 1 second, you have to pass one second more than the current time, as returned by gettimeofday (yuck) and converted from a timeval to a timespec (double yuck).

As much as I gripe about this because most threading APIs take relative timeouts (as does select), there actually is a use for this.

When writing a thread-safe message queue you might write something like this to read the queue:
lock critical section.
increment waiting thread count.
while(we don't have a message)
if(pthread_cond_timedwait(condition var, timeout)==ETIMEOUT)
decrement thread count, unlock, return "timed out"
decrement waiting thread count
read message out of queue
unlock critical section
Now...you might wonder, why do we need a while loop? The answer is: it's possible that between the time that our thread is woken up (vai the condition variable wait) due to a message being queued and the time we actually run, get the lock, and continue execution, another thread could have come along and stolen our message. (Note that another thread can go through this code without ever calling pthread_cond_timedwait if there is already a message, which is good for performance. This is not a FIFO message queue!) Thus we have to while-loop around until we reach a point where we've woken up, acquired the lock, and the message is still there. Once we have that lock, the message isn't going anywhere and we can safely exit the loop.

This is where the absolute time is handy - we might go around the loop 6 times. But the "deadline" - the time after which there's no point in waiting, we should return to the user, is an absolute time, and can be invariant across the loop. Thus there is no need to measure how much time has gone by on each loop and decrement our wait time.